Devin Matthews
49f85177f8
KNL ukernel compiles with gcc.
2016-04-18 10:14:11 -05:00
Devin Matthews
58b2c3cf04
Rewrite of KNL kernel in GNU extended asm syntax.
2016-04-16 16:12:24 -05:00
Devin Matthews
dd856c2cb7
Translated MIC kernel to KNL and cleaned up a bit. Only real change is lack of swizzle modifiers for FMA instructions (used bcast from memory instead).
2016-04-11 10:39:18 -05:00
Devin Matthews
7f27431d3f
Copy mic kernel to knl for transliteration.
2016-04-08 10:04:39 -05:00
Devin Matthews
f8f02f0334
Merge branch 'master' into const_correctness
2016-04-06 11:37:05 -05:00
Devin Matthews
32c92d945c
Merge branch 'master' into const_correctness
2016-04-06 11:36:02 -05:00
Field G. Van Zee
d1f8e5d9b2
Merge pull request #60 from esauvage/master
...
sgemm µkernel for bulldozer : bug correction for k%4 != 0
2016-04-05 12:21:27 -05:00
Etienne Sauvage
c11d28eed8
cgemm µkernel for bulldozer : bug correction for k%4 != 0
2016-04-02 21:15:48 +02:00
Field G. Van Zee
20af937b57
Merge pull request #59 from devinamatthews/fix_testsuite_makefile
...
Fix testsuite makefile
2016-03-31 14:37:30 -05:00
Devin Matthews
fc61a1143e
Fix formatting in configure.
2016-03-31 10:53:01 -05:00
Devin Matthews
26379b14de
Adjust paths in common.mk to support building from testsuite dir.
2016-03-31 10:45:48 -05:00
Field G. Van Zee
36c3abb05f
Merge pull request #58 from esauvage/master
...
cgemm & zgemm micro-kernels for FMA4 instruction set (bulldozer confi…
2016-03-31 10:26:17 -05:00
Devin Matthews
356d854fc9
Make symlink to common.mk in build directory.
2016-03-30 16:33:15 -05:00
Devin Matthews
edbb847004
Refactor out some definitions which moved from make_defs.mk to Makefile for use in testsuite Makefile.
2016-03-30 16:27:11 -05:00
Etienne Sauvage
917ce75482
cgemm & zgemm micro-kernels for FMA4 instruction set (bulldozer configuration), based on x86_64/avx micro-kernel
2016-03-30 22:03:09 +02:00
Devin Matthews
62914ccbcd
Merge branch 'master' into const_correctness
2016-03-29 15:24:25 -05:00
Field G. Van Zee
64b41fa554
Merge pull request #54 from devinamatthews/more_config_opts
...
More config opts
2016-03-29 15:19:41 -05:00
Field G. Van Zee
1b09e343df
Updated gcc version from 4.8 to 4.9 in .travis.yml.
2016-03-29 12:55:28 -05:00
Devin Matthews
0171ad5899
Add icc and clang support for Intel architectures, fixes #47 . 2bd036f fixes #49 BTW.
2016-03-28 13:55:06 -05:00
Field G. Van Zee
3090fff64c
Merge pull request #44 from esauvage/master
...
sgemm micro-kernel for FMA4 instruction set
2016-03-28 12:36:25 -05:00
Devin Matthews
e6e566426a
Merge branch 'master' into more_config_opts
2016-03-26 14:10:15 -05:00
Field G. Van Zee
8624e36543
Merge pull request #50 from devinamatthews/fix_noopt_avx
...
Fix configuration issue where instruction set flags are not specified for debug builds.
2016-03-26 13:56:28 -05:00
Devin Matthews
469429ec34
Fix LD_FLAGS -> LDFLAGS.
2016-03-25 20:45:41 -05:00
Devin Matthews
8442d65c9e
Replace -march=native with specific architecture flags to support cross-compiling, and add icc support for Intel architectures.
2016-03-25 20:06:48 -05:00
Devin Matthews
76099f20be
Add threading option to configure.
2016-03-25 17:22:58 -05:00
Devin Matthews
ad43eab4c7
Merge branch 'fix_noopt_avx' into more_config_opts
2016-03-25 15:00:02 -05:00
Devin Matthews
9452bdb3af
Add options for verbose make output and static/shared linking to configure.
2016-03-25 14:59:50 -05:00
Devin Matthews
2bd036f1f9
Fix configuration issue where instruction set flags are not specified for debug builds.
2016-03-25 12:16:49 -05:00
Devin Matthews
bbf704bf75
Add missing const to bli_read_nway_from_env.
2016-03-25 09:55:35 -05:00
Field G. Van Zee
a315833f06
Merge pull request #48 from figual/master
...
Updated and improved ARMv8 micro-kernels.
2016-03-24 12:30:21 -05:00
figual
af92773f4f
Updated and improved ARMv8 micro-kernels.
2016-03-23 22:07:02 +01:00
Devin Matthews
a4d7729776
Set default value for debug_type variable.
2016-03-21 09:55:21 -05:00
Devin Matthews
0e2447fa55
Add const correctness to auxinfo_t struct (microkernels need update theoretically).
2016-03-17 16:32:05 -05:00
Field G. Van Zee
1d1a426d18
Merge pull request #46 from devinamatthews/new-config-opts
...
Add several changes to the build system.
2016-03-07 15:17:53 -06:00
Devin Matthews
d226dfa051
Add several changes to the build system.
...
1) Add -- options.
2) Add -d/--enable-debug option to enable debugging symbols with and without optimization.
3) Allow user to specify CC at configure time, and determine vendor (gcc/icc/etc.). For now configurations enforce a particular vendor.
4) Add make V=[0,1] option to control build verbosity.
2016-03-05 16:18:14 -06:00
Field G. Van Zee
5a978fffdb
Merge pull request #45 from devinamatthews/high_prec_timers
...
Use clock_gettime(CLOCK_MONOTONIC) and mach_absolute_time instead of gettimeofday
2016-03-04 17:26:58 -06:00
Devin Matthews
63e2642390
Make sure that -lrt is linked on Linux.
2016-03-04 13:17:50 -06:00
Devin Matthews
44fddd48dc
Add missing \.
2016-03-04 12:36:38 -06:00
Devin Matthews
7cabd2131f
Use clock_gettime(CLOCK_MONOTONIC) and mach_absolute_time instead of gettimeofday.
2016-03-03 11:43:07 -06:00
Tyler Smith
adb2b4e096
Fixing guard for non-implemented partitioning through packed matrices
2016-03-02 14:48:12 -06:00
Etienne Sauvage
4ca5d5b1fd
sgemm micro-kernel for FMA4 instruction set (bulldozer configuration), based on x86_64/avx micro-kernel
2016-03-01 21:33:01 +01:00
Etienne Sauvage
627d59b5ba
symbolic link for bulldozer configuration to kernels
baseline
2016-02-29 21:53:12 +01:00
Field G. Van Zee
2dc5c0ae03
Merge pull request #40 from tkelman/bulldozer-symlink
...
Add symlink from config/bulldozer/kernels to kernels/x86_64/bulldozer
2016-02-29 12:22:51 -06:00
Field G. Van Zee
f2809fc5f7
Merge pull request #39 from devinamatthews/fix_f2c_conflicts
...
Devin's f2c type namespace update.
Details:
- Added "bla_" prefix to f2c type names to prevent conflicts with external user code.
- Removed most of the body of bli_f2c.h, which was unused.
2016-02-27 13:06:03 -06:00
Tony Kelman
3d0fae810d
Add symlink from config/bulldozer/kernels to kernels/x86_64/bulldozer
...
to fix linking issue mentioned in #37 and https://groups.google.com/forum/#!topic/blis-devel/iypwljcaeEI
2016-02-25 23:24:03 -08:00
Devin Matthews
8624a33ccc
Fix remaining f2c conflicts.
2016-02-25 13:51:26 -06:00
Devin Matthews
372eef0b6c
Fixed most conflicts after hack-n-slash of bli_f2c.h, cleanup in
...
progress.
2016-02-25 12:01:58 -06:00
Field G. Van Zee
f86b94f206
Included missing blas2blis integer def to CBLAS.
...
Details:
- Added #include "bli_config_macro_defs.h" to all cblas_*.c files in
compat/cblas/src. This has the effect of defining
BLIS_BLAS2BLIS_INT_TYPE_SIZE to the default value if bli_config.h does
not define it. Thanks to Tony Kelman for reporting this bug.
- In cblas_i?amax.c, changed the type of the variable 'iamax' from 'int'
to 'f77_int'. This eliminates a compiler warning and a potential
runtime bug and/or crash when the size of an int differs from the size
of f77_int (as determined by BLIS_BLAS2BLIS_INT_TYPE_SIZE).
2016-02-23 18:12:34 -06:00
Field G. Van Zee
0b126de134
Consolidated packm_blk_var1 and packm_blk_var2.
...
Details:
- Consolidated the two blocked variants for packm into a single
implementation (packm_blk_var1) and removed the other variant.
- Updated all induced method _cntl_init() functions in frame/cntl/ind/
to use the new blocked variant 1.
- Defined two new macros, bli_is_ind_packed() and bli_is_nat_packed(),
to detect pack_t schemas for induced methods and native execution,
respectively.
2015-11-13 16:29:12 -06:00
Field G. Van Zee
30e5eb29e0
Minor changes to treatment of rs, cs in bli_obj.c.
...
Details:
- Applied a patch submitted by Devin Matthews that:
- implements subtle changes to handling of somewhat unusual cases of
row and column strides to accommodate certain tensor cases, which
includes adding dimension parameters to _is_col_tilted() and
_is_row_tilted() macros,
- simplifies how buffers are sized when requesting BLIS-allocated
objects,
- re-consolidates bli_adjust_strides_*() into one function, and
- defines 'restrict' keyword as a "nothing" macro for C++ and pre-C99
environments.
2015-11-13 12:14:19 -06:00