Added a multithreading infrastructure that is intended to be independent of the underlying multithreading implementation in the future.
Currently, gemm blocked variants 1f and 2f and packm blocked variant 1 are parallelized.
Details:
- Modified build system (mostly configure and top-level Makefile) so that
a user can build a BLIS library outside of the top-level directory of
the source distribution.
- Added "test" target to Makefile so that the user can run "make test",
which will compile, link, and run the testsuite binary. This works even
if the build directory is externally located, thanks to the test suite
binary's new -g and -o command-line options. Also, when the test suite
is built via the top-level Makefile, it is linked against the local
archive in lib/<configname> rather than the one in <install_prefix>/lib.
- Modified testsuite/Makefile so that it links against the library built
locally, in ../lib/<configname>.
- Added "-lm" to LDFLAGS of most configurations' make_defs.mk.
- Various other cleanups to build system.
Details:
- Changed top-level Makefile so that headers are installed to
$(INSTALL_PREFIX)/include/blis/. (Header directories are no longer
named by version/configuration and then symlinked.)
- Added uninstall targets, including uninstall-old to clean out old
library archives.
- Added GREP makefile definitions to all configurations' make_defs.mk.
Details:
- Added support for C99 complex types to bli_type_defs.h and overloaded
complex arithmetic to the scalar-level macros in include/level0. This
includes a somewhat substantial reorganization and re-layering of much
of the existing machinery present in the level0 macros.
- Added new #define for BLIS_ENABLE_C99_COMPLEX to bli_config.h files,
commented-out by default, which optionally enables the use of built-in
C99 complex types and arithmetic.
- Minor changes to clarksville and reference configs' make_defs.mk files.
- Removed a macro definition from bli_param_macro_defs.h that was not being
used (bli_proj_dt_to_real_if_imag_eq0).
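
The optional C99 complex support described in the entry above can be
illustrated with a minimal sketch. It is not the actual level0 code: every
my_* name is hypothetical, and only the BLIS_ENABLE_C99_COMPLEX switch comes
from the entry itself. The same scalar-level macro is defined twice, once with
the built-in complex operators and once with explicit struct arithmetic.

```c
#include <assert.h>
#include <complex.h>

/* Stand-in for the switch in bli_config.h. Comment this line out to
   fall back to the struct-based representation below. */
#define BLIS_ENABLE_C99_COMPLEX

#ifdef BLIS_ENABLE_C99_COMPLEX

/* Built-in C99 complex type; arithmetic uses the language's operators. */
typedef double complex my_dcomplex;

#define my_sets(r, i, x)   { (x) = (r) + (i) * I; }
#define my_axpys(a, x, y)  { (y) += (a) * (x); }
#define my_real(x)         creal(x)
#define my_imag(x)         cimag(x)

#else

/* Struct-based representation; arithmetic is spelled out by hand. */
typedef struct { double real; double imag; } my_dcomplex;

#define my_sets(r, i, x)   { (x).real = (r); (x).imag = (i); }
#define my_axpys(a, x, y) \
{ \
    double yr_ = (a).real * (x).real - (a).imag * (x).imag; \
    double yi_ = (a).real * (x).imag + (a).imag * (x).real; \
    (y).real += yr_; \
    (y).imag += yi_; \
}
#define my_real(x)         ((x).real)
#define my_imag(x)         ((x).imag)

#endif

/* Self-check: y := y + a * x with a = 1+2i, x = 3+4i, y = 0.5-1i. */
void my_axpys_demo(void)
{
    my_dcomplex a, x, y;

    my_sets(1.0, 2.0, a);
    my_sets(3.0, 4.0, x);
    my_sets(0.5, -1.0, y);

    my_axpys(a, x, y);

    /* a * x = -5 + 10i, so y becomes -4.5 + 9i in either representation. */
    assert(my_real(y) == -4.5);
    assert(my_imag(y) == 9.0);
}
```

Code written against my_axpys() does not need to know which representation
is active, which is the kind of layering the entry above refers to.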
Details:
- Added a new "special" directory type: any source code within directories
named "kernels" will be compiled with a separate CFLAGS_KERNELS set of
compiler flags. This allows the developer to specify a separate set of
flags (e.g. optimization flags) for compiling kernels while maintaining a
standard set for regular framework code.
- Fixed a bug in the top-level Makefile that was causing "noopt" code
to be compiled with the standard set of compilation flags.
- Updated make_defs.mk in reference, flame, and clarksville configurations
according to the above changes.
Details:
- Removed the #define of _POSIX_C_SOURCE in bli_config.h (for both reference
and clarksville configurations) and added "-D_POSIX_C_SOURCE=200112L" to
the compiler command line arguments in make_defs.mk (for both configs).
Thanks to Devin Matthews for suggesting this change.
Details:
- Added a new set of reference gemm, gemmtrsm, and trsm micro-kernels that
contain explicit loops over MR and NR, thus allowing them to be used
unmodified by developers who want to build a reference library with
custom register blocksizes.
- Changed config/reference/bli_kernel.h to use above ukernels by default.
- Changed interfaces of new and existing gemm, gemmtrsm, and trsm micro-kernels
to use 'restrict' keyword.
- Added -funroll-loops option to config/reference/make_defs.mk.
- Updated comments in bli_kernel.h describing constraints on register and
cache blocksizes.
- Updated _adds_mxn.h, _copys_mxn.h, and _xpbys_mxn.h macro files so that
single-char macros are also defined.
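
As an illustration of what "explicit loops over MR and NR" can look like, here
is a minimal sketch of a double-precision gemm micro-kernel. It is not the
actual reference kernel: the function name, argument list, default blocksizes,
and packed layouts (A stored as k columns of MR elements, B as k rows of NR
elements) are all assumptions made for this example.

```c
#include <assert.h>

/* Hypothetical register blocksizes. Because the kernel below loops over
   MR and NR explicitly, changing these values requires no other edits. */
#ifndef MR
#define MR 4
#endif
#ifndef NR
#define NR 4
#endif

/* Computes C := beta * C + alpha * A * B for an MR x NR block of C.
   a: packed micro-panel of A, k columns of MR contiguous elements.
   b: packed micro-panel of B, k rows of NR contiguous elements.
   c: block of C with row stride rs_c and column stride cs_c.
   'restrict' promises the compiler that the operands do not alias. */
void my_ref_dgemm_ukr(int k,
                      const double *restrict alpha,
                      const double *restrict a,
                      const double *restrict b,
                      const double *restrict beta,
                      double *restrict c, int rs_c, int cs_c)
{
    double ab[MR * NR];  /* accumulator for A * B, stored column-major */

    assert(k >= 0);

    for (int i = 0; i < MR * NR; ++i)
        ab[i] = 0.0;

    /* One rank-1 update per iteration of the k dimension. */
    for (int l = 0; l < k; ++l) {
        for (int j = 0; j < NR; ++j)
            for (int i = 0; i < MR; ++i)
                ab[i + j * MR] += a[i] * b[j];
        a += MR;
        b += NR;
    }

    /* Scale the existing C by beta and add alpha * (A * B). */
    for (int j = 0; j < NR; ++j)
        for (int i = 0; i < MR; ++i)
            c[i * rs_c + j * cs_c] = (*beta) * c[i * rs_c + j * cs_c]
                                   + (*alpha) * ab[i + j * MR];
}
```

A loop-based kernel of this shape depends heavily on the compiler to unroll
the fixed-trip-count inner loops, which is presumably why -funroll-loops was
added to the reference configuration in the same change.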
Details:
- Updated CHANGELOG.
- Added 'changelog' target to Makefile that runs 'git log --decorate' and
overwrites CHANGELOG with the output.
- Other trivial changes.
Details:
- Added/renamed packm, unpackm kernels.
- Added machine value routines.
- Added param_map facility.
- Renamed AUTHORS to CREDITS.
- Added Makefile; continued to expand upon existing configure script.
- Added #defines of the fuse_fac macros to operation headers, used only if
the macros are not already defined (by the user in bl2_kernels.h).
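
The "define only if not already defined" pattern from the last item can be
sketched as follows. The macro names, the fallback value of 1, and the helper
function are hypothetical. The sketch also assumes that a fusing factor is the
number of vectors a fused kernel processes per call.

```c
#include <assert.h>

/* Imagine this line came from the user's kernel header
   (bl2_kernels.h in the entry above). */
#define MY_AXPYF_FUSE_FAC 8

/* What an operation header might then do: supply a fallback only for
   macros the user left undefined. The value 1 is an assumption. */
#ifndef MY_AXPYF_FUSE_FAC
#define MY_AXPYF_FUSE_FAC 1
#endif

#ifndef MY_DOTXF_FUSE_FAC
#define MY_DOTXF_FUSE_FAC 1
#endif

/* Number of fused-kernel calls needed to process n_vectors vectors when
   each call handles MY_AXPYF_FUSE_FAC of them (rounded up). */
int my_num_fused_calls(int n_vectors)
{
    assert(n_vectors >= 0);
    return (n_vectors + MY_AXPYF_FUSE_FAC - 1) / MY_AXPYF_FUSE_FAC;
}
```

Here the user's value of 8 is kept for MY_AXPYF_FUSE_FAC, while
MY_DOTXF_FUSE_FAC, which the user did not define, receives the fallback.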