Commit Graph

78 Commits

Author SHA1 Message Date
Field G. Van Zee
2c7960c841 Implemented ARG_MAX hack in configure, Makefile.
Details:
- Added support for --enable-arg-max-hack to configure, which will
  change the behavior of make when building BLIS so that rather than
  invoke the archiver/linker with all of the object files as command
  line arguments, those object files are echoed to a temporary file
  and then the archiver/linker is fed that temporary file via the @
  notation. An example of this can be found in the GNU make docs at
  https://www.gnu.org/software/make/manual/make.html#File-Function
- Thanks to Isuru Fernando for prompting this feature.
2018-07-05 14:38:33 -05:00
Field G. Van Zee
195480beb5 Merge branch 'master' into dev 2018-06-25 13:24:21 -05:00
Field G. Van Zee
884175d9ff Added configure support for preset CFLAGS, LDFLAGS.
Details:
- Any preexisting values set to the CFLAGS environment variable (or the
  CFLAGS variable if given on the command line) are saved by configure
  for later inclusion (prepending, to be precise) along with the
  compiler flags automatically determined by the BLIS build system.
  LDFLAGS is treated in a similar manner.) Thanks to Dave Love for
  requesting this feature in issue #223 and Mathieu Poumeyrol for his
  support on this and a previous related issue.
- Comment updates to build/config.mk.in.
- Strip whitespace from return value of various cflags functions in
  common.mk.
2018-06-22 18:08:43 -05:00
Field G. Van Zee
3f48c38164 Cosmetic fix to configure output in config.mk.
Details:
- Fixed configure so that MK_ENABLE_MEMKIND is assigned "no" when the
  option is disabled due to libmemkind not being present. This wasn't
  affecting anything since the one use of the variable (in common.mk)
  was formulated as "ifeq ($(MK_ENABLE_MEMKIND),yes)". That is, the
  variable being empty was effectively equivalent to it being set to
  "no".
- Comment updates to build/config.mk.in, common.mk.
2018-06-05 16:52:35 -05:00
Field G. Van Zee
22deef2f54 Support alternative gemm implementation sandboxes.
Detail:
- configure:
  - add support for --enable-sandbox=NAME to configure script, where NAME
    is a subdirectory of a new 'sandbox' directory that contains an
    alternative implementation of gemm. (For now, only implementations of
    gemm may be provided via a sandbox.);
  - add support for C++ compiler. C++ compilers are handled in a manner
    similar to that of C compilers, in that a default search order is
    used, and that CXX is searched for first, if the variable is set. In
    practice, the C++ compiler that is selected should correspond to the
    selected C compiler. (Example: If gcc is selected for C, g++ should
    be selected for C++.) The result of the search is output to config.mk
    via build/config.mk.in. NOTE: The use of C++ in BLIS is still
    hypothetical, but may eventually move to being experimental. This
    support was intended only for use of C++ within a gemm sandbox.
- build/config.mk.in:
  - define SANDBOX variable containing sandbox subdirectory name.
- build/bli_config.in:
  - define either of the BLIS_ENABLE_SANDBOX or BLIS_DISABLE_SANDBOX
    macros in bli_config.h.
- common.mk:
  - include makefile fragments that were propagated into the specified
    sandbox subdirectory;
  - generate different CFLAGS for sandboxes, as well as a separate
    CXXFLAGS variable for sandboxes when C++ source files are compiled;
  - isolate into a single location lists of file suffixes for various
    purposes.
  - reorganized/clean up code related to identifying header files and
    paths.
- Makefile:
  - generate object filepaths for and compile source code files found in
    sandbox sub-directory;
  - remove makefile fragments placed in sandbox sub-directory (cleanmk);
  - various other cleanups.
- Added .cc, .cpp, and .cxx to list of suffixes of files to recognize in
  makefile fragments (via build/gen-make-frags/suffix_list).
- Updated blis.h to conditionally #include bli_sandbox.h (via a new file,
  bli_sbox.h), which each sandbox is assumed to use for any type
  definitions and function prototypes it wishes to export out to blis.h.
- Conditionally disable bli_gemmnat() implementation in frame/3 when
  BLIS_ENABLE_SANDBOX is defined.
2018-05-24 14:28:55 -05:00
Field G. Van Zee
216a4cb9cb Minor update to flatten-headers.[py|sh] help text.
Details:
- Fixed a typo and removed some outdated language from the help text of
  flatten-headers.py and flatten-headers.sh.
2018-05-18 18:47:03 -05:00
Field G. Van Zee
ad67dc4e34 Communicate cc, cc_vendor to make via config.mk.
Details:
- Historically, the compiler selection has happened statically in the
  various make_defs.mk and would only be overriden by setting CC (either
  prior to running configure or as a configure argument). However, in
  the last couple months, configure has evolved to contain rather
  sophisticated compiler detection logic for the purposes of blacklisting
  sub-configurations. It only makes sense that configure now fully take
  over the responsibility of selecting a compiler from the GNU make side
  of the build system. Thanks to Alex Arslan for his help exposing this
  issue.
- Substitute found_cc into CC in config.mk via configure.
- Set a new variable, CC_VENDOR, in config.mk via substitution from
  configure, and disable the corresponding CC_VENDOR code in common.mk.
- Disabled default compiler selection (usually gcc) in the sub-configs'
  various make_def.mk files.
2018-05-14 18:35:28 -05:00
Field G. Van Zee
af1d8470b5 Better handling of shared libraries on OS X.
Details:
- Use the .dylib shared library suffix on OS X (instead of .so in Linux).
- Link with the -dynamiclib and -install_name options on OS X (instead of
  -shared and -soname in Linux).
- Determine operating system (e.g. Linux, Darwin) during configure and
  substitute into config.mk.in rather than run 'uname -s' during make.
- Echo operating system during configure.
2018-05-11 17:49:58 -05:00
Field G. Van Zee
b699bb1ff0 Adopt Linux-like .so versioning at install-time.
Details:
- Changed the naming conventions used for installed libraries and
  symlinks to more closely mirror patterns used by typical GNU/Linux
  libraries. Whereas previously static and shared libraries were
  installed and symlinked as follows:

    (library) libblis-0.3.2-15-haswell.a
    (library) libblis-0.3.2-15-haswell.so
    (symlink) libblis.a -> libblis-0.3.2-15-haswell.a
    (symlink) libblis.so -> libblis-0.3.2-15-haswell.so

  we now use the following naming conventions:

    (library) libblis.a
    (symlink) libblis.so -> libblis.so.0.1.2
    (symlink) libblis.so.0 -> libblis.so.0.1.2
    (library) libblis.so.0.1.2

  where 0.1.2 indicates shared library major, minor, and build versions
  of 0, 1, and 2, respectively. The conventional version string can
  still be queried by linking to the library in question and then calling
  bli_info_get_version_str(). (The testsuite binary does this
  automatically at startup.)
- Added logic to common.mk to set the soname field in the shared library
  via the -soname linker flag.
- Added a 'so_version' file to the top-level directory containing two
  lines. The first line specifies the .so major version number, and the
  second line specifies the minor and build version numbers joined with
  a '.'. This file is read by configure and those values substituted
  into build/config.mk.in to define SO_MAJOR, SO_MINORB, and SO_MMB
  variables.
2018-05-10 15:54:17 -05:00
Field G. Van Zee
bf03503059 Renamed (shortened) a few build system variables.
Details:
- Renamed the following variables in config.mk (via build/config.mk.in):
    BLIS_ENABLE_VERBOSE_MAKE_OUTPUT -> ENABLE_VERBOSE
    BLIS_ENABLE_STATIC_BUILD        -> MK_ENABLE_STATIC
    BLIS_ENABLE_SHARED_BUILD        -> MK_ENABLE_SHARED
    BLIS_ENABLE_BLAS2BLIS           -> MK_ENABLE_BLAS
    BLIS_ENABLE_CBLAS               -> MK_ENABLE_CBLAS
    BLIS_ENABLE_MEMKIND             -> MK_ENABLE_MEMKIND
  and also renamed all uses of these variables in makefiles and makefile
  fragments. Notice that we use the "MK_" prefix so that those variables
  can be easily differentiated (such as via grep) from their "BLIS_" C
  preprocessor macro counterparts.
- Other whitespace changes to build/config.mk.in.
- Renamed the following C preprocessor macros in bli_config.h (via
  build/bli_config.h.in):
    BLIS_ENABLE_BLAS2BLIS        -> BLIS_ENABLE_BLAS
    BLIS_DISABLE_BLAS2BLIS       -> BLIS_DISABLE_BLAS
    BLIS_BLAS2BLIS_INT_TYPE_SIZE -> BLIS_BLAS_INT_TYPE_SIZE
  and also renamed all relevant uses of these macros in BLIS source
  files.
- Renamed "blas2blis" variable occurrences in configure to "blas", as
  was done in build/config.mk.in and build/bli_config.h.in.
- Renamed the following functions in frame/base/bli_info.c:
    bli_info_get_enable_blas2blis() -> bli_info_get_enable_blas()
    bli_info_get_blas2blis_int_type_size()
                                    -> bli_info_get_blas_int_type_size()
- Remove bli_config.h during 'make cleanh' target of top-level Makefile.
2018-05-08 16:49:22 -05:00
Field G. Van Zee
7e5648ca15 Add configure support for --libdir, --includedir.
Details:
- Added support for two new configure options: --libdir and --includedir.
  They specify the precise install directories for libraries and header
  files, respectively, and override any location implied by the --prefix
  option (including the default install prefix, if --prefix was not
  given). Thanks to Nico Schlömer for suggesting this via issue #195.
- Removed the INSTALL_PREFIX definition/anchor from build/config.mk.in
  and replaced it with corresponding definitions/anchors for libdir and
  includedir.
- Updated top-level Makefile to use the new variables, INSTALL_LIBDIR
  and INSTALL_INCDIR, instead of INSTALL_PREFIX (which is now no longer
  needed by make).
- Set default sane values for INSTALL_LIBDIR and INSTALL_INCDIR in
  common.mk when configure has not been run, as is already done for
  DIST_PATH. This is to safeguard against statements in the top-level
  Makefile that use 'find' to locate old libraries and headers for the
  uninstall targets, which run regardless of make target. Without setting
  INSTALL_LIBDIR and INSTALL_INCDIR, those variables are empty and the
  'find' ends up looking at '/', which is obviously not what we want.
  (Also enclosed those definitions in an IS_CONFIGURED guard so that they
  won't get evaluated unless configure has been run.)
- Rearranged "ifeq ($(IS_CONFIGURED),yes)" conditionals in Makefile to
  reduce occurrences and separated "local" and top-level components of
  cleanblastest and cleanblistest targets to improve readability.
- Adjusted out-of-tree builds so that they are no longer oblivious to
  the .git directories, if present, and thus now properly augment version
  strings with the appropriate patch number.
- Include missing version string in 'configure --help' output.
2018-05-07 18:59:19 -05:00
Field G. Van Zee
b09e4e8852 Allow 'make clean' and friends without configuring.
Details:
- Modified top-level Makefile so that a user can run 'make distclean',
  'make clean', or any of the other clean-related targets prior to
  running configure (or after a previous 'make distclean'). Thanks to
  Nico Schlömer for suggesting this via issue #197.
- Made the cleanblastest and cleanblistest more comprehensive in that
  they now clean out build products that would have resulted from local
  compilation (ie: builds performed within the 'blastest' or 'testsuite'
  directories).
- Added "cc" to list of expected compiler "vendors" since the CC variable
  seems to automatically be set to "cc" on Ubuntu 16.04 (which is just an
  alias to gcc).
- Comment update to build/config.mk.in.
2018-05-07 14:37:50 -05:00
Field G. Van Zee
35c5a1449c No longer update version file during configure.
Details:
- Recycled the core functionality of build/update-version-file.sh into a
  function in configure, disabling the updating of the 'version' file in
  the process. Instead of writing the patched version string back to the
  version file and then reading it again from within configure, the
  patched version string is now saved directly to a variable in the main()
  function in configure. This will prevent developers from accidentally
  committing configure-induced changes to the version file in between
  releases.
2018-05-07 12:04:57 -05:00
Mathieu Poumeyrol
8adb2f919b Some cross compilations fixes (#198)
* cross-compilation fixes
* add doc ranlib variable
* icc support -dumpversion, posix compatible test, plus one stupid mistake
* retab
* revert version as requested
2018-05-06 12:58:16 -05:00
Field G. Van Zee
cdf041ddad Use config.mk instead of common.mk in bump-version.sh.
Details:
- Fixed inadvertent targeting of common.mk when testing whether configure
  had already been run, rather than config.mk.
2018-04-28 14:05:00 -05:00
Field G. Van Zee
6ded8f9f03 Account for recent 'make distclean' in bump-version.sh.
Details:
- Added logic to build/bump-version.sh that will run './configure auto'
  if 'common.mk' is not present (usually because 'make distclean' was run
  recently).
2018-04-28 14:01:29 -05:00
Field G. Van Zee
088c474e62 Added support for blacklisting via the assembler.
Details:
- Added logic to configure that attempts to assemble various small files
  containing select instructions designed to reveal whether binutils
  (specifically, the assembler) supports emitting those instruction sets.
  This information provides additional opportunities to blacklist sub-
  configurations that are unsupported by the environment. Thanks to Devin
  Matthews for pointing me towards a similar solution in TBLIS as an
  example.
- Various other cleanups in configure.
- Reorganized the detection code in the 'build' directory, bringing the
  "auto-detect" configuration detection, libmemkind detection, and new
  instruction set detection codes into a single new subdirectory named
  'detect'.
2018-04-10 18:09:56 -05:00
Field G. Van Zee
08b123084d Added color-coding to 'make check' output.
Details:
- Added color coding to output of check-blistest.sh, check-blastest.sh
  scripts. Success messages are coded green and failure are coded red.
  This helps draw the eye toward those messages as the 'make checkblis',
  'make checkblis-fast', and 'make checkblas' targets are executed.
- Changed top-level Makefile so that execution will not halt if
  'checkblis', 'checkblis-fast', or 'checkblas' targets fail, which
  means that the second of the two tests (BLIS and BLAS) run by
  'make check' will run even if the first test fails.
2018-04-05 14:25:39 -05:00
Field G. Van Zee
22289ad23c Added build system support for libmemkind.
Details:
- Added support for libmemkind to configure. configure attempts to
  detect the presence of libmemkind by compiling a small program
  containing #include <hbwmalloc.h> and a call to hbw_malloc(). If
  successful, it is assumed that libmemkind is present and available.
  If present, use of libmemkind is enabled by default, and otherwise
  use is disabled by default. If libmemkind is present, the user may
  explicitly disable use of the library by running configure with the
  --without-memkind option. Furthermore, a configuration may disable
  libmemkind, perhaps conditional on some aspect of the build system,
  by including -DBLIS_DISABLE_MEMKIND in the configuration's CPPROCFLAGS
  make variable and setting the BLIS_ENABLE_MEMKIND makefile variable,
  set in config.mk, to 'no'. (The knl configuration makes use of this
  latter feature; see below.)
- If enabled at configure-time, bli_system.h will #include <hbwmalloc.h>
  and bli_kernel_macro_defs.h will define BLIS_MALLOC_POOL and
  BLIS_FREE_POOL to use hbw_malloc() and hbw_free(), respectively.
- Deprecated explicit use of BLIS_NO_HBWMALLOC in
  config/knl/bli_family.knl.h and replaced use of -DBLIS_NO_HBWMALLOC in
  config/knl/make_defs.mk with -DBLIS_DISABLE_MEMKIND, which overrides
  (#undefs) the definition of BLIS_ENABLE_MEMKIND in bli_system.h, if it
  would otherwise be defined. Also, set the BLIS_ENABLE_MEMKIND makefile
  variable to 'no'.
- common.mk now adds libmemkind to LDFLAGS if libmemkind is enabled.
2018-03-22 18:21:30 -05:00
Field G. Van Zee
7dc40eafdd Updates to top-level and test driver Makefiles.
Details:
- Added logic to common.mk that will choose a BLIS library against which
  to link (LIBBLIS_LINK). The default choice is the static (.a) library;
  the shared (.so) library is chosen only if the shared library build was
  enabled and the static one was disabled.
- Updated the various test driver Makefiles to reference this common,
  pre-chosen library against which to link. (Previously, these drivers
  unconditionally linked against the static library and would have
  failed if the static library build was disabled at configure-time.)
- Renamed many of the variables in common.mk and the top-level Makefile
  so that variables relating to the libblis.[a|so] files, including
  paths to those files, begin with "LIBBLIS".
- Shuffled around some of the library definitions from the top-level
  Makefile to common.mk.
- Renamed BLIS_ENABLE_DYNAMIC_BUILD to BLIS_ENABLE_SHARED_BUILD, and
  the @enable_dynamic@ anchor to @enable_shared@ in build/config.mk.in
  and in configure.
- A few other cleanups in the top-level Makefile.
2018-03-21 18:39:16 -05:00
Field G. Van Zee
664ec4813d Integrated f2c'ed netlib BLAS test suite.
Details:
- Created a new test suite that exercises only the BLAS compatibility
  found in BLIS. The test suite is a straightforward port of code
  obtained from netlib LAPACK, run through f2c and linked to a stripped-
  down version of libf2c that is compiled along with the test drivers
  (to prevent any obvious ABI issues). The new BLAS test suite can be
  run from within its new local directory, 'blastest' (through its local
  'make ; make run' targets) or from the top-level Makefile (via the
  'make testblas' target). Output files are created in whatever directory
  the test drivers are run, whether it be the 'blastest' directory, the
  top-level source distribution directory, or the out-of-tree directory
  in which 'configure' was run. Also, the results of the BLAS test suite
  can be checked via 'make checkblas', which summarizes the presence or
  absence of test failures in a single line printed to stdout.
- Updated the 'test' target to run both 'testblis' and 'testblas'.
- Added a new 'testblis-fast' target that runs the BLIS testsuite with
  smaller problem sizes, allowing it to finish more quickly.
- Added a 'make check' target, which runs 'checkblis-fast' and
  'checkblas'.
- Changed .travis.yml so that Travis CI runs 'testblis-fast' instead of
  'testblis' before (calling the check-blistest.sh script to check the
  result manually).
- Renamed some targets in the top-level Makefile to be consistent between
  BLAS and BLIS.
2018-03-20 13:54:58 -05:00
Field G. Van Zee
16813335bd Merge branch 'amd' into rt
Details:
- Merged contributions made by AMD via 'amd' branch (see summary below).
  Special thanks to AMD for their contributions to-date, especially with
  regard to intrinsic- and assembly-based kernels.
- Added column storage output cases to microkernels in
  bli_gemm_zen_asm_d6x8.c and bli_gemmtrsm_l_zen_asm_d6x8.c. Even with
  the extra cost of transposing the microtile in registers, this is
  much faster than using the general storage case when the underlying
  matrix is column-stored.
- Added s and d assembly-based zen gemmtrsm_u microkernel (including
  column storage optimization mentioned above).
- Updated zen sub-configuration to reflect presence of new native
  kernels.
- Temporarily reverted zen sub-configuration's level-3 cache blocksizes
  to smaller haswell values.
- Temporarily disabled small matrix handling for zen configuration
  family in config/zen/bli_family_zen.h.
- Updated zen CFLAGS according to changes in 1e4365b.
- Updated haswell microkernels such that:
  - only one vzeroupper instruction is called prior to returning
  - movapd/movupd are used in leiu of movaps/movups for double-real
    microkernels. (Note that single-real microkernels still use
    movaps/movups.)
- Added kernel prototypes to kernels/zen/bli_kernels_zen.h, which is
  now included via frame/include/bli_arch_config.h.
- Minor updates to bli_amaxv_ref.c (and to inlined "test" implementation
  in testsuite/src/test_amaxv.c).
- Added early return for alpha == 0 in bli_dotxv_ref.c.
- Integrated changes from f07b176, including a fix for undefined
  behavior when executing the 1m method under certain conditions.
- Updated config_registry; no longer need haswell kernels for zen
  sub-configuration.
- Tweaked marginal and pass thresholds for dotxf.
- Reformatted level-1v, -1f, and -3 amd kernels and inserted additional
  comments.
- Updated LICENSE file to explicitly mention that parts are copyright
  UT-Austin and AMD.
- Added AMD copyright to header templates in build/templates.

Summary of previous changes from 'amd' branch.
- Added s and d assembly-based zen gemm microkernels (d6x8 and d8x6) and
  s and d assembly-based zen gemmtrsm_l microkernels (d6x8).
- Added s and d intrinsics-based zen kernels for amaxv, axpyv, dotv, dotxv,
  and scalv, with extra-unrolling variants for axpyv and scalv.
- Added a small matrix handler to bli_gemm_front(), with the handler
  implemented in kernels/zen/3/bli_gemm_small_matrix.c.
- Added additional logic to sumsqv that first attempts to compute the
  sum of the squares via dotv(). If there is a floating-point exception
  (FE_OVERFLOW), then the previous (numerically conservative) code is
  used; otherwise, the result of dotv() is square-rooted and stored as
  the result. This new implementation is only enabled when FE_OVERFLOW
  is #defined. If the macro is not #defined, then the previous
  implementation is used.
- Added axpyv and dotv standalone test drivers to test directory.
- Added zen support to old cpuid_x86.c driver in build/auto-detect/old.
- Added thread-local and __attribute__-related macros to bli_macro_defs.h.
2018-02-21 17:43:32 -06:00
Field G. Van Zee
1e7a4896e0 Minor error handling in update-version-file.sh.
Details:
- Added explicit handling of situations when 'git describe --tags'
  returns an error. This command is used by update-version-file.sh
  when deciding whether or not to update the version file prior to
  configuration.
- Removed bli_packm.c and bli_unpackm.c, as they contained no source
  code.
2018-01-05 12:33:48 -06:00
Field G. Van Zee
0b9c5127e9 Enabled C99, added stdint.h to auto-detect build.
Details:
- Added "-std=c99" to compiler arguments when building auto-detection
  driver in configure script.
- Added #include <stdint.h> to all three source files needed by auto-
  detection program.
2017-12-23 15:53:44 -06:00
Field G. Van Zee
0ce5e19c31 Reimplemented configure-time hardware detection.
Details:
- Reimplemented the hardware detection functionality invoked when running
  "./configure auto". Previously, a standalone script in build/auto-detect
  that used CPUID was used. However, the script attempted to enumerate all
  models for each microarchitecture supported. The new approach recycles
  the same code used for runtime hardware detection introduced in 2c51356.
  This has two immediate benefits. First, it reduces and consolidates the
  code required to detect microarchitectures via the CPUID instruction.
  Second, it provides an indirect way of testing at configure-time the
  code that is used to detect hardware at runtime. This code is (a) only
  activated when targeting a configuration family (such as intel64 or
  amd64) at configure-time and (b) somewhat difficult to test in
  practice, since it relies on having access to older microarchitectures.
- The above change required placing conditional cpp macro blocks in
  bli_arch.c and bli_cpuid.c which either #include "blis.h" or #include
  a bare-bones set of headers that does not rely on the presence of a
  bli_config.h header. This is needed because bli_config.h has not been
  created yet when configure-time auto-detection takes places.
- Defined a new function in bli_arch.c, bli_arch_string(), which takes
  an arch_t id and returns a pointer to a string that contains the
  lowercase name of the corresponding microarchitecture. This function
  is used by the auto-detection script to printf() the name of the
  sub-configuration corresponding to the detected hardware.
2017-12-23 15:32:03 -06:00
Field G. Van Zee
9804adfd40 Added option to disable pack buffer memory pools.
Details:
- Added a new configure option, --[en|dis]able-packbuf-pools, which will
  enable or disable the use of internal memory pools for managing buffers
  used for packing. When disabled, the function specified by the cpp
  macro BLIS_MALLOC_POOL is called whenever a packing buffer is needed
  (and BLIS_FREE_POOL is called when the buffer is ready to be released,
  usually at the end of a loop). When enabled, which was the status quo
  prior to this commit, a memory pool data structure is created and
  managed to provide threads with packing buffers. The memory pool
  minimizes calls to bli_malloc_pool() (i.e., the wrapper that calls
  BLIS_MALLOC_POOL), but does so through a somewhat more complex
  mechanism that may incur additional overhead in some (but not all)
  situations. The new option defaults to --enable-packbuf-pools.
- Removed the reinitialization of the memory pools from the level-3
  front-ends and replaced it with automatic reinitialization within the
  pool API's implementation. This required an extra argument to
  bli_pool_checkout_block() in the form of a requested size, but hides
  the complexity entirely from BLIS. And since bli_pool_checkout_block()
  is only ever called within a critical section, this change fixes a
  potential race condition in which threads using contexts with different
  cache blocksizes--most likely a heterogeneous environment--can check
  out pool blocks that are too small for the submatrices it wishes to
  pack. Thanks to Nisanth Padinharepatt for reporting this potential
  issue.
- Removed several functions in light of the relocation of pool reinit,
  including bli_membrk_reinit_pools(), bli_memsys_reinit(),
  bli_pool_reinit_if(), and bli_check_requested_block_size_for_pool().
- Updated the testsuite to print whether the memory pools are enabled or
  disabled.
2017-12-21 19:22:57 -06:00
Field G. Van Zee
0084531d3e Updated flatten-headers.py for python3.
Details:
- Modifed flatten-headers.py to work with python 3.x. This mostly
  amounted to removing print statements (which I replaced with calls
  to my_print(), a wrapper to sys.stdout.write()). Thanks to Stefan
  Husmann for pointing out the script's incompatibility with python 3.
- Other minor changes/cleanups.
2017-12-17 18:58:25 -06:00
Field G. Van Zee
90b11b79c3 Modest performance boost to flatten-headers.py.
Details:
- Updated flatten-headers.py to pre-compile the main regular expression
  used to isolate #include directives and the header filenames they
  reference. The compiled regex object is then used over and over on
  each header file in the tree of referenced headers. This appears to
  have provided a 1.7-2x performance increase in the best case.
- Other minor tweaks, such as renaming the main recursive function from
  replace_pass() to flatten_header().
2017-12-17 17:34:32 -06:00
Field G. Van Zee
99dee87f30 Reimplemented flatten-headers.sh in python.
Details:
- Added flatten-headers.py, a python implementation of the bash script
  flatten-headers.sh. The new script appears to be 25-100x faster,
  depending on the operating system, filesystem, etc. The python script
  abides by the same command line interface as its predecessor and
  targets python 2.7 or later. (Thanks to Devin Matthews for suggesting
  that I look into a python replacement for higher performance.)
- Activated use of flatten-headers.py in common.mk via the FLATTEN_H
  variable.
- Made minor tweaks to flatten-headers.sh such as spelling corrections
  in comments.
2017-12-17 16:47:27 -06:00
Field G. Van Zee
6526d1d4ae Added temp_dir argument to flatten-headers.sh.
Details:
- Added "temp_dir" argument to flatten-headers.sh so that the caller can
  specify where intermediate files should be created as the script runs.
- Updated flatten-headers.sh to create intermediate files in temp_dir
  instead of alongside the corresponding source files. This should now
  (once again) allow out-of-tree builds where the BLIS distribution is
  read-only, or where the out-of-tree build is running concurrently with
  another out-of-tree build. (Thanks to Devin Matthews for pointing out
  the possibility of simultaneous out-of-tree builds.)
2017-12-12 13:50:43 -06:00
Field G. Van Zee
244a6f4e66 Fixed POSIX sed non-compliance in flatten-header.sh.
Details:
- Changed GNU usage of 'i' and 'a' sed commands used in flatten-header.sh
  to POSIX-compliant usage that will work on OS X's sed.
2017-11-28 17:48:48 -06:00
Field G. Van Zee
4507862167 Generate/compile with/install monolithic blis.h.
Details:
- Rewrote monolithify-header.sh (and renamed to flatten-header.sh) so that
  headers are inserted recursively. This improves performance by a factor
  of 3-4x.
- Modified configure to create an 'include/<configname>' directory in which
  make can create a monolithic header.
- Modified the top-level Makefile so that a monolithic header is generated
  unconditionally prior to compilation (stored in include/<configname>) and
  so that the single header is installed instead of the 450 or so header
  files that reside throughout the framework source tree.
- Added "include/*/*.h" to .gitignore file.
- Removed some pnacl/emscripten leftovers that I intended to include in
  a1caeba (mostly in testsuite/Makefile).
- Trivial comment changes to frame/include/bli_f2c.h.
2017-11-28 15:16:22 -06:00
Field G. Van Zee
ef024ce4ca More tweaks to monolithify-header.sh
Details:
- Further fixes monolithify-header.sh script.
- Removed unnecessary #include "blis.h" from frame/3/bli_l3_packm.h.
2017-11-20 18:08:29 -06:00
Field G. Van Zee
9df6dda9ec Improvements, bugfixes to monolithify-header.sh. 2017-11-18 19:03:26 -06:00
Field G. Van Zee
870597d166 Added bash script for creating monolithic headers.
Details:
- Added a new script, monolithify-header.sh, to the 'build' directory.
  This script recursively replaces all #include directives in a selected
  file with the contents of the header files referenced by each directive.
  The idea is to "flatten" a tree of .h files into a single file, with
  the script acting as a C preprocessor that only processes #include
  directives.
2017-11-17 17:06:42 -06:00
Field G. Van Zee
d5bf79e50b Miscellaneous tweaks and fixes.
Details:
- Fixed incorrect calling sequence in bli_cntx_init_knl.c--an instance of
  bli_blksz_init_easy() that should have been bli_blksz_init().
- Fixed a bug in code that is supposed to output the list of sub-directories
  in the 'config' directory when configure script is run with no arguments.
- Expanded the output of "make showconfig" to include more info from config.mk.
- Minor changes to build/auto-detect/cpuid_x86.c, mostly in preparation for
  someone to add excavator and zen support.
- Added a link to the ConfigurationHowTo wiki to config_registry.
- Other minor tweaks to configure.
2017-11-13 14:24:29 -06:00
Field G. Van Zee
07c352188b Added "generic" configuration.
Details:
- Added a "generic" configuration that leaves the default blocksizes and
  kernels unchanged. This replaces the older "reference" configuration.
  Updated auto-detect script and code accordingly.
- Added support for generic configuration to arch_t (bli_type_defs.h),
  bli_gks_init() (bli_gks.c), and bli_arch_config.h
- Moved bli_arch_query_id() to bli_arch.c (and prototype to bli_arch.h).
- Whitespace changes to configurations' make_defs.mk files.
2017-10-23 16:59:22 -05:00
Field G. Van Zee
3c269f700d Makefile updates for test drivers, testsuite.
Details:
- Fixed semi-broken testsuite Makefile and very-broken test driver Makefiles,
  as well as those for test/3m4m, test/thread_ranges, and test/exec_sizes
  sub-directories.
- Factored out much of the top-level Makefile into common.mk. A Makefile
  needs only set DIST_PATH to the relative path to the top level of the
  BLIS source distribution before including common.mk in order to acquire
  all of the definitions typically needed in a Makefile that tests BLIS.
2017-10-20 13:57:21 -05:00
Field G. Van Zee
453deb2906 Implemented runtime kernel management.
Details:
- Reworked the build system around a configuration registry file, named
  config_registry', that identifies valid configuration targets, their
  constituent sub-configurations, and the kernel sets that are needed by
  those sub-configurations. The build system now facilitates the building
  of a single library that can contains kernels and cache/register
  blocksizes for multiple configurations (microarchitectures). Reference
  kernels are also built on a per-configuration basis.
- Updated the Makefile to use new variables set by configure via the
  config.mk.in template, such as CONFIG_LIST, KERNEL_LIST, and KCONFIG_MAP,
  in determining which sub-configurations (CONFIG_LIST) and kernel sets
  (KERNEL_LIST) are included in the library, and which make_defs.mk files'
  CFLAGS (KCONFIG_MAP) are used when compiling kernels.
- Reorganized 'kernels' directory into a "flat" structure. Renamed kernel
  functions into a standard format that includes the kernel set name
  (e.g. 'haswell'). Created a "bli_kernels_<kernelset>.h" file in each
  kernels sub-directory. These files exist to provide prototypes for the
  kernels present in those directories.
- Reorganized reference kernels into a top-level 'ref_kernels' directory.
  This directory includes a new source file, bli_cntx_ref.c (compiled on
  a per-configuration basis), that defines the code needed to initialize
  a reference context and a context for induced methods for the
  microarchitecture in question.
- Rewrote make_defs.mk files in each configuration so that the compiler
  variables (e.g. CFLAGS) are "stored" (renamed) on a per-configuration
  basis.
- Modified bli_config.h.in template so that bli_config.h is generated with
  #defines for the config (family) name, the sub-configurations that are
  associated with the family, and the kernel sets needed by those
  sub-configurations.
- Deprecated all kernel-related information in bli_kernel.h and transferred
  what remains to new header files named "bli_arch_<configname>.h", which
  are conditionally #included from a new header bli_arch.h. These files
  are still needed to set library-wide parameters such as custom
  malloc()/free() functions or SIMD alignment values.
- Added bli_cntx_init_<configname>.c files to each configuration directory.
  The files contain a function, named the same as the file, that initializes
  a "native" context for a particular configuration (microarchitecture). The
  idea is that optimized kernels, if available, will be initialized into
  these contexts. Other fields will retain pointers to reference functions,
  which will be compiled on a per-configuration basis. These bli_cntx_init_*()
  functions will be called during the initialization of the global kernel
  structure. They are thought of as initializing for "native" execution, but
  they also form the basis for contexts that use induced methods. These
  functions are prototyped, along with their _ref() and _ind() brethren, by
  prototype-generating macros in bli_arch.h.
- Added a new typedef enum in bli_type_defs.h to define an arch_t, which
  identifies the various sub-configurations.
- Redesigned the global kernel structure (gks) around a 2D array of cntx_t
  structures (pointers to cntx_t, actually). The first dimension is indexed
  over arch_t and the inner dimension is the ind_t (induced method) for
  each microarchitecture. When a microarchitecture (configuration) is
  "registered" at init-time, the inner array for that configuration in the
  2D array is initialized (and allocated, if it hasn't been already). The
  cntx_t slot for BLIS_NAT is initialized immediately and those for other
  induced method types are initialized and cached on-demand, as needed. At
  cntx_t registration, we also store function pointers to cntx_init functions
  that will initialize (a) "reference" contexts and (b) contexts for use with
  induced methods. We don't cache the full contexts for reference contexts
  since they are rarely needed. The functions that initialize these two kinds
  of contexts are generated automatically for each targeted sub-configuration
  from cpp-templatized code at compile-time. Induced method contexts that
  need "stage" adjustments can still obtain them via functions in
  bli_cntx_ind_stage.c.
- Added new functions and functionality to bli_cntx.c, such as for setting
  the level-1f, level-1v, and packm kernels, and for converting a native
  context into one for executing an induced method.
- Moved the checking of register/cache blocksize consistency from being cpp
  macros in bli_kernel_macro_defs.h to being runtime checks defined in
  bli_check.c and called from bli_gks_register_cntx() at the time that the
  global kernel structure's internal context is initialized for a given
  microarchitecture/configuration.
- Deprecated all of the old per-operation bli_*_cntx.c files and removed
  the previous operation-level cntx_t_init()/_finalize() invocations.
  Instead, we now query the gks for a suitable context, usually via
  bli_gks_query_cntx().
- Deprecated support for the 3m2 and 3m3 induced methods. (They required
  hackery that I was no longer willing to support.)
- Consolidated the 1e and 1r packm kernels for any given register blocksize
  into a single kernel that will branch on the schema and support packing
  to both formats.
- Added the cntx_t* argument to all packm kernel signatures.
- Deprecated the local function pointer array in all bli_packm_cxk*.c files
  and instead obtain the packm kernel from the cntx_t.
- Added bli_calloc_intl(), which serves as the calloc-equivalent to to
  bli_malloc_intl(). Useful when we wish to allocate and initialize to
  zero/NULL.
- Converted existing cpp macro functions defined in bli_blksz.h, bli_func.h,
  bli_cntx.h into static functions.
2017-10-18 13:29:32 -05:00
Nisanth M P
7be8870573 Merging "Adding auto hardware detection for Zen"
Change-Id: Id450fb0c4f91a5cd5cbdc06970f4f9ed28dd8520
2017-09-07 19:49:00 +05:30
Nisanth M P
57e1e5cd51 Merge AMD authored changes 2017-08-22 17:07:44 +05:30
Field G. Van Zee
5caaba2d61 Added --force-version=STRING option to configure.
Details:
- Added an option to configure that allows the user to force an arbitrary
  version string at configure-time. The help text also now describes the
  usage information.
- Changed the way the version string is communicated to the Makefile.
  Previously, it was read into the VERSION variable from the 'version' file
  via $(shell cat ...). Now, the VERSION variable is instead set in
  config.mk (via a configure-substituted anchor from config.mk.in).
2017-07-19 13:51:53 -05:00
prangana
9d93f8481a Update Licence File
Change-Id: I4c5cf1690d0cef92a68400f9a89e454ab6856ad2
2017-05-30 14:00:03 +05:30
Field G. Van Zee
6e04f9df01 Restored deleted lines from makefile fragments. 2017-05-17 13:03:52 -05:00
Devin Matthews
ec5c0c0448 Change to /bin/sh.
All scripts checked with Debian's checkbashisms. Also check for clang first in auto-detect.sh.
2017-05-17 12:29:44 -05:00
Devin Matthews
555ddc30d4 Remove shebangs from makefiles. 2017-05-17 12:27:14 -05:00
J M Dieterich
5fa4e9439c A bunch of shebang fixes from unportable /bin/bash to portable /usr/bin/env bash 2017-05-16 21:50:49 -04:00
Devin Matthews
8f9010542c Fix some problems with OSX builds:
- Update CPU detection for Intel archs (esp. Skylake)
- Allow clang for the reference config
2016-11-02 11:18:32 -05:00
Field G. Van Zee
50293da38d Avoid compiling BLAS/CBLAS files when disabled.
Details:
- Updated the top-level Makefile, build/config.mk.in template, and
  configure script so that object files corresponding to source files
  belonging to the BLAS compatibility layer are not compiled (or archived)
  when the compatibility layer is disabled. (Same for CBLAS.) Thanks
  to Devin Matthews for suggesting this optimization.
- Slight change to the way configure handles internal variables. Instead
  of converting (overwriting) some, such as enable_blas2blis and
  enable_cblas, from a "yes" or "no" to a "1" or "0" value, the latter are
  now stored in new variables that live alongside the originals (with the
  suffix "_01").  This is convenient since some values need to be
  sed-substituted into the config.mk.in template, which requires "yes" or
  "no", while some need to be written to the bli_config.h.in template,
  which requires "0" or "1".
2016-08-23 13:38:36 -05:00
Devin Matthews
0e1a9821d8 Add configure options and generate bli_config.h automatically.
Options to configure have been added for:
- Setting the internal BLIS and BLAS/CBLAS integer sizes.
- Enabling and disabling the BLAS and CBLAS layers.

Additionally, configure options which require defining macros (the above plus the threading model), write their macros to the automatically-generated bli_config.h file in the top-level build directory. The old bli_config.h files in the config dirs were removed, and any kernel-related macros (SIMD size and alignment etc.) were moved to bli_kernel.h. The Makefiles were also modified to find the new bli_config.h file.

Lastly, support for OMP in clang has been added (closes #56).
2016-04-19 11:44:37 -05:00