Commit Graph

57 Commits

Author SHA1 Message Date
Field G. Van Zee
16813335bd Merge branch 'amd' into rt
Details:
- Merged contributions made by AMD via 'amd' branch (see summary below).
  Special thanks to AMD for their contributions to-date, especially with
  regard to intrinsic- and assembly-based kernels.
- Added column storage output cases to microkernels in
  bli_gemm_zen_asm_d6x8.c and bli_gemmtrsm_l_zen_asm_d6x8.c. Even with
  the extra cost of transposing the microtile in registers, this is
  much faster than using the general storage case when the underlying
  matrix is column-stored.
- Added s and d assembly-based zen gemmtrsm_u microkernel (including
  column storage optimization mentioned above).
- Updated zen sub-configuration to reflect presence of new native
  kernels.
- Temporarily reverted zen sub-configuration's level-3 cache blocksizes
  to smaller haswell values.
- Temporarily disabled small matrix handling for zen configuration
  family in config/zen/bli_family_zen.h.
- Updated zen CFLAGS according to changes in 1e4365b.
- Updated haswell microkernels such that:
  - only one vzeroupper instruction is called prior to returning
  - movapd/movupd are used in leiu of movaps/movups for double-real
    microkernels. (Note that single-real microkernels still use
    movaps/movups.)
- Added kernel prototypes to kernels/zen/bli_kernels_zen.h, which is
  now included via frame/include/bli_arch_config.h.
- Minor updates to bli_amaxv_ref.c (and to inlined "test" implementation
  in testsuite/src/test_amaxv.c).
- Added early return for alpha == 0 in bli_dotxv_ref.c.
- Integrated changes from f07b176, including a fix for undefined
  behavior when executing the 1m method under certain conditions.
- Updated config_registry; no longer need haswell kernels for zen
  sub-configuration.
- Tweaked marginal and pass thresholds for dotxf.
- Reformatted level-1v, -1f, and -3 amd kernels and inserted additional
  comments.
- Updated LICENSE file to explicitly mention that parts are copyright
  UT-Austin and AMD.
- Added AMD copyright to header templates in build/templates.

Summary of previous changes from 'amd' branch.
- Added s and d assembly-based zen gemm microkernels (d6x8 and d8x6) and
  s and d assembly-based zen gemmtrsm_l microkernels (d6x8).
- Added s and d intrinsics-based zen kernels for amaxv, axpyv, dotv, dotxv,
  and scalv, with extra-unrolling variants for axpyv and scalv.
- Added a small matrix handler to bli_gemm_front(), with the handler
  implemented in kernels/zen/3/bli_gemm_small_matrix.c.
- Added additional logic to sumsqv that first attempts to compute the
  sum of the squares via dotv(). If there is a floating-point exception
  (FE_OVERFLOW), then the previous (numerically conservative) code is
  used; otherwise, the result of dotv() is square-rooted and stored as
  the result. This new implementation is only enabled when FE_OVERFLOW
  is #defined. If the macro is not #defined, then the previous
  implementation is used.
- Added axpyv and dotv standalone test drivers to test directory.
- Added zen support to old cpuid_x86.c driver in build/auto-detect/old.
- Added thread-local and __attribute__-related macros to bli_macro_defs.h.
2018-02-21 17:43:32 -06:00
Field G. Van Zee
1e7a4896e0 Minor error handling in update-version-file.sh.
Details:
- Added explicit handling of situations when 'git describe --tags'
  returns an error. This command is used by update-version-file.sh
  when deciding whether or not to update the version file prior to
  configuration.
- Removed bli_packm.c and bli_unpackm.c, as they contained no source
  code.
2018-01-05 12:33:48 -06:00
Field G. Van Zee
0b9c5127e9 Enabled C99, added stdint.h to auto-detect build.
Details:
- Added "-std=c99" to compiler arguments when building auto-detection
  driver in configure script.
- Added #include <stdint.h> to all three source files needed by auto-
  detection program.
2017-12-23 15:53:44 -06:00
Field G. Van Zee
0ce5e19c31 Reimplemented configure-time hardware detection.
Details:
- Reimplemented the hardware detection functionality invoked when running
  "./configure auto". Previously, a standalone script in build/auto-detect
  that used CPUID was used. However, the script attempted to enumerate all
  models for each microarchitecture supported. The new approach recycles
  the same code used for runtime hardware detection introduced in 2c51356.
  This has two immediate benefits. First, it reduces and consolidates the
  code required to detect microarchitectures via the CPUID instruction.
  Second, it provides an indirect way of testing at configure-time the
  code that is used to detect hardware at runtime. This code is (a) only
  activated when targeting a configuration family (such as intel64 or
  amd64) at configure-time and (b) somewhat difficult to test in
  practice, since it relies on having access to older microarchitectures.
- The above change required placing conditional cpp macro blocks in
  bli_arch.c and bli_cpuid.c which either #include "blis.h" or #include
  a bare-bones set of headers that does not rely on the presence of a
  bli_config.h header. This is needed because bli_config.h has not been
  created yet when configure-time auto-detection takes places.
- Defined a new function in bli_arch.c, bli_arch_string(), which takes
  an arch_t id and returns a pointer to a string that contains the
  lowercase name of the corresponding microarchitecture. This function
  is used by the auto-detection script to printf() the name of the
  sub-configuration corresponding to the detected hardware.
2017-12-23 15:32:03 -06:00
Field G. Van Zee
9804adfd40 Added option to disable pack buffer memory pools.
Details:
- Added a new configure option, --[en|dis]able-packbuf-pools, which will
  enable or disable the use of internal memory pools for managing buffers
  used for packing. When disabled, the function specified by the cpp
  macro BLIS_MALLOC_POOL is called whenever a packing buffer is needed
  (and BLIS_FREE_POOL is called when the buffer is ready to be released,
  usually at the end of a loop). When enabled, which was the status quo
  prior to this commit, a memory pool data structure is created and
  managed to provide threads with packing buffers. The memory pool
  minimizes calls to bli_malloc_pool() (i.e., the wrapper that calls
  BLIS_MALLOC_POOL), but does so through a somewhat more complex
  mechanism that may incur additional overhead in some (but not all)
  situations. The new option defaults to --enable-packbuf-pools.
- Removed the reinitialization of the memory pools from the level-3
  front-ends and replaced it with automatic reinitialization within the
  pool API's implementation. This required an extra argument to
  bli_pool_checkout_block() in the form of a requested size, but hides
  the complexity entirely from BLIS. And since bli_pool_checkout_block()
  is only ever called within a critical section, this change fixes a
  potential race condition in which threads using contexts with different
  cache blocksizes--most likely a heterogeneous environment--can check
  out pool blocks that are too small for the submatrices it wishes to
  pack. Thanks to Nisanth Padinharepatt for reporting this potential
  issue.
- Removed several functions in light of the relocation of pool reinit,
  including bli_membrk_reinit_pools(), bli_memsys_reinit(),
  bli_pool_reinit_if(), and bli_check_requested_block_size_for_pool().
- Updated the testsuite to print whether the memory pools are enabled or
  disabled.
2017-12-21 19:22:57 -06:00
Field G. Van Zee
0084531d3e Updated flatten-headers.py for python3.
Details:
- Modifed flatten-headers.py to work with python 3.x. This mostly
  amounted to removing print statements (which I replaced with calls
  to my_print(), a wrapper to sys.stdout.write()). Thanks to Stefan
  Husmann for pointing out the script's incompatibility with python 3.
- Other minor changes/cleanups.
2017-12-17 18:58:25 -06:00
Field G. Van Zee
90b11b79c3 Modest performance boost to flatten-headers.py.
Details:
- Updated flatten-headers.py to pre-compile the main regular expression
  used to isolate #include directives and the header filenames they
  reference. The compiled regex object is then used over and over on
  each header file in the tree of referenced headers. This appears to
  have provided a 1.7-2x performance increase in the best case.
- Other minor tweaks, such as renaming the main recursive function from
  replace_pass() to flatten_header().
2017-12-17 17:34:32 -06:00
Field G. Van Zee
99dee87f30 Reimplemented flatten-headers.sh in python.
Details:
- Added flatten-headers.py, a python implementation of the bash script
  flatten-headers.sh. The new script appears to be 25-100x faster,
  depending on the operating system, filesystem, etc. The python script
  abides by the same command line interface as its predecessor and
  targets python 2.7 or later. (Thanks to Devin Matthews for suggesting
  that I look into a python replacement for higher performance.)
- Activated use of flatten-headers.py in common.mk via the FLATTEN_H
  variable.
- Made minor tweaks to flatten-headers.sh such as spelling corrections
  in comments.
2017-12-17 16:47:27 -06:00
Field G. Van Zee
6526d1d4ae Added temp_dir argument to flatten-headers.sh.
Details:
- Added "temp_dir" argument to flatten-headers.sh so that the caller can
  specify where intermediate files should be created as the script runs.
- Updated flatten-headers.sh to create intermediate files in temp_dir
  instead of alongside the corresponding source files. This should now
  (once again) allow out-of-tree builds where the BLIS distribution is
  read-only, or where the out-of-tree build is running concurrently with
  another out-of-tree build. (Thanks to Devin Matthews for pointing out
  the possibility of simultaneous out-of-tree builds.)
2017-12-12 13:50:43 -06:00
Field G. Van Zee
244a6f4e66 Fixed POSIX sed non-compliance in flatten-header.sh.
Details:
- Changed GNU usage of 'i' and 'a' sed commands used in flatten-header.sh
  to POSIX-compliant usage that will work on OS X's sed.
2017-11-28 17:48:48 -06:00
Field G. Van Zee
4507862167 Generate/compile with/install monolithic blis.h.
Details:
- Rewrote monolithify-header.sh (and renamed to flatten-header.sh) so that
  headers are inserted recursively. This improves performance by a factor
  of 3-4x.
- Modified configure to create an 'include/<configname>' directory in which
  make can create a monolithic header.
- Modified the top-level Makefile so that a monolithic header is generated
  unconditionally prior to compilation (stored in include/<configname>) and
  so that the single header is installed instead of the 450 or so header
  files that reside throughout the framework source tree.
- Added "include/*/*.h" to .gitignore file.
- Removed some pnacl/emscripten leftovers that I intended to include in
  a1caeba (mostly in testsuite/Makefile).
- Trivial comment changes to frame/include/bli_f2c.h.
2017-11-28 15:16:22 -06:00
Field G. Van Zee
ef024ce4ca More tweaks to monolithify-header.sh
Details:
- Further fixes monolithify-header.sh script.
- Removed unnecessary #include "blis.h" from frame/3/bli_l3_packm.h.
2017-11-20 18:08:29 -06:00
Field G. Van Zee
9df6dda9ec Improvements, bugfixes to monolithify-header.sh. 2017-11-18 19:03:26 -06:00
Field G. Van Zee
870597d166 Added bash script for creating monolithic headers.
Details:
- Added a new script, monolithify-header.sh, to the 'build' directory.
  This script recursively replaces all #include directives in a selected
  file with the contents of the header files referenced by each directive.
  The idea is to "flatten" a tree of .h files into a single file, with
  the script acting as a C preprocessor that only processes #include
  directives.
2017-11-17 17:06:42 -06:00
Field G. Van Zee
d5bf79e50b Miscellaneous tweaks and fixes.
Details:
- Fixed incorrect calling sequence in bli_cntx_init_knl.c--an instance of
  bli_blksz_init_easy() that should have been bli_blksz_init().
- Fixed a bug in code that is supposed to output the list of sub-directories
  in the 'config' directory when configure script is run with no arguments.
- Expanded the output of "make showconfig" to include more info from config.mk.
- Minor changes to build/auto-detect/cpuid_x86.c, mostly in preparation for
  someone to add excavator and zen support.
- Added a link to the ConfigurationHowTo wiki to config_registry.
- Other minor tweaks to configure.
2017-11-13 14:24:29 -06:00
Field G. Van Zee
07c352188b Added "generic" configuration.
Details:
- Added a "generic" configuration that leaves the default blocksizes and
  kernels unchanged. This replaces the older "reference" configuration.
  Updated auto-detect script and code accordingly.
- Added support for generic configuration to arch_t (bli_type_defs.h),
  bli_gks_init() (bli_gks.c), and bli_arch_config.h
- Moved bli_arch_query_id() to bli_arch.c (and prototype to bli_arch.h).
- Whitespace changes to configurations' make_defs.mk files.
2017-10-23 16:59:22 -05:00
Field G. Van Zee
3c269f700d Makefile updates for test drivers, testsuite.
Details:
- Fixed semi-broken testsuite Makefile and very-broken test driver Makefiles,
  as well as those for test/3m4m, test/thread_ranges, and test/exec_sizes
  sub-directories.
- Factored out much of the top-level Makefile into common.mk. A Makefile
  needs only set DIST_PATH to the relative path to the top level of the
  BLIS source distribution before including common.mk in order to acquire
  all of the definitions typically needed in a Makefile that tests BLIS.
2017-10-20 13:57:21 -05:00
Field G. Van Zee
453deb2906 Implemented runtime kernel management.
Details:
- Reworked the build system around a configuration registry file, named
  config_registry', that identifies valid configuration targets, their
  constituent sub-configurations, and the kernel sets that are needed by
  those sub-configurations. The build system now facilitates the building
  of a single library that can contains kernels and cache/register
  blocksizes for multiple configurations (microarchitectures). Reference
  kernels are also built on a per-configuration basis.
- Updated the Makefile to use new variables set by configure via the
  config.mk.in template, such as CONFIG_LIST, KERNEL_LIST, and KCONFIG_MAP,
  in determining which sub-configurations (CONFIG_LIST) and kernel sets
  (KERNEL_LIST) are included in the library, and which make_defs.mk files'
  CFLAGS (KCONFIG_MAP) are used when compiling kernels.
- Reorganized 'kernels' directory into a "flat" structure. Renamed kernel
  functions into a standard format that includes the kernel set name
  (e.g. 'haswell'). Created a "bli_kernels_<kernelset>.h" file in each
  kernels sub-directory. These files exist to provide prototypes for the
  kernels present in those directories.
- Reorganized reference kernels into a top-level 'ref_kernels' directory.
  This directory includes a new source file, bli_cntx_ref.c (compiled on
  a per-configuration basis), that defines the code needed to initialize
  a reference context and a context for induced methods for the
  microarchitecture in question.
- Rewrote make_defs.mk files in each configuration so that the compiler
  variables (e.g. CFLAGS) are "stored" (renamed) on a per-configuration
  basis.
- Modified bli_config.h.in template so that bli_config.h is generated with
  #defines for the config (family) name, the sub-configurations that are
  associated with the family, and the kernel sets needed by those
  sub-configurations.
- Deprecated all kernel-related information in bli_kernel.h and transferred
  what remains to new header files named "bli_arch_<configname>.h", which
  are conditionally #included from a new header bli_arch.h. These files
  are still needed to set library-wide parameters such as custom
  malloc()/free() functions or SIMD alignment values.
- Added bli_cntx_init_<configname>.c files to each configuration directory.
  The files contain a function, named the same as the file, that initializes
  a "native" context for a particular configuration (microarchitecture). The
  idea is that optimized kernels, if available, will be initialized into
  these contexts. Other fields will retain pointers to reference functions,
  which will be compiled on a per-configuration basis. These bli_cntx_init_*()
  functions will be called during the initialization of the global kernel
  structure. They are thought of as initializing for "native" execution, but
  they also form the basis for contexts that use induced methods. These
  functions are prototyped, along with their _ref() and _ind() brethren, by
  prototype-generating macros in bli_arch.h.
- Added a new typedef enum in bli_type_defs.h to define an arch_t, which
  identifies the various sub-configurations.
- Redesigned the global kernel structure (gks) around a 2D array of cntx_t
  structures (pointers to cntx_t, actually). The first dimension is indexed
  over arch_t and the inner dimension is the ind_t (induced method) for
  each microarchitecture. When a microarchitecture (configuration) is
  "registered" at init-time, the inner array for that configuration in the
  2D array is initialized (and allocated, if it hasn't been already). The
  cntx_t slot for BLIS_NAT is initialized immediately and those for other
  induced method types are initialized and cached on-demand, as needed. At
  cntx_t registration, we also store function pointers to cntx_init functions
  that will initialize (a) "reference" contexts and (b) contexts for use with
  induced methods. We don't cache the full contexts for reference contexts
  since they are rarely needed. The functions that initialize these two kinds
  of contexts are generated automatically for each targeted sub-configuration
  from cpp-templatized code at compile-time. Induced method contexts that
  need "stage" adjustments can still obtain them via functions in
  bli_cntx_ind_stage.c.
- Added new functions and functionality to bli_cntx.c, such as for setting
  the level-1f, level-1v, and packm kernels, and for converting a native
  context into one for executing an induced method.
- Moved the checking of register/cache blocksize consistency from being cpp
  macros in bli_kernel_macro_defs.h to being runtime checks defined in
  bli_check.c and called from bli_gks_register_cntx() at the time that the
  global kernel structure's internal context is initialized for a given
  microarchitecture/configuration.
- Deprecated all of the old per-operation bli_*_cntx.c files and removed
  the previous operation-level cntx_t_init()/_finalize() invocations.
  Instead, we now query the gks for a suitable context, usually via
  bli_gks_query_cntx().
- Deprecated support for the 3m2 and 3m3 induced methods. (They required
  hackery that I was no longer willing to support.)
- Consolidated the 1e and 1r packm kernels for any given register blocksize
  into a single kernel that will branch on the schema and support packing
  to both formats.
- Added the cntx_t* argument to all packm kernel signatures.
- Deprecated the local function pointer array in all bli_packm_cxk*.c files
  and instead obtain the packm kernel from the cntx_t.
- Added bli_calloc_intl(), which serves as the calloc-equivalent to to
  bli_malloc_intl(). Useful when we wish to allocate and initialize to
  zero/NULL.
- Converted existing cpp macro functions defined in bli_blksz.h, bli_func.h,
  bli_cntx.h into static functions.
2017-10-18 13:29:32 -05:00
Nisanth M P
7be8870573 Merging "Adding auto hardware detection for Zen"
Change-Id: Id450fb0c4f91a5cd5cbdc06970f4f9ed28dd8520
2017-09-07 19:49:00 +05:30
Nisanth M P
57e1e5cd51 Merge AMD authored changes 2017-08-22 17:07:44 +05:30
Field G. Van Zee
5caaba2d61 Added --force-version=STRING option to configure.
Details:
- Added an option to configure that allows the user to force an arbitrary
  version string at configure-time. The help text also now describes the
  usage information.
- Changed the way the version string is communicated to the Makefile.
  Previously, it was read into the VERSION variable from the 'version' file
  via $(shell cat ...). Now, the VERSION variable is instead set in
  config.mk (via a configure-substituted anchor from config.mk.in).
2017-07-19 13:51:53 -05:00
prangana
9d93f8481a Update Licence File
Change-Id: I4c5cf1690d0cef92a68400f9a89e454ab6856ad2
2017-05-30 14:00:03 +05:30
Field G. Van Zee
6e04f9df01 Restored deleted lines from makefile fragments. 2017-05-17 13:03:52 -05:00
Devin Matthews
ec5c0c0448 Change to /bin/sh.
All scripts checked with Debian's checkbashisms. Also check for clang first in auto-detect.sh.
2017-05-17 12:29:44 -05:00
Devin Matthews
555ddc30d4 Remove shebangs from makefiles. 2017-05-17 12:27:14 -05:00
J M Dieterich
5fa4e9439c A bunch of shebang fixes from unportable /bin/bash to portable /usr/bin/env bash 2017-05-16 21:50:49 -04:00
Devin Matthews
8f9010542c Fix some problems with OSX builds:
- Update CPU detection for Intel archs (esp. Skylake)
- Allow clang for the reference config
2016-11-02 11:18:32 -05:00
Field G. Van Zee
50293da38d Avoid compiling BLAS/CBLAS files when disabled.
Details:
- Updated the top-level Makefile, build/config.mk.in template, and
  configure script so that object files corresponding to source files
  belonging to the BLAS compatibility layer are not compiled (or archived)
  when the compatibility layer is disabled. (Same for CBLAS.) Thanks
  to Devin Matthews for suggesting this optimization.
- Slight change to the way configure handles internal variables. Instead
  of converting (overwriting) some, such as enable_blas2blis and
  enable_cblas, from a "yes" or "no" to a "1" or "0" value, the latter are
  now stored in new variables that live alongside the originals (with the
  suffix "_01").  This is convenient since some values need to be
  sed-substituted into the config.mk.in template, which requires "yes" or
  "no", while some need to be written to the bli_config.h.in template,
  which requires "0" or "1".
2016-08-23 13:38:36 -05:00
Devin Matthews
0e1a9821d8 Add configure options and generate bli_config.h automatically.
Options to configure have been added for:
- Setting the internal BLIS and BLAS/CBLAS integer sizes.
- Enabling and disabling the BLAS and CBLAS layers.

Additionally, configure options which require defining macros (the above plus the threading model), write their macros to the automatically-generated bli_config.h file in the top-level build directory. The old bli_config.h files in the config dirs were removed, and any kernel-related macros (SIMD size and alignment etc.) were moved to bli_kernel.h. The Makefiles were also modified to find the new bli_config.h file.

Lastly, support for OMP in clang has been added (closes #56).
2016-04-19 11:44:37 -05:00
Field G. Van Zee
4320b725a1 Use kernel CFLAGS on "ukernels" directories.
Details:
- Updated the top-level Makefile so that the CFLAGS variable designated
  for kernel source code is applied not only to source code in
  directories named "kernels" but source code in any directory that
  contains the substring "kernels", such as "ukernels".
- Formally disabled some code in gen-make-frag.sh script that was already
  effectively disabled. The code was related to handling "noopt" and
  "kernel" directories, which is now handled independently within the
  top-level Makefile without needing to place these source files into
  a spearate makefile variable.
2016-04-14 12:51:29 -05:00
Devin Matthews
0171ad5899 Add icc and clang support for Intel architectures, fixes #47. 2bd036f fixes #49 BTW. 2016-03-28 13:55:06 -05:00
Devin Matthews
76099f20be Add threading option to configure. 2016-03-25 17:22:58 -05:00
Devin Matthews
9452bdb3af Add options for verbose make output and static/shared linking to configure. 2016-03-25 14:59:50 -05:00
Devin Matthews
d226dfa051 Add several changes to the build system.
1) Add -- options.
2) Add -d/--enable-debug option to enable debugging symbols with and without optimization.
3) Allow user to specify CC at configure time, and determine vendor (gcc/icc/etc.). For now configurations enforce a particular vendor.
4) Add make V=[0,1] option to control build verbosity.
2016-03-05 16:18:14 -06:00
Devin Matthews
63e2642390 Make sure that -lrt is linked on Linux. 2016-03-04 13:17:50 -06:00
Zhang Xianyi
4f88c29f9e Detect Intel Broadwell (using Haswell config). 2015-10-14 12:57:50 -05:00
Zhang Xianyi
12ffd568b0 Add Travis CI. 2015-08-22 00:24:28 +08:00
Field G. Van Zee
fdfe14f1e1 Added support for Intel Haswell/Broadwell.
Details:
- Added sgemm and dgemm micro-kernels, which employ 256-bit AVX vectors
  and FMA instructions. (Complex support is currently provided by default
  induced method, 4m1a.)
- Added a 'haswell' configuration, which uses the aforementioned kernels.
- Inserted auto-detection support for haswell configuration in
  build/auto-detect/cpuid_x86.c.
- Modified configure script to explicitly echo when automatic or manual
  configuration is in progress.
- Changed beta scalar in test_gemm.c module of test suite to -1.0 to 0.9.
2015-07-09 13:52:39 -05:00
Zhang Xianyi
4bfd1ce8ca Detect NEON for cortex-a9 and cortex-a15. 2015-04-02 16:40:21 -05:00
Zhang Xianyi
aa6eec4f43 Detect the CPU architecture. Support ARM cores.
Detect the CPU architecture by compiler's predefined macros.
Then, detect the CPU cores.

Support detecting x86 and ARM architectures.
2015-04-02 16:09:02 -05:00
Zhang Xianyi
2947cfb749 Add auto-detecting CPU on configure stage.
e.g.  /Path_to_BLIS/configure auto

Now, it only support detecting x86 CPUs.
2015-04-01 12:24:00 -05:00
Field G. Van Zee
b97fa9a5a7 Minor usage update to build/bump-version.sh. 2014-07-27 18:54:09 -05:00
Field G. Van Zee
7ed415824d Updated copyright headers (continued).
Details:
- Inserted "at Austin" into third clause of license declarations.
  Meant to include this change in previous commit.
2014-07-14 16:14:33 -05:00
Field G. Van Zee
5c2c6c8561 Updated copyright headers to contain "at Austin".
Details:
- Updated copyright headers to include "at Austin" in the name of the
  University of Texas.
- Updated the copyright years of a few headers to 2014 (from 2011 and
  2012).
2014-07-14 16:05:03 -05:00
Field G. Van Zee
570a154581 Comment/formatting updates to build scripts.
Details:
- Minor updates to comments and formatting in bump-version.sh and
  update-version-file.sh scripts.
2014-07-12 17:51:05 -05:00
Field G. Van Zee
b80df0f2cf Added bump-version.sh script to 'build' directory.
Details:
- Added a bash script, bump-version.sh, to aid in incrementing the BLIS
  version string.
2014-06-23 13:52:39 -05:00
Field G. Van Zee
89c76a8a51 Allow building outside source distribution.
Details:
- Modified build system (mostly configure and top-level Makefile) so that
  a user can build a BLIS library outside of the top-level directory of
  the source distribution.
- Added "test" target to Makefile so that the user can run "make test",
  which will compile, link, and run the testsuite binary. This works even
  if the build directory is externally located, thanks to the test suite
  binary's new -g and -o command-line options. Also, when creating the
  test suite via the top-level Makefile, the linking is against the
  local archive, in lib/<configname>, rather than at <install_prefix>/lib.
- Modified testsuite/Makefile so that it links against the library built
  locally, in ../lib/<configname>.
- Added "-lm" to LDFLAGS of most configurations' make_defs.mk.
- Various other cleanups to build system.
2014-01-09 12:08:37 -06:00
Field G. Van Zee
2cb13600f9 Updated year in copyright headers to 2014. 2014-01-03 12:29:13 -06:00
Field G. Van Zee
5b641c3bab Use separate CFLAGS for "kernels" directories.
Details:
- Added a new "special" directory type: any source code within directories
  named "kernels" will be compiled with a separate CFLAGS_KERNELS set of
  compiler flags. This allows the developer to specify a separate set of
  flags (e.g. optimization flags) for compiling kernels while maintaining a
  standard set for regular framework code.
- Fixed a bug in the top-level Makefile that was causing "noopt" code
  to be compiled with the standard set of compilation flags.
- Updated make_defs.mk in reference, flame, and clarksville configurations
  according to above changes.
2013-06-12 16:02:12 -05:00
Field G. Van Zee
d1023bfbc6 Removed build/old directory. 2013-03-22 15:09:59 -05:00