1132 Commits

Author SHA1 Message Date
Field G. Van Zee
3d1a5a7c08 Fixed printf() format overflow.
Details:
- Increased the length of operation name strings passed to xerbla_() in
  the level-2 and level-3 operation _check() functions, found in
  frame/compat/check. This avoids a format specifier overflow warning by
  gcc 7. Thanks to Dave Love for reporting this issue and suggesting the
  fix.
2018-03-16 12:24:07 -05:00
Field G. Van Zee
c73055f028 Return after non-zero info in BLAS checks.
Details:
- Previously, when calling the BLAS compatibility layer, discovering a
  parameter check failure would result in the proper setting of the
  info parameter (printed by xerbla_()), but would also come with an
  immediate abort() rather than a return. This was incorrect behavior
  for two overlapping reasons.
  (1) BLAS should return gracefully to the caller in the event of a
      bad set of parameters, not abort().
  (2) When BLIS was being tested via the BLAS testsuite, BLIS's
      xerbla_() would correctly get preempted/overridden by the
      xerbla_() in the BLAS testsuite, but execution would then
      erroneously continue on to the BLIS implementation with bad
      parameter values.
- The previous issue was addressed by disabling the abort() in BLIS's
  xerbla_(), changing all of the BLAS _check() functions to cpp macros,
  and adding a return statement to the end of each _check() macro's
  "if ( info != 0 )" conditional.
  Thanks to Dave Love for reporting this issue.
2018-03-15 16:08:21 -05:00
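The control-flow change in c73055f028 can be illustrated with a toy routine. This is only a sketch in Python, not the BLIS cpp macros: toy_gemv(), its two parameter checks, and the log argument are hypothetical and exist purely to show the early return after a non-zero info.

```python
def xerbla(name, info, log):
    """Stand-in error handler: records the bad parameter, never aborts."""
    log.append("%s: parameter %d had an illegal value" % (name, info))

def toy_gemv(m, n, log):
    """Hypothetical operation with a BLAS-style parameter check."""
    info = 0
    if m < 0:
        info = 1
    elif n < 0:
        info = 2
    if info != 0:
        xerbla("toy_gemv", info, log)
        return None  # return to the caller; do not fall through
    return m * n     # placeholder for the real computation

log = []
print(toy_gemv(-1, 4, log), log)
print(toy_gemv(3, 4, log))
```

Without the return inside the conditional, execution would continue into the placeholder computation with the bad arguments, which mirrors the second problem described in the commit.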
Field G. Van Zee
c4f1d18b97 Minor typo fix to printing arch in testsuite.
Details:
- Mistakenly was calling bli_cpuid_query_id() instead of
  bli_arch_query_id() in the recent addition to the testsuite output
  that prints the active sub-configuration. The former function is
  only used for multi-architecture builds, whereas the latter is the
  more general option that also works for single configuration
  (including 'configure auto') builds.
2018-03-14 19:10:09 -05:00
Devin Matthews
8f2fabec80 Make arm32 and arm64 families work. (#176) 2018-03-14 17:43:42 -05:00
Field G. Van Zee
fc6a184251 Print sub-configuration name in testsuite output.
Details:
- Added a line to the testsuite output that prints the name of the
  current/active sub-configuration. This is useful when linking the
  testsuite against multi-configuration builds because it confirms
  the sub-configuration that is actually being employed at runtime.
  Thanks to Devin Matthews for suggesting this feature.
2018-03-14 15:31:17 -05:00
Devin Matthews
9943a899d6 Merge pull request #173 from devinamatthews/dev
Fix Cortex-A9 and Cortex-A15 configs.
2018-03-14 13:27:44 -05:00
Devin Matthews
b1a15ae6ee Use BLIS_H_FLAT 2018-03-14 13:26:44 -05:00
Field G. Van Zee
290dd4a9fe Allow arbitrarily deep configuration families.
Details:
- Updated configure so that configuration families specified in the
  config_registry are no longer constrained as being only one level
  deep. For example, previously the x86_64 family could not be defined
  concisely in terms of, say, intel64 and amd64 families, and instead
  had to be defined as containing "haswell, sandybridge, penryn, zen,
  etc." In other words, families were constrained to only having
  singleton configurations as their members. That constraint is now
  lifted.
- Redefined x86_64 family in config_registry in terms of intel64 and
  amd64.
2018-03-14 13:15:37 -05:00
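The nested family expansion described in 290dd4a9fe can be sketched as a recursive flattening. This is a minimal Python illustration, not the configure script's actual implementation; the REGISTRY dictionary and expand_family() are hypothetical, and the member lists are abbreviated to the names mentioned in the commit message (which family each name belongs to is an assumption here).

```python
# Hypothetical, abbreviated registry: a name maps to its members.
# A name with no entry is treated as a singleton sub-configuration.
REGISTRY = {
    "x86_64": ["intel64", "amd64"],
    "intel64": ["haswell", "sandybridge", "penryn"],
    "amd64": ["zen"],
}

def expand_family(name, registry, _path=None):
    """Return the singleton sub-configurations reachable from name, in order."""
    path = set() if _path is None else _path
    if name in path:
        raise ValueError("cyclic family definition involving %r" % name)
    members = registry.get(name)
    if members is None:
        return [name]          # a singleton expands to itself
    path.add(name)
    result = []
    for member in members:
        for leaf in expand_family(member, registry, path):
            if leaf not in result:   # keep first occurrence only
                result.append(leaf)
    path.discard(name)
    return result

print(expand_family("x86_64", REGISTRY))
```

Because the function recurses on every member, a family may contain other families to any depth, which is the constraint the commit lifts.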
Devin Matthews
9cee78e006 Fix Cortex-A9 and Cortex-A15 configs.
Tested with QEMU.
2018-03-14 13:09:48 -05:00
Field G. Van Zee
1a3031740f Updates to ARM hardware detection support.
Details:
- Updated/clarified the ARM preprocessor macro branch of bli_cpuid.c.
  Going forward, cortexa57 (64-bit), cortexa15, and cortexa9 (32-bit)
  sub-configurations are supported. However, the functions that detect
  features specific to a15 and a9 are identical, and since a15 is tested
  first, it will always be chosen for arm32 hardware (even if both
  sub-configurations were enabled at configure-time and the library is
  linked and run on an a9). Thus, more work needs to be done to
  distinguish these two.
- Added cpp guard around x86_64 portions of bli_cpuid.c. Now, either
  the x86_64 or ARM code will be compiled (or neither, if neither
  environment is detected).
- In bli_arch_query_id(), call bli_cpuid_query_id() when the
  BLIS_FAMILY_ARM64 or BLIS_FAMILY_ARM32 macros are defined.
- Added arm64 and arm32 configuration families to config_registry.
- Added a note to the arch_t typedef enum in bli_type_defs.h reminding
  the developer to update the string array in bli_arch.c whenever new
  enum values are added or existing values are reordered.
2018-03-13 16:04:40 -05:00
Field G. Van Zee
1442d06886 Fixed misnamed kernels in _cntx_init_cortexa57.c.
Details:
- Changed incorrect kernel function names in bli_cntx_init_cortexa57.c:
    bli_sgemm_cortexa57_asm_8x12 -> bli_sgemm_armv8a_asm_8x12
    bli_dgemm_cortexa57_asm_6x8  -> bli_dgemm_armv8a_asm_6x8
  Thanks to Jacob Gorm Hansen for reporting this issue.
2018-03-11 16:59:50 -05:00
Field G. Van Zee
48da9f5805 Tweaked common.mk, Makefile, skx/knl make_defs.mk.
Details:
- Reorganized linker-related section of common.mk so that LDFLAGS set
  in a sub-configuration's make_defs.mk file will not be immediately
  (and erroneously) overridden by the default values.
- Re-enabled redirected (to file) output of the testsuite when run from
  the top-level Makefile via 'make test'. (For some reason, it was
  commented-out for the non-verbose case.)
- Removed old/unnecessary code from the make_defs.mk files of skx and
  knl sub-configurations.
2018-03-07 12:54:06 -06:00
Field G. Van Zee
8b0475a87d Fixed typo in attempted fix in 1a8350f7.
Details:
- Mistakenly entered 148 as knl mc blocksize for double real when the
  value should have been 144. Thanks to Dave Love for reporting this.
2018-03-06 06:39:44 -06:00
Field G. Van Zee
8912e6886b Fixed missing flags during shared object build.
Details:
- Fixed a bug in common.mk that caused warning, position-independent
  code, miscellaneous, and general preprocessor flags to be omitted
  from the configuration family-specific variables that hold those
  values, as registered by the family's make_defs.mk file. This would
  most obviously manifest when targeting a configuration family such as
  'intel64' while simultaneously configuring for a shared object build,
  as the key '-fPIC' flag would be omitted at compile-time and prevent
  successful linking. Thanks to Dave Love for reporting this bug.
- Other cleanups to common.mk for readability and clarity.
2018-03-05 18:00:45 -06:00
Field G. Van Zee
1a8350f705 Fixed cache blocksize bug in knl configuration.
Details:
- Changed the mc blocksize for double real execution in the knl sub-
  configuration from 160 to 148. The old value was not a multiple of
  mr (which is 24), and thus the safeguards in bli_gks_register_cntx()
  were tripping. Thanks to Dave Love for reporting this issue.
- Switch knl sub-configuration to use default blocksizes for datatypes
  not supported by native kernels.
- Fixed typos in bli_error.c that prevented certain error strings
  (which report maximum cache blocksizes not being multiples of their
  corresponding register blocksize) from properly initializing.
2018-03-05 13:32:00 -06:00
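The constraint behind 1a8350f705 (a cache blocksize mc must be a multiple of the register blocksize mr) is simple arithmetic. The Python sketch below checks the three knl double-real values that appear in this log; is_valid_mc() is a hypothetical helper, not the safeguard in bli_gks_register_cntx().

```python
MR = 24  # register blocksize mr for knl double real, per the commit message

def is_valid_mc(mc, mr=MR):
    """True when the cache blocksize mc is a whole multiple of mr."""
    return mc % mr == 0

for mc in (160, 148, 144):
    print(mc, mc % MR, is_valid_mc(mc))
```

Only 144 leaves a zero remainder, which is consistent with the follow-up correction in 8b0475a87d.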
Field G. Van Zee
c09fffa827 Added missing cntx_t* arg in knl packm kernels.
Details:
- Added the missing cntx_t* argument to the function signature of packm
  kernels in kernels/knl/1m/. Thanks to Dave Love for reporting this
  issue.
2018-03-03 13:13:39 -06:00
Field G. Van Zee
1ef9360b1f Enable non-unit vector stride tests by default.
Details:
- Change "vector storage schemes to test" parameter in testsuite's
  input.general file to "cj". This means that both unit stride column
  vectors and non-unit stride column vectors will be tested in
  operations with vector operands (e.g. level-1v, level-1f, level-2).
- Very minor comment (typo) changes to input.operations.
2018-03-01 14:36:39 -06:00
Field G. Van Zee
8c4e55a1a1 Added individual operation overrides in testsuite.
Details:
- Updated the testsuite driver so that setting one or more individual
  operation test switches to "2" in input.operations will enable ONLY
  those operations and disable all others, regardless of the values of
  the section overrides and other operation switches. This makes it
  very easy to quickly test only one or two operations, and equally
  easy to revert back to the previous combination of operation tests.
- Added more comments to input.operations describing the use of
  individual "enable only" overrides.
2018-02-28 17:01:47 -06:00
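The "enable only" rule from 8c4e55a1a1 can be sketched as follows. This Python fragment is an illustration under assumptions, not the testsuite driver: resolve_switches() is hypothetical, the meaning of 0 (off) and 1 (on) is assumed, and section overrides are left out for brevity.

```python
def resolve_switches(switches):
    """Map each operation name to True (run) or False (skip).

    switches maps an operation name to 0, 1, or 2. If any operation is
    set to 2, only the operations set to 2 are run.
    """
    only = {op for op, value in switches.items() if value == 2}
    if only:
        return {op: op in only for op in switches}
    return {op: value == 1 for op, value in switches.items()}

print(resolve_switches({"gemm": 2, "trsm": 1, "axpyv": 0}))
print(resolve_switches({"gemm": 1, "trsm": 1, "axpyv": 0}))
```

Changing the single 2 back to 1 restores the previous combination of tests, since the other switches were never edited.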
Field G. Van Zee
34862aed89 Use zen kernels in haswell sub-configuration.
Details:
- Register use of level-1v zen intrinsic kernels for amaxv, axpyv, dotv,
  dotxv, and scalv, as well as level-1f zen intrinsic kernels for axpyf
  and dotxf. This works because these kernels simply target AVX/AVX2,
  and therefore work without modification on haswell hardware.
- Switch to use of zen microkernels in bli_cntx_init_haswell.c. The zen
  kernels are essentially identical to those used by haswell, except that
  now zen kernels are a bit more up-to-date. In the future, I may
  continue to maintain duplicates, or I may keep the kernels named after
  one architecture (zen or haswell) but used by both sub-configurations.
- In config_registry, enable use of both haswell and zen kernels for the
  haswell sub-configuration. This is necessary in order to make zen
  kernels visible when registering kernels in bli_cntx_init_haswell.c.
- Enable use of assembly-based complex gemm microkernels for zen,
  bli_cgemm_zen_asm_3x8() and bli_zgemm_zen_asm_3x4(), in
  bli_cntx_init_zen.c. This was actually intended for 1681333.
2018-02-28 15:30:14 -06:00
Field G. Van Zee
d9079655c9 CHANGELOG update (0.3.0) 2018-02-23 17:42:48 -06:00
Field G. Van Zee
709f8361eb Version file update (0.3.0) 0.3.0 2018-02-23 17:42:48 -06:00
Field G. Van Zee
3defc7265c Applied 34b72a3 to non-active/unused microkernels.
Details:
- Applied the read-beyond-bounds bugfix in 34b72a3 to other haswell and
  zen kernels (ie: other microtile shapes) which are not used by default.
  This was done mostly in case someone decided to pick up these kernels
  and start using them, not because it affects BLIS's behavior
  out-of-the-box.
2018-02-23 17:38:19 -06:00
Field G. Van Zee
34b72a3517 Fixed obscure read-beyond-bounds bug in sgemm ukrs.
Details:
- Fixed an obscure bug in the bli_sgemm_haswell_asm_6x16 and
  bli_sgemm_zen_asm_6x16 microkernels when the input/output matrix C
  is stored with general stride (ie: both rs and cs are non-unit). The
  bug was rooted in the way those microkernels read from matrix C--
  namely, they used vmovlps/vmovhps instead of movss. By loading two
  floats at a time, even if one of them was treated as junk, the
  assembly code could be written in a more concise manner. However,
  under certain conditions--if m % mr == 0 and n % nr == 0 and the
  underlying matrix is not an internal "view" into a larger matrix--
  this could result in the very last vmovhps of the last (bottom-right)
  microkernel invocation reading beyond valid memory. Specifically, the
  low 32 bits read would always be valid, but the high 32 bits could
  reside beyond the bounds of the array in which the output C matrix is
  contained. To remedy this situation, we now selectively use movss to
  load any element that could be the last element in the matrix.
2018-02-23 16:33:32 -06:00
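The index arithmetic behind the bug in 34b72a3517 can be modeled without any assembly. The Python sketch below is an assumption-laden model, not the microkernel: it treats a paired load at element (i, j) as touching two consecutive floats, and the stride values and helper names are hypothetical.

```python
def offset(i, j, rs, cs):
    """Offset (in floats) of element (i, j) with row stride rs, column stride cs."""
    return i * rs + j * cs

def paired_load_in_bounds(i, j, rs, cs, buf_len):
    """A two-float load at (i, j) touches offset and offset + 1."""
    return offset(i, j, rs, cs) + 1 < buf_len

m, n = 6, 16      # a single 6x16 microtile
rs, cs = 2, 12    # both strides non-unit (general stride)
# The array ends exactly at the last element of C (not a view into a larger matrix).
buf_len = offset(m - 1, n - 1, rs, cs) + 1

print(paired_load_in_bounds(0, 0, rs, cs, buf_len))          # True
print(paired_load_in_bounds(m - 1, n - 1, rs, cs, buf_len))  # False
```

Only the load at the bottom-right element reaches past the end of the array, which is why the fix switches to a single-float load for any element that could be last.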
Field G. Van Zee
5112e1859e Added missing 'restrict' to some kernels' cntx_t*.
Details:
- Added missing 'restrict' keyword to cntx_t* argument of function
  signatures corresponding to level-1v, level-1f, and level-1m kernels.
  This affected bli_l1v_ker_prot.h, bli_l1f_ker_prot.h, and
  bli_l1m_ker_prot.h. (The 'restrict' was already being used to
  qualify cntx_t* arguments for kernels defined in bli_l3_ker_prot.h.)
- Added comments to bli_l1v_ker.h, bli_l1f_ker.h, bli_l1m_ker.h, and
  bli_l3_ukr.h that help explain how those headers function to produce
  kernel prototypes using the prototype macros defined in the files
  mentioned above.
2018-02-23 14:31:26 -06:00
Field G. Van Zee
1fa8af95d8 Merge branch 'rt' 2018-02-21 17:54:02 -06:00
Field G. Van Zee
c084b03b31 Merge branch 'rt' 2018-02-21 17:52:17 -06:00
Field G. Van Zee
16813335bd Merge branch 'amd' into rt
Details:
- Merged contributions made by AMD via 'amd' branch (see summary below).
  Special thanks to AMD for their contributions to-date, especially with
  regard to intrinsic- and assembly-based kernels.
- Added column storage output cases to microkernels in
  bli_gemm_zen_asm_d6x8.c and bli_gemmtrsm_l_zen_asm_d6x8.c. Even with
  the extra cost of transposing the microtile in registers, this is
  much faster than using the general storage case when the underlying
  matrix is column-stored.
- Added s and d assembly-based zen gemmtrsm_u microkernel (including
  column storage optimization mentioned above).
- Updated zen sub-configuration to reflect presence of new native
  kernels.
- Temporarily reverted zen sub-configuration's level-3 cache blocksizes
  to smaller haswell values.
- Temporarily disabled small matrix handling for zen configuration
  family in config/zen/bli_family_zen.h.
- Updated zen CFLAGS according to changes in 1e4365b.
- Updated haswell microkernels such that:
  - only one vzeroupper instruction is called prior to returning
  - movapd/movupd are used in lieu of movaps/movups for double-real
    microkernels. (Note that single-real microkernels still use
    movaps/movups.)
- Added kernel prototypes to kernels/zen/bli_kernels_zen.h, which is
  now included via frame/include/bli_arch_config.h.
- Minor updates to bli_amaxv_ref.c (and to inlined "test" implementation
  in testsuite/src/test_amaxv.c).
- Added early return for alpha == 0 in bli_dotxv_ref.c.
- Integrated changes from f07b176, including a fix for undefined
  behavior when executing the 1m method under certain conditions.
- Updated config_registry; no longer need haswell kernels for zen
  sub-configuration.
- Tweaked marginal and pass thresholds for dotxf.
- Reformatted level-1v, -1f, and -3 amd kernels and inserted additional
  comments.
- Updated LICENSE file to explicitly mention that parts are copyright
  UT-Austin and AMD.
- Added AMD copyright to header templates in build/templates.

Summary of previous changes from 'amd' branch.
- Added s and d assembly-based zen gemm microkernels (d6x8 and d8x6) and
  s and d assembly-based zen gemmtrsm_l microkernels (d6x8).
- Added s and d intrinsics-based zen kernels for amaxv, axpyv, dotv, dotxv,
  and scalv, with extra-unrolling variants for axpyv and scalv.
- Added a small matrix handler to bli_gemm_front(), with the handler
  implemented in kernels/zen/3/bli_gemm_small_matrix.c.
- Added additional logic to sumsqv that first attempts to compute the
  sum of the squares via dotv(). If there is a floating-point exception
  (FE_OVERFLOW), then the previous (numerically conservative) code is
  used; otherwise, the result of dotv() is square-rooted and stored as
  the result. This new implementation is only enabled when FE_OVERFLOW
  is #defined. If the macro is not #defined, then the previous
  implementation is used.
- Added axpyv and dotv standalone test drivers to test directory.
- Added zen support to old cpuid_x86.c driver in build/auto-detect/old.
- Added thread-local and __attribute__-related macros to bli_macro_defs.h.
2018-02-21 17:43:32 -06:00
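The sumsqv strategy listed in the 'amd' branch summary above (try a plain dot product first, fall back to the conservative code on overflow) can be sketched in Python. This is a stand-in under stated assumptions: overflow is detected by testing the result for infinity rather than by inspecting the FE_OVERFLOW floating-point flag, the fallback is a generic scaled sum rather than the BLIS code, and norm2_fast_then_safe() is a hypothetical name. Inputs are assumed to be floats.

```python
import math

def norm2_fast_then_safe(x):
    """Euclidean norm: plain dot product first, scaled fallback on overflow."""
    dot = sum(v * v for v in x)        # fast path; may overflow to inf
    if not math.isinf(dot):
        return math.sqrt(dot)
    # Conservative path: scale by the largest magnitude so squares stay <= 1.
    scale = max(abs(v) for v in x)
    if math.isinf(scale):
        return scale                   # an input was already infinite
    return scale * math.sqrt(sum((v / scale) ** 2 for v in x))

print(norm2_fast_then_safe([3.0, 4.0]))
print(norm2_fast_then_safe([1e200, 1e200]))
```

The second call overflows on the fast path (1e200 squared is not representable as a double) and is answered by the scaled path instead.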
Devin Matthews
5d03b6e6e1 Fix asm macro include line for KNL. Fixes #167. 2018-02-19 11:31:30 -06:00
Field G. Van Zee
f07b176c84 Fixed an obscure bug in the 1m implementation.
Details:
- Fixed a bug in the way the bli_gemm1m_cntx_ref() function (defined in
  ref_kernels/bli_cntx_ref.c) initializes its context for 1m execution.
  Previously, the function probed the context that was in the process of
  being updated for use with 1m--this context being previously
  initialized/copied from a native context--for its storage preference
  to determine which "variant" (row- or column-oriented) of 1m would be
  needed. However, the _cntx_ref() function was not updating the method
  field of the context until AFTER this query, and the conditional which
  depended on it, had taken place, meaning the storage preference query
  function would mistakenly think the context was for native execution,
  since the context's method field would still be set to BLIS_NAT. This
  would lead it to incorrectly grab the storage preference of the complex
  domain microkernel rather than the corresponding real domain
  microkernel, which could cause the storage preference predicate to
  evaluate to the wrong value, which would lead to the _cntx_ref()
  function choosing the wrong variant. This could lead to undefined
  behavior at runtime. The method is now explicitly set within the
  context prior to calling the storage preference query function.
- Updated comments in frame/ind/oapi/bli_l3_3m4m1m_oapi.c.
- Fixed a typo in the commented-out CFLAGS in config/zen/make_defs.mk,
  which are appropriate for gcc 6.x and newer. (Mistakenly used
  -march=bdver4 instead of -march=znver1.)
2018-02-15 18:36:54 -06:00
Field G. Van Zee
1f94bb7b96 Document how to enable zen-specific instructions.
Details:
- Added as a comment in config/zen/make_defs.mk the list of compiler flags
  that could be added to manually enable the instructions provided by the
  Zen microarchitecture that are not already implied by -march=bdver4.
  This information, along with the previous commit's flags to selectively
  disable Bulldozer instructions no longer present in Zen, was gathered
  from [1]. I hesitate to enable use of these instructions since I don't
  have any Zen hardware to test on yet.
  [1] https://wiki.gentoo.org/wiki/Ryzen
2018-01-19 12:46:53 -06:00
Field G. Van Zee
1e4365b21b Augment zen CFLAGS to prevent illegal instruction.
Details:
- Added various compiler flags (-mno-fma4 -mno-tbm -mno-xop -mno-lwp) so
  that compiling with -march=bdver4 on zen-based architectures does not
  result in an illegal instruction error at runtime. Note: This fix is
  only needed for gcc 5.4; gcc 6.3 or later supports the use of
  -march=znver1, which can be used in lieu of the augmented set of flags
  based on bdver4. Thanks to Nisanth Padinharepatt for reporting this
  error.
2018-01-18 12:03:51 -06:00
Field G. Van Zee
fa74af4e1f Minor labeling update for './configure -c' output.
Details:
- Print the name of the configuration in the output of the
  kernel-to-config map (and chosen pairs list) as a subtle way to remind
  the user that these only apply to the targeted configuration (whereas
  the config list and kernel list are printed without regard to which
  configuration was actually targeted).
2018-01-09 13:43:15 -06:00
Field G. Van Zee
5cdea756c7 Merge branch 'rt' 2018-01-07 19:45:20 -06:00
Devin Matthews
9d8858b5cf Merge pull request #164 from devinamatthews/master
Don't use memkind for skx configuration.
2018-01-07 10:03:25 -06:00
Devin Matthews
f7df64daf6 Don't use memkind for skx configuration. Fixes #163. 2018-01-07 09:37:25 -06:00
Field G. Van Zee
1e7a4896e0 Minor error handling in update-version-file.sh.
Details:
- Added explicit handling of situations when 'git describe --tags'
  returns an error. This command is used by update-version-file.sh
  when deciding whether or not to update the version file prior to
  configuration.
- Removed bli_packm.c and bli_unpackm.c, as they contained no source
  code.
2018-01-05 12:33:48 -06:00
Field G. Van Zee
0b3ca3cfb6 Intelligently select compiler for auto-detection.
Details:
- Rewrote code that selects the compiler for the purposes of compiling
  the auto-detection executable. CC (if specified) is tried first. Then
  gcc. Then clang. The absolute fallback is cc. The previous code was
  sort of broken, and seemed to unintentionally always use gcc.
- Moved various configuration-agnostic flags from config/*/make_defs.mk
  files to common.mk. The new mechanism appends the configuration-
  agnostic flags to the various compiler flag variables initialized in
  make_defs.mk. Flags specific to the sub-configuration are still set
  in make_defs.mk.
- Added -Wno-tautological-compare to CMISCFLAGS when clang is in use.
  Also added the flag to the compiler instantiation during configure-
  time hardware detection (when clang is selected).
- Added some missing (but mostly-optional) quotes to configure script.
2018-01-04 20:51:35 -06:00
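The compiler selection order from 0b3ca3cfb6 (CC if specified, then gcc, then clang, with cc as the absolute fallback) can be sketched as below. This Python version is illustrative only: the configure script is not written in Python, select_compiler() is hypothetical, and "tried" is assumed here to mean "found on PATH".

```python
import shutil

def select_compiler(cc=None, which=shutil.which):
    """Return the first usable compiler name, or "cc" as the last resort."""
    candidates = ([cc] if cc else []) + ["gcc", "clang"]
    for name in candidates:
        if which(name) is not None:
            return name
    return "cc"

# The lookup function is injectable, so the order can be checked
# without depending on which compilers are installed.
only_clang = lambda name: "/usr/bin/clang" if name == "clang" else None
print(select_compiler(which=only_clang))
print(select_compiler(cc="icc", which=lambda name: None))
```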
Nisanth M P
5a7005dd44 Merge changes in AMD beta release 0.95 into amd branch 2018-01-03 12:37:53 +05:30
Field G. Van Zee
0b9c5127e9 Enabled C99, added stdint.h to auto-detect build.
Details:
- Added "-std=c99" to compiler arguments when building auto-detection
  driver in configure script.
- Added #include <stdint.h> to all three source files needed by auto-
  detection program.
2017-12-23 15:53:44 -06:00
Field G. Van Zee
0ce5e19c31 Reimplemented configure-time hardware detection.
Details:
- Reimplemented the hardware detection functionality invoked when running
  "./configure auto". Previously, a standalone script in build/auto-detect
  that used CPUID was used. However, the script attempted to enumerate all
  models for each microarchitecture supported. The new approach recycles
  the same code used for runtime hardware detection introduced in 2c51356.
  This has two immediate benefits. First, it reduces and consolidates the
  code required to detect microarchitectures via the CPUID instruction.
  Second, it provides an indirect way of testing at configure-time the
  code that is used to detect hardware at runtime. This code is (a) only
  activated when targeting a configuration family (such as intel64 or
  amd64) at configure-time and (b) somewhat difficult to test in
  practice, since it relies on having access to older microarchitectures.
- The above change required placing conditional cpp macro blocks in
  bli_arch.c and bli_cpuid.c which either #include "blis.h" or #include
  a bare-bones set of headers that does not rely on the presence of a
  bli_config.h header. This is needed because bli_config.h has not been
  created yet when configure-time auto-detection takes place.
- Defined a new function in bli_arch.c, bli_arch_string(), which takes
  an arch_t id and returns a pointer to a string that contains the
  lowercase name of the corresponding microarchitecture. This function
  is used by the auto-detection script to printf() the name of the
  sub-configuration corresponding to the detected hardware.
2017-12-23 15:32:03 -06:00
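The id-to-name lookup that bli_arch_string() performs in 0ce5e19c31 amounts to indexing a string table with an enum value. The Python sketch below shows the idea only; the Arch members and their order are a hypothetical subset, not the real arch_t enum.

```python
from enum import IntEnum

class Arch(IntEnum):
    # Hypothetical subset of ids, for illustration.
    HASWELL = 0
    ZEN = 1
    KNL = 2
    GENERIC = 3

# Must list one lowercase name per id, in the same order as the enum.
ARCH_NAMES = ["haswell", "zen", "knl", "generic"]
assert len(ARCH_NAMES) == len(Arch)

def arch_string(arch_id):
    """Return the lowercase sub-configuration name for an Arch id."""
    return ARCH_NAMES[int(arch_id)]

print(arch_string(Arch.ZEN))
```

A table indexed this way silently returns the wrong name if the enum is reordered without updating the table, so the two have to be edited together.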
Field G. Van Zee
9804adfd40 Added option to disable pack buffer memory pools.
Details:
- Added a new configure option, --[en|dis]able-packbuf-pools, which will
  enable or disable the use of internal memory pools for managing buffers
  used for packing. When disabled, the function specified by the cpp
  macro BLIS_MALLOC_POOL is called whenever a packing buffer is needed
  (and BLIS_FREE_POOL is called when the buffer is ready to be released,
  usually at the end of a loop). When enabled, which was the status quo
  prior to this commit, a memory pool data structure is created and
  managed to provide threads with packing buffers. The memory pool
  minimizes calls to bli_malloc_pool() (i.e., the wrapper that calls
  BLIS_MALLOC_POOL), but does so through a somewhat more complex
  mechanism that may incur additional overhead in some (but not all)
  situations. The new option defaults to --enable-packbuf-pools.
- Removed the reinitialization of the memory pools from the level-3
  front-ends and replaced it with automatic reinitialization within the
  pool API's implementation. This required an extra argument to
  bli_pool_checkout_block() in the form of a requested size, but hides
  the complexity entirely from BLIS. And since bli_pool_checkout_block()
  is only ever called within a critical section, this change fixes a
  potential race condition in which threads using contexts with different
  cache blocksizes--most likely a heterogeneous environment--can check
  out pool blocks that are too small for the submatrices they wish to
  pack. Thanks to Nisanth Padinharepatt for reporting this potential
  issue.
- Removed several functions in light of the relocation of pool reinit,
  including bli_membrk_reinit_pools(), bli_memsys_reinit(),
  bli_pool_reinit_if(), and bli_check_requested_block_size_for_pool().
- Updated the testsuite to print whether the memory pools are enabled or
  disabled.
2017-12-21 19:22:57 -06:00
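The automatic reinitialization described in 9804adfd40 (the checkout call receives a requested size and the pool regrows itself inside the critical section) can be sketched with a toy pool. Everything here is hypothetical: BlockPool and its methods are illustrative Python, not the bli_pool API, and a bytearray stands in for a packing buffer.

```python
import threading

class BlockPool:
    """Toy pool of equally sized buffers that regrows on demand."""

    def __init__(self, block_size, num_blocks):
        self._lock = threading.Lock()
        self.block_size = block_size
        self._free = [bytearray(block_size) for _ in range(num_blocks)]

    def checkout(self, req_size):
        with self._lock:  # size check and checkout happen atomically
            if req_size > self.block_size:
                # Reinitialize: drop undersized blocks, adopt the larger size.
                self.block_size = req_size
                self._free = []
            if self._free:
                return self._free.pop()
            return bytearray(self.block_size)

    def checkin(self, block):
        with self._lock:
            if len(block) == self.block_size:
                self._free.append(block)
            # A stale, undersized block is simply discarded.

pool = BlockPool(block_size=64, num_blocks=2)
small = pool.checkout(32)
big = pool.checkout(128)
print(len(small), len(big), pool.block_size)
```

Because the size comparison happens under the same lock as the checkout, a caller can never receive a block smaller than it asked for, which is the race the commit closes.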
Field G. Van Zee
107801aaae Merge branch 'master' into selfinit 2017-12-18 16:29:28 -06:00
Field G. Van Zee
0084531d3e Updated flatten-headers.py for python3.
Details:
- Modified flatten-headers.py to work with python 3.x. This mostly
  amounted to removing print statements (which I replaced with calls
  to my_print(), a wrapper to sys.stdout.write()). Thanks to Stefan
  Husmann for pointing out the script's incompatibility with python 3.
- Other minor changes/cleanups.
2017-12-17 18:58:25 -06:00
Field G. Van Zee
90b11b79c3 Modest performance boost to flatten-headers.py.
Details:
- Updated flatten-headers.py to pre-compile the main regular expression
  used to isolate #include directives and the header filenames they
  reference. The compiled regex object is then used over and over on
  each header file in the tree of referenced headers. This appears to
  have provided a 1.7-2x performance increase in the best case.
- Other minor tweaks, such as renaming the main recursive function from
  replace_pass() to flatten_header().
2017-12-17 17:34:32 -06:00
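The optimization in 90b11b79c3 relies on compiling the regular expression once and reusing the compiled object. A minimal sketch follows; the pattern and find_includes() are hypothetical and are not taken from flatten-headers.py.

```python
import re

# Compiled once at import time, then reused for every line of every header.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s+"([^"]+)"')

def find_includes(text):
    """Return the quoted header names referenced by #include directives."""
    names = []
    for line in text.splitlines():
        match = INCLUDE_RE.match(line)
        if match:
            names.append(match.group(1))
    return names

sample = '#include "blis.h"\n  # include "bli_config.h"\nint x;\n#include <stdint.h>\n'
print(find_includes(sample))
```

This sketch matches only quoted includes and skips angle-bracket ones; that choice is part of the illustration, not a statement about what the real script does.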
Field G. Van Zee
99dee87f30 Reimplemented flatten-headers.sh in python.
Details:
- Added flatten-headers.py, a python implementation of the bash script
  flatten-headers.sh. The new script appears to be 25-100x faster,
  depending on the operating system, filesystem, etc. The python script
  abides by the same command line interface as its predecessor and
  targets python 2.7 or later. (Thanks to Devin Matthews for suggesting
  that I look into a python replacement for higher performance.)
- Activated use of flatten-headers.py in common.mk via the FLATTEN_H
  variable.
- Made minor tweaks to flatten-headers.sh such as spelling corrections
  in comments.
2017-12-17 16:47:27 -06:00
Field G. Van Zee
d9c0574599 Allow travis failures of OS X builds that run testsuite.
Details:
- Added an allowance for OS X builds that run the testsuite to fail.
  There seems to be an issue with 1m when running in Travis CI under
  OS X and clang, but only in double-precision. Haven't been able to
  reproduce the error on my own, and thus, I can't debug it. (Hopefully
  it is simply a version-specific compiler bug.)
2017-12-14 17:13:42 -06:00
Field G. Van Zee
86cd23b737 Fixed testsuite Makefile brokenness from 9091a207.
Details:
- Fixed a makefile error encountered when building the testsuite directly
  in its directory (as opposed to indirectly via 'make test'). The fix
  involves introducing a new variable, BUILD_PATH, alongside the existing
  DIST_PATH variable. By default, BUILD_PATH is set to the current
  directory, and is overridden by other Makefiles used by, for example,
  the testsuite and standalone test drivers in testsuite or test,
  respectively.
- Some files/directories in common.mk were redefined in terms of
  BUILD_PATH, such as the locations of the config.mk file and the
  intermediate include directory.
2017-12-14 15:47:41 -06:00
Field G. Van Zee
6a3a8924c0 Temporarily show Makefile's testsuite output.
Details:
- Disabled redirection of testsuite output for 'test' target. This is
  part of an attempt to debug a segmentation fault on OS X via Travis.
2017-12-14 13:20:02 -06:00
Field G. Van Zee
9a01080dd4 Merge branch 'master' into selfinit 2017-12-14 11:27:19 -06:00
Field G. Van Zee
a32e8a47c0 Added an exclusion to .travis.yml.
Details:
- Added exclusion for out-of-tree builds on OS X (clang).
2017-12-13 16:31:36 -06:00