Commit Graph

901 Commits

Author SHA1 Message Date
Field G. Van Zee
70a64432ee Fixed off-by-one indexing in bli_cpuid.c.
Details:
- In bli_cpuid.c, fixed an off-by-one indexing statement in vpu_count()
  whereby a string-terminating NUL character, '\0', is written beyond
  the bounds of the model_num string.
- Minor whitespace and formatting edits to bli_cpuid.c.
2017-12-11 13:14:20 -06:00
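
The kind of off-by-one write described in commit 70a64432ee can be illustrated with a minimal sketch. The helper name and buffer below are hypothetical and are not taken from bli_cpuid.c; the sketch only shows that a buffer of N bytes holds at most N-1 characters, so the terminator must land at index N-1 or lower.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not the BLIS code): copy at most dst_size - 1
   characters from src into dst and always terminate inside the buffer.
   Returns the number of characters copied, excluding the terminator. */
static size_t sketch_copy_truncated(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0) return 0;

    size_t n = strlen(src);
    if (n > dst_size - 1) n = dst_size - 1;

    memcpy(dst, src, n);

    /* The terminator goes at index n, which is at most dst_size - 1.
       Writing it at dst[dst_size] would be the off-by-one error. */
    dst[n] = '\0';
    return n;
}
```

With a 4-byte buffer, for example, the helper keeps three characters and places the terminator in the fourth byte.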
Field G. Van Zee
87978f6261 Fixed broken out-of-tree builds since 52f9e6f.
Details:
- Added missing $(DIST_PATH)/ prefix to relative path to flatten-headers.sh
  script in common.mk so that the script could be found during out-of-tree
  builds. Thanks to Devin Matthews for reporting this bug.
2017-12-11 12:49:03 -06:00
Field G. Van Zee
513ef4d040 Various typecasting fixes, mis-typed enums, etc.
Details:
- Fixed implicit typecasting of conj_t to trans_t in bli_[un]packm_cxk.c.
- Properly typecast integer arguments to match format specifier in various
  calls to printf() in bli_l3_thrinfo.c, bli_cntx.c, bli_pool.c, and
  bli_util_oapi.c.
- Fixed "unsigned less-than-comparison with zero" checks in bli_check.c,
  bli_cntx.h.
- Fixed mis-typed enums in bli_cntx.c (e.g., l1mkr_t that should have been
  l1fkr_t or l1vkr_t).
- Fixed instances of opid_t value BLIS_GEMM that should have been l3ukr_t
  value BLIS_GEMM_UKR in bli_cntx_ref.c.
- NOTE: These issues were identified via compiler warnings when building
  BLIS with clang on a rather old installation of OS X:
    $ clang --version
    Apple LLVM version 5.0 (clang-500.2.79) (based on LLVM 3.3svn)
    Target: x86_64-apple-darwin15.2.0
    Thread model: posix
2017-12-11 12:35:59 -06:00
Field G. Van Zee
b150870397 Removed most "old" directories.
Details:
- Removed the vast majority of directories named "old", which contained
  deprecated code that I wasn't quite ready to jettison from the source
  tree.
2017-12-08 16:08:41 -06:00
Field G. Van Zee
270c65985d Modified bli_getopt() for thread-safety.
Details:
- Changed the interface of bli_getopt() to take a new argument, a getopt_t
  struct, that stores the values of optarg, optind, opterr, and optopt,
  and updated the implementation accordingly. (Previously, these
  variables were assumed to be global.)
- Added a function for initializing a getopt_t struct.
- Changed test_libblis.c--currently the only consumer of bli_getopt()--to
  utilize the new getopt_t state object.
2017-12-08 15:21:18 -06:00
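
The idea behind commit 270c65985d, moving the parser state out of globals and into a caller-owned struct, might look roughly like the sketch below. Everything here is an assumption made for illustration: the names sketch_getopt_t, sketch_getopt_init(), and sketch_getopt() are hypothetical, the parser handles only separate single-letter options, and none of it reflects the actual bli_getopt() signature or implementation.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical state object; the field names mirror the classic getopt
   globals listed in the commit message. */
typedef struct
{
    const char *optarg;
    int         optind;
    int         opterr;
    int         optopt;
} sketch_getopt_t;

static void sketch_getopt_init(sketch_getopt_t *st)
{
    st->optarg = NULL;
    st->optind = 1;   /* skip argv[0] */
    st->opterr = 0;
    st->optopt = 0;
}

/* Returns the option character, '?' for an unknown option or a missing
   argument, or -1 when no options remain. Only arguments of the form
   "-x" (optionally followed by a separate value) are recognized. */
static int sketch_getopt(int argc, char *const argv[], const char *optstring,
                         sketch_getopt_t *st)
{
    st->optarg = NULL;
    if (st->optind >= argc) return -1;

    const char *arg = argv[st->optind];
    if (arg[0] != '-' || arg[1] == '\0' || arg[2] != '\0') return -1;

    int c = (unsigned char)arg[1];
    const char *spec = (c == ':') ? NULL : strchr(optstring, c);
    st->optind++;

    if (spec == NULL) { st->optopt = c; return '?'; }

    if (spec[1] == ':')
    {
        /* The option requires a value in the next argv slot. */
        if (st->optind >= argc) { st->optopt = c; return '?'; }
        st->optarg = argv[st->optind++];
    }
    return c;
}
```

Because all mutable state lives in the struct, two threads can parse different argument vectors at the same time as long as each uses its own state object.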
Field G. Van Zee
ce4d8fabc2 Merge branch 'master' of github.com:flame/blis 2017-12-07 17:36:44 -06:00
Field G. Van Zee
39be59f2a8 Replaced several macros with static function APIs.
Details:
- Reimplemented several sets of get/set-style preprocessor macros with
  static functions, including those in the following frame/base headers:
  auxinfo, cntl, mbool, mem, membrk, opid, and pool. A few headers in
  frame/thread were touched as well: mutex_*, thrcomm, and thrinfo.
2017-12-07 17:35:20 -06:00
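
As a rough illustration of the change in commit 39be59f2a8, the sketch below contrasts a get-style macro with equivalent static functions. The struct and function names are hypothetical and do not reproduce the real BLIS types; `static inline` (C99) is used here as one possible way to write such accessors.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical struct for illustration; not the actual BLIS mem_t layout. */
typedef struct
{
    void  *buf;
    size_t size;
} sketch_mem_t;

/* Macro style: the argument is not type-checked.
   #define sketch_mem_size( m )  ( (m)->size )                         */

/* Static-function style: argument and return types are checked by the
   compiler, and the accessors are visible to a debugger. */
static inline size_t sketch_mem_size(const sketch_mem_t *m)
{
    return m->size;
}

static inline void sketch_mem_set_size(sketch_mem_t *m, size_t size)
{
    m->size = size;
}

static inline bool sketch_mem_is_alloc(const sketch_mem_t *m)
{
    return m->buf != NULL;
}
```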
dnp
e05a8dfa7c Merge branch 'rt' 2017-12-06 16:45:24 -06:00
dnp
4423e33dc5 Adding SKX kernels and configuration. 2017-12-06 16:35:03 -06:00
Field G. Van Zee
79507337e1 Various checks to ensure that arch_t id is in range.
Details:
- Expanded checking of the arch_t id in bli_gks.c--either passed in from
  the caller or as returned from bli_arch_query_id()--against the expected
  range of id values. Thanks to Devangi Parikh for suggesting these
  additional sanity checks.
2017-12-06 16:21:35 -06:00
Field G. Van Zee
fde7c1126c Added 'uninstall-old-headers' target to Makefile.
Details:
- Defined a new 'uninstall-old-headers' target that allows users of BLIS to
  uninstall no-longer-needed headers left over from previous installations.
- Fixed the 'uninstall-old' target so that it will uninstall both .a and .so
  libraries.
- Renamed 'uninstall-old' to 'uninstall-old-libs'.
- Added 'uninstall-old' target (different from previous 'uninstall-old'
  target) that combines 'uninstall-old-libs' and 'uninstall-old-headers'.
2017-12-04 16:11:01 -06:00
Field G. Van Zee
d4ee770bde Create/install monolithic cblas.h.
Details:
- When CBLAS is enabled at configure-time, BLIS now creates a monolithic
  cblas.h using the same flatten-header.sh script that was recently
  introduced for creating monolithic blis.h header files. The top-level
  Makefile will also install this cblas.h file into the install prefix
  alongside blis.h when the 'install' target is invoked. The two header
  files are compatible with one another. Regardless of whether the user's
  source #includes cblas.h, both blis.h and cblas.h, or just blis.h,
  the user will get the CBLAS function prototypes and enums, as expected.
2017-12-04 14:53:43 -06:00
Field G. Van Zee
52f9e6f1b6 Merge branch 'rt' 2017-12-01 12:28:09 -06:00
Field G. Van Zee
21360dd8e2 Fixed cntx_t packm query when ker_id > _NUM_PACKM_KERS.
Details:
- Fixed a subtle bug in bli_cntx_get_[un]packm_ker_dt() in which the
  function fails to return NULL when passed a kernel id argument that is
  equal to or beyond BLIS_NUM_[UN]PACKM_KERS. Instead, the function was
  attempting to index into the cntx_t's packm kernel array, which resulted
  in undefined behavior. Thanks to Devangi Parikh for finding this bug.
2017-11-29 14:11:34 -06:00
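
The query behavior described in commit 21360dd8e2 (return NULL for an out-of-range kernel id instead of indexing the array) can be sketched as follows. The type names, the kernel count, and the function names are hypothetical stand-ins, not the actual cntx_t definitions.

```c
#include <stddef.h>

enum { SKETCH_NUM_KERS = 4 };          /* hypothetical kernel count */

typedef void (*sketch_ker_fp)(void);

typedef struct
{
    sketch_ker_fp kers[SKETCH_NUM_KERS];
} sketch_cntx_t;

/* Placeholder kernel used only to populate the table in examples. */
static void sketch_noop_ker(void) {}

/* Return the kernel for ker_id, or NULL if ker_id is equal to or
   beyond the number of kernel slots. The check must come before the
   array access; indexing first would read out of bounds. */
static sketch_ker_fp sketch_get_ker(const sketch_cntx_t *cntx, unsigned int ker_id)
{
    if (ker_id >= SKETCH_NUM_KERS) return NULL;
    return cntx->kers[ker_id];
}
```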
Field G. Van Zee
244a6f4e66 Fixed POSIX sed non-compliance in flatten-header.sh.
Details:
- Changed GNU usage of 'i' and 'a' sed commands used in flatten-header.sh
  to POSIX-compliant usage that will work on OS X's sed.
2017-11-28 17:48:48 -06:00
Field G. Van Zee
4507862167 Generate/compile with/install monolithic blis.h.
Details:
- Rewrote monolithify-header.sh (and renamed to flatten-header.sh) so that
  headers are inserted recursively. This improves performance by a factor
  of 3-4x.
- Modified configure to create an 'include/<configname>' directory in which
  make can create a monolithic header.
- Modified the top-level Makefile so that a monolithic header is generated
  unconditionally prior to compilation (stored in include/<configname>) and
  so that the single header is installed instead of the 450 or so header
  files that reside throughout the framework source tree.
- Added "include/*/*.h" to .gitignore file.
- Removed some pnacl/emscripten leftovers that I intended to include in
  a1caeba (mostly in testsuite/Makefile).
- Trivial comment changes to frame/include/bli_f2c.h.
2017-11-28 15:16:22 -06:00
Field G. Van Zee
1f30b1301b Added missing framework support for x86_64 family.
Details:
- Added support for the x86_64 configuration family to bli_arch.c and
  bli_arch_config.h. Thanks to Johannes Dieterich for reporting this
  issue.
- Bumped the default value for BLIS_SIMD_NUM_REGISTERS from 16 to 32 and
  the default value for BLIS_SIMD_SIZE from 32 to 64. This will support
  configuration families that include Skylake and newer processors without
  any support needed in the bli_family_*.h file. The semantics of these
  values have always been "maximum" and not exact values; comments in
  bli_kernel_macro_defs.h and the github wiki have been adjusted
  accordingly.
2017-11-25 16:54:26 -06:00
Field G. Van Zee
9f39806c4e Fixed a bug in e31f0b3/b131b9a.
Details:
- Erroneously placed the "don't overwrite existing blocksize" logic in
  bli_blksz_init*() rather than in bli_cntx_set_blkszs(). It belongs in
  the latter because that function copies blocksizes as-is from the
  blksz_t function argument to the appropriate field in the cntx_t. If
  the blksz_t was previously initialized selectively, based on the sign
  of the blocksize value passed into bli_blksz_init*(), that just leaves
  some fields possibly uninitialized (with garbage values), which
  definitely will not work.
- The aforementioned logic has been moved to bli_cntx_set_blkszs() via
  a new function bli_blksz_copy_if_pos(), which selectively copies only
  the blocksizes that are greater than zero.
2017-11-21 16:03:56 -06:00
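
The selective copy described in commit 9f39806c4e can be sketched as below. This is only an assumption about the general shape: the blocksize object is reduced to one value per datatype, and the names are hypothetical rather than the real blksz_t fields or the actual bli_blksz_copy_if_pos() signature.

```c
/* Hypothetical, simplified blocksize object: one value per datatype. */
enum { SKETCH_NUM_DT = 4 };

typedef struct
{
    long v[SKETCH_NUM_DT];
} sketch_blksz_t;

/* Copy only the strictly positive entries of src into dst, leaving the
   remaining entries of dst (for example, reference defaults) untouched. */
static void sketch_blksz_copy_if_pos(const sketch_blksz_t *src, sketch_blksz_t *dst)
{
    for (int dt = 0; dt < SKETCH_NUM_DT; ++dt)
    {
        if (src->v[dt] > 0) dst->v[dt] = src->v[dt];
    }
}
```

Doing the filtering at copy time means the source object is always fully initialized, even when some of its entries are zero or negative placeholders.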
Field G. Van Zee
b131b9a025 Updated configs to omit setting some blocksizes.
Details:
- Employ the new semantics of bli_blksz_init*() in e31f0b3 in various
  sub-configurations' bli_cntx_init_*() functions by passing in 0 for
  register and cache blocksizes that correspond to gemm microkernel
  datatypes that were not registered, allowing the default values
  set by the bli_cntx_init_*_ref() function call to remain.
2017-11-21 14:30:26 -06:00
Field G. Van Zee
499a4c002f Merge branch 'rt' of github.com:flame/blis into rt 2017-11-21 14:25:08 -06:00
Field G. Van Zee
e31f0b3e2d Subtle update to bli_blksz_init*() API.
Details:
- Updated the semantics of bli_blksz_init() and bli_blksz_init_ed() so
  that non-positive blocksize values are ignored entirely. This provides
  an easy way to indicate that certain existing values should not be
  touched by the update. Thanks to Devangi Parikh for feedback that led
  to these changes.
2017-11-21 14:21:25 -06:00
Field G. Van Zee
6c3ba502a1 Added 'x86_64' sub-config directory.
Details:
- Added missing x86_64 configuration directory, which was intended to be
  part of b7ca580.
- Added -Wfatal-errors compiler warning flag to all configurations so that
  compilation stops after the first error.
- Changed the vectorization flags for intel64 configuration to be compatible
  with 'penryn', the oldest sub-config included in that family.
- Changed the vectorization flags for penryn to target the 'core2'
  microarchitecture and ssse3.
2017-11-21 13:50:53 -06:00
Field G. Van Zee
25eee3cc49 Added a dummy file to kernels/generic.
Details:
- Added a dummy file to kernels/generic, which was previously empty, so
  that git would begin tracking the otherwise-empty directory. This
  directory's existence is necessary for proper execution of configure
  for any configuration family that contains the 'generic'
  sub-configuration. Thanks to Johannes Dieterich for reporting the
  issue that led to this fix.
2017-11-21 12:34:20 -06:00
Field G. Van Zee
ef024ce4ca More tweaks to monolithify-header.sh
Details:
- Further fixes to the monolithify-header.sh script.
- Removed unnecessary #include "blis.h" from frame/3/bli_l3_packm.h.
2017-11-20 18:08:29 -06:00
Field G. Van Zee
5028e7dec2 Second attempt to implement travis_wait.
Details:
- Corrected accidental misplacement of the travis_wait prefix (on the
  wrong line of the .travis.yml file) in commit 13e5d91.
2017-11-20 17:00:37 -06:00
Field G. Van Zee
13e5d9107b Added travis_wait prefix to testsuite via Travis.
Details:
- It appears that Travis CI has implemented a new policy that results in
  a test failing if it does not produce any output for more than 10
  minutes. (Two test instances are now failing in Travis despite the most
  recent commit not affecting the library or testsuite.) This issue can
  be worked around by executing the test run via travis_wait, which takes
  an optional time parameter. This commit attempts to use 'travis_wait 30'
  in the .travis.yml file to prevent the early failure at 10 minutes.
2017-11-20 15:57:06 -06:00
Field G. Van Zee
a1caeba0ea Removed pnacl, emscripten support from Makefile. 2017-11-20 13:31:20 -06:00
Field G. Van Zee
9df6dda9ec Improvements, bugfixes to monolithify-header.sh. 2017-11-18 19:03:26 -06:00
Field G. Van Zee
21d26201f9 Merge branch 'rt' of github.com:flame/blis into rt 2017-11-18 14:16:53 -06:00
Field G. Van Zee
43baa3b327 Removed unnecessary flags for generic config.
Details:
- Removed -D_POSIX_C_SOURCE=200112L and -m64 flags from make_defs.mk file
  of generic sub-configuration. These flags are generally not necessary,
  and particularly not desirable for the generic configuration since they
  unnecessarily restrict the environments in which the configuration can
  be built.
2017-11-18 14:14:44 -06:00
iotamudelta
b7ca580618 [WIP] Add x86 and x86_64 processor families. (#154)
* Add x86 and x86_64 processor families.
* Use generic config as fallback for more families.

After discussion with fgvanzee, a) it's "generic" and b) use it for all the families as a fallback. Goal is that if a specific CPU is not yet supported by a family (say a new Intel microarchitecture on x86_64), it'll fall through to still work with the slower "generic" kernels.
2017-11-18 13:56:05 -06:00
Field G. Van Zee
870597d166 Added bash script for creating monolithic headers.
Details:
- Added a new script, monolithify-header.sh, to the 'build' directory.
  This script recursively replaces all #include directives in a selected
  file with the contents of the header files referenced by each directive.
  The idea is to "flatten" a tree of .h files into a single file, with
  the script acting as a C preprocessor that only processes #include
  directives.
2017-11-17 17:06:42 -06:00
Field G. Van Zee
c76f77f4cc Removed unnecessary #include "blis.h" from header.
Details:
- Removed an errant #include "blis.h" directive from bli_cntx_ind_stage.h.
  The general policy is that no header file in BLIS should include
  blis.h. This will be important in the near future when using a tool to
  recursively create a monolithic blis.h file from its constituent
  headers.
2017-11-17 15:10:52 -06:00
Field G. Van Zee
2bb9bc6e95 Miscellaneous tweaks to gks, rt functionality.
Details:
- Updated bli_cpuid_query_id() so that BLIS_ARCH_GENERIC is always returned
  if the hardware fails to test positive for any supported sub-configuration.
- Defined bli_gks_init_ref_cntx(), which will call the context initialization
  function bli_cntx_init_configname() for the sub-configuration 'configname'
  associated with the arch_t id returned by bli_arch_query_id(). This makes
  initializing a reference context easy for experts who wish to construct
  those contexts.
2017-11-17 13:50:14 -06:00
Field G. Van Zee
d5bf79e50b Miscellaneous tweaks and fixes.
Details:
- Fixed incorrect calling sequence in bli_cntx_init_knl.c--an instance of
  bli_blksz_init_easy() that should have been bli_blksz_init().
- Fixed a bug in code that is supposed to output the list of sub-directories
  in the 'config' directory when configure script is run with no arguments.
- Expanded the output of "make showconfig" to include more info from config.mk.
- Minor changes to build/auto-detect/cpuid_x86.c, mostly in preparation for
  someone to add excavator and zen support.
- Added a link to the ConfigurationHowTo wiki to config_registry.
- Other minor tweaks to configure.
2017-11-13 14:24:29 -06:00
Field G. Van Zee
673e518403 Merge branch 'rt' of github.com:flame/blis into rt 2017-11-01 17:37:42 -05:00
Field G. Van Zee
2c51356a8b Implemented runtime hardware detection via cpuid.
Details:
- Added runtime support for selecting an appropriate arch_t value based
  on the results of the cpuid instruction (for x86_64). This allows
  deferral of choosing a context (kernels, blocksizes, etc.) until
  runtime, which allows BLIS to be built with support for multiple
  microarchitectures. Currently, only amd64 and intel64 configurations
  are registered in the config_registry; however, one could create
  custom configuration families to support arbitrary sets of x86_64
  microarchitectures.
- Current Intel microarchitectures supported via cpuid are knl, haswell,
  sandybridge, and penryn.
- Current AMD microarchitectures supported via cpuid are: zen, excavator,
  steamroller, piledriver, and bulldozer.
2017-11-01 17:37:02 -05:00
Field G. Van Zee
ab57b97904 Revert to default SIMD alignment for bulldozer.
Details:
- Removed the default-overriding #define of BLIS_SIMD_ALIGN_SIZE set in
  config/bulldozer/bli_kernel.h. Not sure where this value came from, but
  it would seem to allow for insufficient starting address alignment for
  any matrices created via bli_malloc_user(), such as via
  bli_obj_create(). Thanks to Rene Sitt for reporting the behavior that
  led us to this bug.
- This commit is a manual patch of the same fix made to the 'rt' branch
  in 8f150f2.
2017-11-01 11:51:41 -05:00
Field G. Van Zee
8f150f28a6 Revert to default SIMD alignment for bulldozer.
Details:
- Removed the default-overriding #define of BLIS_SIMD_ALIGN_SIZE set in
  bli_family_bulldozer.h. Not sure where this value came from, but it
  would seem to allow for insufficient starting address alignment for
  any matrices created via bli_malloc_user(), such as via
  bli_obj_create(). Thanks to Rene Sitt for reporting the behavior that
  led us to this bug.
2017-11-01 11:41:45 -05:00
Field G. Van Zee
e3f10557ca Use perl for some substitution for OS X compatibility.
Details:
- Discovered that sed commands where the replacement string contains '\n'
  are problematic with the version of sed present in OS X. For these
  cases in the configure script, we instead use 'perl -pe' for
  search-and-replace functionality.
- Various other minor comment/whitespace tweaks to configure.
- Removed remaining lines of code related to setting/checking variables to
  track "unregistered" configurations.
2017-10-30 13:37:54 -05:00
Field G. Van Zee
dd45cfdfc3 Merge branch 'master' into rt 2017-10-30 12:23:05 -05:00
Devin Matthews
f60c827ba9 Fix CVECFLAGS for bulldozer config. 2017-10-30 10:04:42 -05:00
Field G. Van Zee
3e4f42a4d2 Typecast l1mkr_t enum value prior to comparison.
Details:
- Typecast l1mkr_t enum value in bli_cntx.h to guint_t before testing for
  out-of-range value. This is an attempt to pacify a strange warning from
  clang on OS X that is seemingly the result of the following compiler
  warning flag:
    -Wtautological-constant-out-of-range-compare
2017-10-27 11:41:37 -05:00
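
The cast-before-compare approach mentioned in commit 3e4f42a4d2 can be sketched as follows. The enum, its values, and the function name are hypothetical, and plain `unsigned int` stands in for the guint_t type named in the commit message.

```c
#include <stdbool.h>

/* Hypothetical kernel-id enum with three valid values. */
typedef enum
{
    SKETCH_KER_A = 0,
    SKETCH_KER_B,
    SKETCH_KER_C
} sketch_kr_t;

#define SKETCH_NUM_KRS 3u

/* Compare as an unsigned integer rather than as the enum type. The
   explicit cast states the intended comparison type, and any negative
   input wraps to a large value, so one comparison rejects ids on
   either side of the valid range. */
static bool sketch_kr_in_range(sketch_kr_t id)
{
    return (unsigned int)id < SKETCH_NUM_KRS;
}
```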
Field G. Van Zee
aec6e038d9 Removed associative arrays from configure.
Details:
- Implemented a replacement for associative arrays in the configure script
  that does not utilize arrays, and therefore works in pre-4.0 versions of
  bash. (It appears that Mac OS X will be stuck with version 3.2 indefinitely
  due to bash switching to the GPL 3.0 license starting with version 4.0.)
2017-10-26 16:12:36 -05:00
Field G. Van Zee
07c352188b Added "generic" configuration.
Details:
- Added a "generic" configuration that leaves the default blocksizes and
  kernels unchanged. This replaces the older "reference" configuration.
  Updated auto-detect script and code accordingly.
- Added support for generic configuration to arch_t (bli_type_defs.h),
  bli_gks_init() (bli_gks.c), and bli_arch_config.h
- Moved bli_arch_query_id() to bli_arch.c (and prototype to bli_arch.h).
- Whitespace changes to configurations' make_defs.mk files.
2017-10-23 16:59:22 -05:00
Field G. Van Zee
c1a98d6f70 Minor update to .travis.yml file. 2017-10-23 14:24:41 -05:00
Field G. Van Zee
75b9383f01 Minor header renaming ahead of bli_arch.c.
Details:
- Renamed the various configurations' "bli_arch_<configname>.h" header files
  (replacing "arch" with "family") to free up the 'bli_arch' namespace for a
  different purpose (hardware detection).
- Renamed "bli_arch.h" and "bli_arch_pre_macro_defs.h" in frame/include to
  "bli_arch_config.h" and "bli_arch_config_pre.h", respectively.
2017-10-20 16:41:22 -05:00
Field G. Van Zee
482af51add Fixed 'make test' target from top-level Makefile.
Details:
- Updated the top-level Makefile's build rule for testsuite object files to
  properly obtain CFLAGS via get-frame-cflags-for() function instead of
  simply using the $(CFLAGS) variable (which is empty). This means that
  'make test' should now work as expected.
2017-10-20 15:44:26 -05:00
Field G. Van Zee
3c269f700d Makefile updates for test drivers, testsuite.
Details:
- Fixed semi-broken testsuite Makefile and very-broken test driver Makefiles,
  as well as those for test/3m4m, test/thread_ranges, and test/exec_sizes
  sub-directories.
- Factored out much of the top-level Makefile into common.mk. A Makefile
  needs only set DIST_PATH to the relative path to the top level of the
  BLIS source distribution before including common.mk in order to acquire
  all of the definitions typically needed in a Makefile that tests BLIS.
2017-10-20 13:57:21 -05:00
Field G. Van Zee
0557189d46 Minor updates to .travis.yml, configure script. 2017-10-18 15:05:27 -05:00