Commit Graph

26 Commits

Author SHA1 Message Date
Devin Matthews
138d403b6b Use -funsafe-math-optimizations and -ffp-contract=fast for all reference kernels when using gcc or clang. (#331) 2019-08-26 18:11:27 -05:00
Field G. Van Zee
0645f239fb Remove UT-Austin from copyright headers' clause 3.
Details:
- Removed explicit reference to The University of Texas at Austin in the
  third clause of the license comment blocks of all relevant files and
  replaced it with a more all-encompassing "copyright holder(s)".
- Removed duplicate words ("derived") from a few kernels' license
  comment blocks.
- Homogenized license comment block in kernels/zen/3/bli_gemm_small.c
  with format of all other comment blocks.
2018-12-04 14:31:06 -06:00
Field G. Van Zee
ad67dc4e34 Communicate cc, cc_vendor to make via config.mk.
Details:
- Historically, the compiler selection has happened statically in the
  various make_defs.mk and would only be overriden by setting CC (either
  prior to running configure or as a configure argument). However, in
  the last couple months, configure has evolved to contain rather
  sophisticated compiler detection logic for the purposes of blacklisting
  sub-configurations. It only makes sense that configure now fully take
  over the responsibility of selecting a compiler from the GNU make side
  of the build system. Thanks to Alex Arslan for his help exposing this
  issue.
- Substitute found_cc into CC in config.mk via configure.
- Set a new variable, CC_VENDOR, in config.mk via substitution from
  configure, and disable the corresponding CC_VENDOR code in common.mk.
- Disabled default compiler selection (usually gcc) in the sub-configs'
  various make_def.mk files.
2018-05-14 18:35:28 -05:00
Field G. Van Zee
45fbe66b3e Fixed libmemkind dependency for x86_64.
Details:
- Removed some old conditional code in config/knl/make_defs.mk that
  added -lmemkind to LDFLAGS if DEBUG_TYPE was not 'sde' and inserted
  code into common.mk that affirmatively filters out -lmemkind from
  LDFLAGS if DEBUG_TYPE is 'sde'. (Thanks to Dave Love for reporting
  this issue.) Other minor cleanups to neighboring code in common.mk.
- Updated CRVECFLAGS in knl/make_defs.mk to be based on -march=knl,
  and then AVX-512 functionality is manually removed via various
  -mno-avx512* flags. Also, make the setting of CRVECFLAGS conditional
  on CC_VENDOR. Similar change to skx/make_defs.mk.
- Comment/whitespace updates.
2018-04-09 14:01:08 -05:00
Field G. Van Zee
bd0276752c Track separate ref kernel flags for each sub-config.
Details:
- Renamed CVECFLAGS variables in sub-configurations' make_defs.mk files
  to CKVECFLAGS.
- Added default defintions of two new make variables to most sub-
  configurations' make_defs.mk files--CROPTFLAGS and CRVECFLAGS--
  which correspond to reference kernel analogues of the CKOPTFLAGS
  and CKVECFLAGS, which track optimization and vectorization flags for
  optimized kernels. Currently, two sub-configurations (knl and skx)
  explicitly set CRVECFLAGS to non-default values (using AVX2 instead of
  AVX-512 for reference kernels. Thanks to Jeff Hammond, whose feedback
  prompted me to make this change (issue #187).
- Changed common.mk so that the get-refkern-cflags-for function returns
  the flags associated with the given sub-configuration's CROPTFLAGS
  and CRVECFLAGS (instead of CKOPTFLAGS and CKVECFLAGS).
2018-04-06 18:51:43 -05:00
Field G. Van Zee
0b3ca3cfb6 Intelligently select compiler for auto-detection.
Details:
- Rewrote code that selects the compiler for the purposes of compiling
  the auto-detection executable. CC (if specified) is tried first. Then
  gcc. Then clang. The absolute fallback is cc. The previous code was
  sort of broken, and seemed to unintentionally always use gcc.
- Moved various configuration-agnostic flags from config/*/make_defs.mk
  files to common.mk. The new mechanism appends the configuration-
  agnostic flags to the various compiler flag variables initialized in
  make_defs.mk. Flags specific to the sub-configuration are still set
  in make_defs.mk.
- Added -Wno-tautological-compare to CMISCFLAGS when clang is in use.
  Also added the flag to the compiler instantiation during configure-
  time hardware detection (when clang is selected).
- Added some missing (but mostly-optional) quotes to configure script.
2018-01-04 20:51:35 -06:00
Field G. Van Zee
6c3ba502a1 Added 'x86_64' sub-config directory.
Details:
- Added missing x86_64 configuration directory, which was intended to be
  part of b7ca580.
- Added -Wfatal-errors compiler warning flag to all configurations so that
  compilation stops after the first error.
- Changed the vectorization flags for intel64 configuration to be compatible
  with 'penryn', the oldest sub-config included in that family.
- Changed the vectorization flags for penryn to target the 'core2'
  microarchitecture and ssse3.
2017-11-21 13:50:53 -06:00
Field G. Van Zee
07c352188b Added "generic" configuration.
Details:
- Added a "generic" configuration that leaves the default blocksizes and
  kernels unchanged. This replaces the older "reference" configuration.
  Updated auto-detect script and code accordingly.
- Added support for generic configuration to arch_t (bli_type_defs.h),
  bli_gks_init() (bli_gks.c), and bli_arch_config.h
- Moved bli_arch_query_id() to bli_arch.c (and prototype to bli_arch.h).
- Whitespace changes to configurations' make_defs.mk files.
2017-10-23 16:59:22 -05:00
Field G. Van Zee
453deb2906 Implemented runtime kernel management.
Details:
- Reworked the build system around a configuration registry file, named
  config_registry', that identifies valid configuration targets, their
  constituent sub-configurations, and the kernel sets that are needed by
  those sub-configurations. The build system now facilitates the building
  of a single library that can contains kernels and cache/register
  blocksizes for multiple configurations (microarchitectures). Reference
  kernels are also built on a per-configuration basis.
- Updated the Makefile to use new variables set by configure via the
  config.mk.in template, such as CONFIG_LIST, KERNEL_LIST, and KCONFIG_MAP,
  in determining which sub-configurations (CONFIG_LIST) and kernel sets
  (KERNEL_LIST) are included in the library, and which make_defs.mk files'
  CFLAGS (KCONFIG_MAP) are used when compiling kernels.
- Reorganized 'kernels' directory into a "flat" structure. Renamed kernel
  functions into a standard format that includes the kernel set name
  (e.g. 'haswell'). Created a "bli_kernels_<kernelset>.h" file in each
  kernels sub-directory. These files exist to provide prototypes for the
  kernels present in those directories.
- Reorganized reference kernels into a top-level 'ref_kernels' directory.
  This directory includes a new source file, bli_cntx_ref.c (compiled on
  a per-configuration basis), that defines the code needed to initialize
  a reference context and a context for induced methods for the
  microarchitecture in question.
- Rewrote make_defs.mk files in each configuration so that the compiler
  variables (e.g. CFLAGS) are "stored" (renamed) on a per-configuration
  basis.
- Modified bli_config.h.in template so that bli_config.h is generated with
  #defines for the config (family) name, the sub-configurations that are
  associated with the family, and the kernel sets needed by those
  sub-configurations.
- Deprecated all kernel-related information in bli_kernel.h and transferred
  what remains to new header files named "bli_arch_<configname>.h", which
  are conditionally #included from a new header bli_arch.h. These files
  are still needed to set library-wide parameters such as custom
  malloc()/free() functions or SIMD alignment values.
- Added bli_cntx_init_<configname>.c files to each configuration directory.
  The files contain a function, named the same as the file, that initializes
  a "native" context for a particular configuration (microarchitecture). The
  idea is that optimized kernels, if available, will be initialized into
  these contexts. Other fields will retain pointers to reference functions,
  which will be compiled on a per-configuration basis. These bli_cntx_init_*()
  functions will be called during the initialization of the global kernel
  structure. They are thought of as initializing for "native" execution, but
  they also form the basis for contexts that use induced methods. These
  functions are prototyped, along with their _ref() and _ind() brethren, by
  prototype-generating macros in bli_arch.h.
- Added a new typedef enum in bli_type_defs.h to define an arch_t, which
  identifies the various sub-configurations.
- Redesigned the global kernel structure (gks) around a 2D array of cntx_t
  structures (pointers to cntx_t, actually). The first dimension is indexed
  over arch_t and the inner dimension is the ind_t (induced method) for
  each microarchitecture. When a microarchitecture (configuration) is
  "registered" at init-time, the inner array for that configuration in the
  2D array is initialized (and allocated, if it hasn't been already). The
  cntx_t slot for BLIS_NAT is initialized immediately and those for other
  induced method types are initialized and cached on-demand, as needed. At
  cntx_t registration, we also store function pointers to cntx_init functions
  that will initialize (a) "reference" contexts and (b) contexts for use with
  induced methods. We don't cache the full contexts for reference contexts
  since they are rarely needed. The functions that initialize these two kinds
  of contexts are generated automatically for each targeted sub-configuration
  from cpp-templatized code at compile-time. Induced method contexts that
  need "stage" adjustments can still obtain them via functions in
  bli_cntx_ind_stage.c.
- Added new functions and functionality to bli_cntx.c, such as for setting
  the level-1f, level-1v, and packm kernels, and for converting a native
  context into one for executing an induced method.
- Moved the checking of register/cache blocksize consistency from being cpp
  macros in bli_kernel_macro_defs.h to being runtime checks defined in
  bli_check.c and called from bli_gks_register_cntx() at the time that the
  global kernel structure's internal context is initialized for a given
  microarchitecture/configuration.
- Deprecated all of the old per-operation bli_*_cntx.c files and removed
  the previous operation-level cntx_t_init()/_finalize() invocations.
  Instead, we now query the gks for a suitable context, usually via
  bli_gks_query_cntx().
- Deprecated support for the 3m2 and 3m3 induced methods. (They required
  hackery that I was no longer willing to support.)
- Consolidated the 1e and 1r packm kernels for any given register blocksize
  into a single kernel that will branch on the schema and support packing
  to both formats.
- Added the cntx_t* argument to all packm kernel signatures.
- Deprecated the local function pointer array in all bli_packm_cxk*.c files
  and instead obtain the packm kernel from the cntx_t.
- Added bli_calloc_intl(), which serves as the calloc-equivalent to to
  bli_malloc_intl(). Useful when we wish to allocate and initialize to
  zero/NULL.
- Converted existing cpp macro functions defined in bli_blksz.h, bli_func.h,
  bli_cntx.h into static functions.
2017-10-18 13:29:32 -05:00
Field G. Van Zee
1f1ec0db93 Updated ar option list used by all configurations.
Details:
- Dropped 'u' from the list of modifiers passed into the library archiver
  ar. Previously, "cru" was used, while now we employ only "cr". This
  change was prompted by a warning observed on Ubuntu 16.04:

    ar: `u' modifier ignored since `D' is the default (see `U')

  This caused me to realize that the default mode causes timestamps to be
  zero, and thus the 'u' option, which causes only changed object files to
  be inserted, is not applicable.
2017-07-19 15:40:48 -05:00
Field G. Van Zee
6e04f9df01 Restored deleted lines from makefile fragments. 2017-05-17 13:03:52 -05:00
Devin Matthews
555ddc30d4 Remove shebangs from makefiles. 2017-05-17 12:27:14 -05:00
J M Dieterich
5fa4e9439c A bunch of shebang fixes from unportable /bin/bash to portable /usr/bin/env bash 2017-05-16 21:50:49 -04:00
Jeff Hammond
c2c91e09b4 never use libm with Intel compilers
Intel compilers include a highly optimized math library (libimf) that
should be used instead of GNU libm.

yes, this change is for ALL targets, including those that are not
supported by the Intel compiler.  there is no harm in doing this, and it
is future-proof in the event that the Intel compilers support other
architectures.
2016-10-25 21:15:26 -07:00
Devin Matthews
76099f20be Add threading option to configure. 2016-03-25 17:22:58 -05:00
Devin Matthews
2bd036f1f9 Fix configuration issue where instruction set flags are not specified for debug builds. 2016-03-25 12:16:49 -05:00
Devin Matthews
d226dfa051 Add several changes to the build system.
1) Add -- options.
2) Add -d/--enable-debug option to enable debugging symbols with and without optimization.
3) Allow user to specify CC at configure time, and determine vendor (gcc/icc/etc.). For now configurations enforce a particular vendor.
4) Add make V=[0,1] option to control build verbosity.
2016-03-05 16:18:14 -06:00
Field G. Van Zee
30833ed71d Minor edits to configurations' make_defs.mk files.
Details:
- Redefined CFLAGS, CFLAGS_NOOPT, and CFLAGS_KERNELS so that CFLAGS_NOOPT
  is defined first and then the other two are defined in terms of
  CFLAGS_NOOPT. This textually cleans up the definitions and makes them a
  little easier to read.
2014-08-06 12:12:03 -05:00
Field G. Van Zee
7ed415824d Updated copyright headers (continued).
Details:
- Inserted "at Austin" into third clause of license declarations.
  Meant to include this change in previous commit.
2014-07-14 16:14:33 -05:00
Field G. Van Zee
5c2c6c8561 Updated copyright headers to contain "at Austin".
Details:
- Updated copyright headers to include "at Austin" in the name of the
  University of Texas.
- Updated the copyright years of a few headers to 2014 (from 2011 and
  2012).
2014-07-14 16:05:03 -05:00
Field G. Van Zee
e8ef696928 Added shared library support to build system.
Details:
- Modified top-level Makefile to support building shared (dynamic)
  libraries.
- Updated most configurations' make_defs.mk files to include necessary
  compiler/linker flags needed by top-level Makefile.
- Note that by default, all configurations presently do NOT build
  shared libraries. To enable, one must change the value of
  BLIS_ENABLE_DYNAMIC_BUILD to 'yes'.
2014-07-02 14:59:27 -05:00
Field G. Van Zee
d27b4f690c Use generic paths for toolchain in POWER7.
Details:
- Fixed issue #4. Thanks to Jeff Hammond for contributing changes.
2014-04-01 12:57:24 -05:00
Field G. Van Zee
89c76a8a51 Allow building outside source distribution.
Details:
- Modified build system (mostly configure and top-level Makefile) so that
  a user can build a BLIS library outside of the top-level directory of
  the source distribution.
- Added "test" target to Makefile so that the user can run "make test",
  which will compile, link, and run the testsuite binary. This works even
  if the build directory is externally located, thanks to the test suite
  binary's new -g and -o command-line options. Also, when creating the
  test suite via the top-level Makefile, the linking is against the
  local archive, in lib/<configname>, rather than at <install_prefix>/lib.
- Modified testsuite/Makefile so that it links against the library built
  locally, in ../lib/<configname>.
- Added "-lm" to LDFLAGS of most configurations' make_defs.mk.
- Various other cleanups to build system.
2014-01-09 12:08:37 -06:00
Field G. Van Zee
2cb13600f9 Updated year in copyright headers to 2014. 2014-01-03 12:29:13 -06:00
Field G. Van Zee
50549a6a31 Changed header install directory to include/blis.
Details:
- Changed top-level Makefile so that headers are installed to
  $(INSTALL_PREFIX)/include/blis/. (Header directories are no longer
  named by version/configuration and then symlinked.)
- Added uninstall targets, including uninstall-old to clean out old
  library archives.
- Added GREP makefile definitions to all configurations' make_defs.mk.
2013-11-17 18:31:27 -06:00
Field G. Van Zee
97aaf220a8 Added new kernels, configurations.
Details:
- Added various micro-kernels for the following architectures:
    Intel MIC
    IBM BG/Q
    IBM Power7
    AMD Piledriver
    Loogson 3A
  and reorganized kernels directory. Thanks to Tyler Smith, Mike Kistler,
  and Xianyi Zhang for contributing these kernels.
- Added configurations corresponding to above architectures, and renamed
  "clarksville" configuration to "dunnington".
2013-09-17 10:51:36 -05:00