Commit Graph

18 Commits

Author SHA1 Message Date
Devin Matthews
138d403b6b Use -funsafe-math-optimizations and -ffp-contract=fast for all reference kernels when using gcc or clang. (#331) 2019-08-26 18:11:27 -05:00
Field G. Van Zee
70f12f209b Changed unsafe-loop to unsafe-math optimizations.
Details:
- Changed -funsafe-loop-optimizations (re-)introduced in 7690855 for
  make_defs.mk files' CRVECFLAGS to -funsafe-math-optimizations (to
  account for a miscommunication in issue #300). Thanks to Dave Love
  for this suggestion and Jeff Hammond for his feedback on the topic.
2019-02-20 16:10:10 -06:00
Field G. Van Zee
7690855c51 Restored -funsafe-loop-optimizations to subconfigs.
Details:
- Restored use of -funsafe-loop-optimizations in the definitions of
  CRVECFLAGS (when using gcc), but only for sub-configurations (and
  not configuration families such as amd64, intel64, and x86_64).
  This more or less reverts 5190d05 and 6cf1550.
2019-02-18 19:16:01 -06:00
Field G. Van Zee
6cf1550491 Removed -funsafe-loop-optimizations from all configs.
Details:
- Error persists. Removed -funsafe-loop-optimizations from all remaining
  sub-configurations.
2019-02-18 17:29:51 -06:00
Field G. Van Zee
6a014a3377 Standardized optimization flags in make_defs.mk.
Details:
- Per Dave Love's recommendation in issue #300, this commit defines
    COPTFLAGS := -03
  and
    CRVECFLAGS := $(CKVECFLAGS) -funsafe-loop-optimizations
  in the make_defs.mk for all Intel- and AMD-based configurations.
2019-02-18 14:52:29 -06:00
Field G. Van Zee
26c5cf495c Fixed bug in skx subconfig related to bdd46f9.
Details:
- Fixed code in the skx subconfiguration that became a bug after
  committing bdd46f9. Specifically, the bli_cntx_init_skx() function
  was overwriting default blocksizes for the scomplex and dcomplex
  microkernels despite the fact that only single and double real
  microkernels were being registered. This was not a problem prior to
  bdd46f9 since all microkernels used dynamically-queried (at runtime)
  register blocksizes for loop bounds. However, post-bdd46f9, this
  became a bug because the reference ukernels for scomplex and dcomplex
  were written with their register blocksizes hard-coded as constant
  loop bounds, which conflicted the the erroneous scomplex and dcomplex
  values that bli_cntx_init_skx() was setting in the context. The
  lesson here is that going forward, all subconfigurations must not set
  any blocksizes for datatypes corresponding to default/reference
  microkernels. (Note that a blocksize is left unchanged by the
  bli_cntx_set_blkszs() function if it was set to -1.)
2019-01-24 18:49:31 -06:00
Field G. Van Zee
0645f239fb Remove UT-Austin from copyright headers' clause 3.
Details:
- Removed explicit reference to The University of Texas at Austin in the
  third clause of the license comment blocks of all relevant files and
  replaced it with a more all-encompassing "copyright holder(s)".
- Removed duplicate words ("derived") from a few kernels' license
  comment blocks.
- Homogenized license comment block in kernels/zen/3/bli_gemm_small.c
  with format of all other comment blocks.
2018-12-04 14:31:06 -06:00
Field G. Van Zee
53a9ab1c85 Renamed thread auto-factorization macro constants.
Details:
- Renamed the following C preprocessor macros whose fallback/default
  values are specified within frame/include/bli_kernel_macro_defs.h:

    BLIS_DEFAULT_MR_THREAD_MAX  -> BLIS_THREAD_MAX_IR
    BLIS_DEFAULT_NR_THREAD_MAX  -> BLIS_THREAD_MAX_JR
    BLIS_DEFAULT_M_THREAD_RATIO -> BLIS_THREAD_RATIO_M
    BLIS_DEFAULT_N_THREAD_RATIO -> BLIS_THREAD_RATIO_N

- Renamed the above cpp macro overrides within the knl, skx, and zen
  sub-configurations, as well as invocations of those macros in
  bli_rntm.c.
- Moved config/zen/bli_kernel.h to an 'old' directory as it is no longer
  used by any code within BLIS.
2018-10-10 15:11:09 -05:00
Field G. Van Zee
e249a00a82 Imported skx dgemm ukernel from skx-redux branch.
Details:
- Added the new bli_dgemm_skx_asm_16x14.c microkernel from the skx-redux
  branch, along with appropriate blocksizes in bli_cntx_init_skx.c and
  a prototype in bli_kernels_skx.h. (Devin has not yet written the
  sgemm analague, so for now we will continue using the older sgemm
  ukernel.)
- Updated frame/include/bli_x86_asm_macros.h with a minor change that
  was present within the skx-redux branch.
2018-09-10 16:48:35 -05:00
Field G. Van Zee
4fa4cb0734 Trivial comment header updates.
Details:
- Removed four trailing spaces after "BLIS" that occurs in most files'
  commented-out license headers.
- Added UT copyright lines to some files. (These files previously had
  only AMD copyright lines but were contributed to by both UT and AMD.)
- In some files' copyright lines, expanded 'The University of Texas' to
  'The University of Texas at Austin'.
- Fixed various typos/misspellings in some license headers.
2018-08-29 18:06:41 -05:00
Field G. Van Zee
ad67dc4e34 Communicate cc, cc_vendor to make via config.mk.
Details:
- Historically, the compiler selection has happened statically in the
  various make_defs.mk and would only be overriden by setting CC (either
  prior to running configure or as a configure argument). However, in
  the last couple months, configure has evolved to contain rather
  sophisticated compiler detection logic for the purposes of blacklisting
  sub-configurations. It only makes sense that configure now fully take
  over the responsibility of selecting a compiler from the GNU make side
  of the build system. Thanks to Alex Arslan for his help exposing this
  issue.
- Substitute found_cc into CC in config.mk via configure.
- Set a new variable, CC_VENDOR, in config.mk via substitution from
  configure, and disable the corresponding CC_VENDOR code in common.mk.
- Disabled default compiler selection (usually gcc) in the sub-configs'
  various make_def.mk files.
2018-05-14 18:35:28 -05:00
Field G. Van Zee
60366a3fab Updates to knl kernels and related code.
Details:
- Imported the 24x16 knl sgemm microkernel (and its corresonding spackm
  kernel) from TBLIS and enabled its use in the knl sub-config. Also
  Added sgemm microkernel prototype to bli_kernels_knl.h.
- Updated dgemm and dpackm microkernels from TBLIS, which included an
  important change regarding the offsets array (changed from extern
  declaration to static declaration/definition).
- Activated use of level-1v and -1f zen kernels in skx and knl
  sub-configs.
- Removed some old macros no longer needed in bli_family_skx.h now that
  libmemkind support exists in configure.
- Moved bli_avx512_macros.h to frame/include and adjusted #includes in
  skx and knl kernels accordingly.
- Moved unused kernels in kernels/knl/3 to kernels/knl/3/other
  directory.
- Fixed a minor bug in the 'make' output per compile when verboseness
  is not turned on. The rule-generating function 'make-kernel-rule' was
  previously passing in the name of the config, rather than the name of
  the kernel set returned by get-config-for-kset, which could give
  misleading information to the user when the kconfig_map mapped a
  kernel set to a sub-configuration that did not share the same name.
  (This didn't affect the CFLAGS that were actually used.)
- Updated test/3m4m/Makefile, removing acml targets and renaming the
  remaining targets.
2018-04-16 18:46:21 -05:00
Field G. Van Zee
45fbe66b3e Fixed libmemkind dependency for x86_64.
Details:
- Removed some old conditional code in config/knl/make_defs.mk that
  added -lmemkind to LDFLAGS if DEBUG_TYPE was not 'sde' and inserted
  code into common.mk that affirmatively filters out -lmemkind from
  LDFLAGS if DEBUG_TYPE is 'sde'. (Thanks to Dave Love for reporting
  this issue.) Other minor cleanups to neighboring code in common.mk.
- Updated CRVECFLAGS in knl/make_defs.mk to be based on -march=knl,
  and then AVX-512 functionality is manually removed via various
  -mno-avx512* flags. Also, make the setting of CRVECFLAGS conditional
  on CC_VENDOR. Similar change to skx/make_defs.mk.
- Comment/whitespace updates.
2018-04-09 14:01:08 -05:00
Field G. Van Zee
bd0276752c Track separate ref kernel flags for each sub-config.
Details:
- Renamed CVECFLAGS variables in sub-configurations' make_defs.mk files
  to CKVECFLAGS.
- Added default defintions of two new make variables to most sub-
  configurations' make_defs.mk files--CROPTFLAGS and CRVECFLAGS--
  which correspond to reference kernel analogues of the CKOPTFLAGS
  and CKVECFLAGS, which track optimization and vectorization flags for
  optimized kernels. Currently, two sub-configurations (knl and skx)
  explicitly set CRVECFLAGS to non-default values (using AVX2 instead of
  AVX-512 for reference kernels. Thanks to Jeff Hammond, whose feedback
  prompted me to make this change (issue #187).
- Changed common.mk so that the get-refkern-cflags-for function returns
  the flags associated with the given sub-configuration's CROPTFLAGS
  and CRVECFLAGS (instead of CKOPTFLAGS and CKVECFLAGS).
2018-04-06 18:51:43 -05:00
Field G. Van Zee
48da9f5805 Tweaked common.mk, Makefile, skx/knl make_defs.mk.
Details:
- Reorganized linker-related section of common.mk so that LDFLAGS set
  in a sub-configuration's make_defs.mk file will not be immediately
  (and erroneously) overridden by the default values.
- Re-enabled redirected (to file) output of the testsuite when run from
  the top-level Makefile via 'make test'. (For some reason, it was
  commented-out for the non-verbose case.)
- Removed old/unnecessary code from the make_defs.mk files of skx and
  knl sub-configurations.
2018-03-07 12:54:06 -06:00
Devin Matthews
f7df64daf6 Don't use memkind for skx configuration. Fixes #163. 2018-01-07 09:37:25 -06:00
Field G. Van Zee
0b3ca3cfb6 Intelligently select compiler for auto-detection.
Details:
- Rewrote code that selects the compiler for the purposes of compiling
  the auto-detection executable. CC (if specified) is tried first. Then
  gcc. Then clang. The absolute fallback is cc. The previous code was
  sort of broken, and seemed to unintentionally always use gcc.
- Moved various configuration-agnostic flags from config/*/make_defs.mk
  files to common.mk. The new mechanism appends the configuration-
  agnostic flags to the various compiler flag variables initialized in
  make_defs.mk. Flags specific to the sub-configuration are still set
  in make_defs.mk.
- Added -Wno-tautological-compare to CMISCFLAGS when clang is in use.
  Also added the flag to the compiler instantiation during configure-
  time hardware detection (when clang is selected).
- Added some missing (but mostly-optional) quotes to configure script.
2018-01-04 20:51:35 -06:00
dnp
4423e33dc5 Adding SKX kernels and configuration. 2017-12-06 16:35:03 -06:00