62 Commits

Author SHA1 Message Date
Field G. Van Zee
e320ec6d5c Moved lang defs from _macro_def.h to _lang_defs.h.
Details:
- Moved miscellaneous language-related definitions, including defs
  related to the handling of the 'restrict' keyword, from the top half
  of bli_macro_defs.h into a new file, bli_lang_defs.h, which is now
  #included immediately after "bli_system.h" in blis.h. This change is
  an attempt to fix a report of recent breakage of C++ compilers due
  to the recent introduction of 'restrict' in bli_type_defs.h (which
  previously was being included *before* bli_macro_defs.h and its
  restrict handling therein). Thanks to Ivan Korostelev for reporting
  this issue in #527.
- CREDITS file update.
2021-08-20 17:15:20 -05:00
Field G. Van Zee
868b90138e Fixed one-time use property of bli_init() (#525).
Details:
- Fixes a rather obvious bug that resulted in a segmentation fault
  whenever the calling application tried to re-initialize BLIS after
  its first init/finalize cycle. The bug arose because bli_init() and
  bli_finalize() are both implemented in terms of pthread_once(), and
  the bli_init.c APIs made no effort to allow bli_init() to be called
  again after the first cycle. This has been fixed by
  resetting the pthread_once_t control variable for initialization
  at the end of bli_finalize_apis(), and by resetting the control
  variable for finalization at the end of bli_init_apis(). Thanks to
  @lschork2 for reporting this issue (#525), and to Minh Quan Ho and
  Devin Matthews for suggesting the chosen solution.
- CREDITS file update.
2021-08-04 18:31:01 -05:00
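Sketch for 868b90138e (not the BLIS source): the fix described above re-arms one pthread_once_t control variable at the end of the opposite call. A minimal illustration of that pattern might look like the following. All names here (lib_init, lib_finalize, once_fresh, the counters) are hypothetical. Copying from a pristine PTHREAD_ONCE_INIT object is an assumption made for portability; POSIX does not explicitly bless re-arming a control variable, and the sketch is not safe if other threads are inside these calls at the same time.

```c
#include <pthread.h>

/* Hypothetical names throughout; this only illustrates the re-arming idea. */
static const pthread_once_t once_fresh = PTHREAD_ONCE_INIT; /* pristine copy */
static pthread_once_t once_init     = PTHREAD_ONCE_INIT;
static pthread_once_t once_finalize = PTHREAD_ONCE_INIT;

static int init_count     = 0;  /* how many times setup actually ran    */
static int finalize_count = 0;  /* how many times teardown actually ran */

static void do_init(void)     { ++init_count; }
static void do_finalize(void) { ++finalize_count; }

void lib_init(void)
{
    /* Runs do_init() only if once_init has not fired since it was re-armed. */
    pthread_once(&once_init, do_init);
    /* Re-arm finalization so the next lib_finalize() does real work. */
    once_finalize = once_fresh;
}

void lib_finalize(void)
{
    pthread_once(&once_finalize, do_finalize);
    /* Re-arm initialization so a later lib_init() runs setup again. */
    once_init = once_fresh;
}
```

Calling lib_init() twice in a row still runs setup once; setup runs again only after an intervening lib_finalize().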
Field G. Van Zee
8dba1e752c CREDITS file update. 2021-07-27 12:38:24 -05:00
Field G. Van Zee
69205ac266 CREDITS file update.
Details:
- Thanks to Chengguo Sun for submitting #515 (5ef7f68).
- Thanks to Andrew Wildman for submitting #519 (551c6b4).
- Whitespace update to configure (spaces to tabs).
2021-07-06 20:39:22 -05:00
Devin Matthews
5d46dbee4a Replace bli_dlamch with something less archaic (#498)
Details:
- Added new implementations of bli_slamch() and bli_dlamch() that use
  constants from the standard C library in lieu of dynamically-computed
  values (via code inherited from netlib). The previous implementation
  is still available when the cpp macro BLIS_ENABLE_LEGACY_LAMCH is 
  defined by the subconfiguration at compile-time. Thanks to Devin
  Matthews for providing this patch, and to Stefano Zampini for
  reporting the issue (#497) that prompted Devin to propose the patch.
2021-05-12 18:42:09 -05:00
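Sketch for 5d46dbee4a: the idea of answering lamch-style queries from standard C library constants, instead of computing them at run time, could be illustrated as below. The function name lamch_sketch is hypothetical, only a few query codes are shown, and the mapping from codes to <float.h> constants follows the reference LAPACK convention as commonly documented; whether BLIS uses exactly these expressions is an assumption.

```c
#include <ctype.h>
#include <float.h>

/* Hypothetical helper, not the BLIS function: return a double-precision
   machine parameter selected by a LAPACK-style character code. */
double lamch_sketch(char cmach)
{
    switch (toupper((unsigned char)cmach))
    {
        /* Relative machine epsilon, assuming round-to-nearest. */
        case 'E': return DBL_EPSILON * 0.5;
        /* Base of the floating-point representation. */
        case 'B': return (double)FLT_RADIX;
        /* Precision, defined here as eps * base. */
        case 'P': return DBL_EPSILON * 0.5 * (double)FLT_RADIX;
        /* Smallest positive normalized value (underflow threshold). */
        case 'U': return DBL_MIN;
        /* Largest finite value (overflow threshold). */
        case 'O': return DBL_MAX;
        /* Unrecognized code: this sketch simply returns zero. */
        default:  return 0.0;
    }
}
```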
Field G. Van Zee
ca83f955d4 CREDITS file update. 2021-03-22 17:21:21 -05:00
Field G. Van Zee
ed50c94738 Merge branch 'master' into dev 2021-01-04 14:31:44 -06:00
Field G. Van Zee
b9899bedff CREDITS file update. 2020-11-18 16:52:41 -06:00
Field G. Van Zee
9bb23e6c2a Added support for systemless build (no pthreads).
Details:
- Added a configure option, --[enable|disable]-system, which determines
  whether the modest operating system dependencies in BLIS are included.
  The most notable example of this on Linux and BSD/OSX is the use of
  POSIX threads to ensure thread safety for when application-level
  threads call BLIS. When --disable-system is given, the bli_pthreads
  implementation is dummied out entirely, allowing the calling code
  within BLIS to remain unchanged. Why would anyone want to build BLIS
  like this? The motivating example was submitted via #454 in which a
  user wanted to build BLIS for a simulator such as gem5 where thread
  safety may not be a concern (and where the operating system is largely
  absent anyway). Thanks to Stepan Nassyr for suggesting this feature.
- Another, more minor side effect of the --disable-system option is that
  the implementation of bli_clock() unconditionally returns 0.0 instead
  of the time elapsed since some fixed point in the past. The reasoning
  for this is that if the operating system is truly minimal, the system
  function call upon which bli_clock() would normally be implemented
  (e.g. clock_gettime()) may not be available.
- Refactored preprocess-guarded code in bli_pthread.c and bli_pthread.h
  to remove redundancies.
- Removed old comments and commented #include of "bli_pthread_wrap.h"
  from bli_system.h.
- Documented bli_clock() and bli_clock_min_diff() in BLISObjectAPI.md
  and BLISTypedAPI.md, with a note that both are non-functional when
  BLIS is configured with --disable-system.
2020-11-16 15:55:45 -06:00
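Sketch for 9bb23e6c2a: one way to "dummy out" a lock API and a clock behind a single preprocessor switch, so that calling code stays unchanged, is shown below. The macro SKETCH_DISABLE_SYSTEM and every sketch_* name are hypothetical stand-ins; the actual macro that configure defines and the actual bli_pthread internals are not shown in this log.

```c
/* Hypothetical switch; stands in for whatever --disable-system defines. */
#define SKETCH_DISABLE_SYSTEM 1

#if SKETCH_DISABLE_SYSTEM

/* No OS assumed: the lock type is a placeholder and locking is a no-op. */
typedef int sketch_mutex_t;
int sketch_mutex_lock(sketch_mutex_t *m)   { (void)m; return 0; }
int sketch_mutex_unlock(sketch_mutex_t *m) { (void)m; return 0; }

/* No OS timer assumed available: always report 0.0. */
double sketch_clock(void) { return 0.0; }

#else

#include <pthread.h>
#include <time.h>

/* OS available: forward to POSIX threads and a POSIX clock. */
typedef pthread_mutex_t sketch_mutex_t;
int sketch_mutex_lock(sketch_mutex_t *m)   { return pthread_mutex_lock(m); }
int sketch_mutex_unlock(sketch_mutex_t *m) { return pthread_mutex_unlock(m); }

double sketch_clock(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1.0e-9;
}

#endif
```

Code that brackets a critical section with sketch_mutex_lock()/sketch_mutex_unlock() compiles identically in both configurations.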
Field G. Van Zee
234b8b0cf4 Increased dotxaxpyf testsuite thresholds.
Details:
- Increased the test thresholds used by the dotxaxpyf testsuite module
  by a factor of five in order to avoid residuals that unnecessarily
  fall in the MARGINAL range. This commit should fix #455. Thanks to
  @nagsingh for reporting this issue.
2020-11-12 19:11:16 -06:00
Field G. Van Zee
2a0682f8e5 Implemented runtime subconfig selection (#451).
Details:
- Implemented support for the user manually overriding the automatic
  subconfiguration selection that happens at runtime. This override
  can be requested by setting the BLIS_ARCH_TYPE environment variable.
  The variable must be set to the arch_t id (as enumerated in
  bli_type_defs.h) corresponding to the desired subconfiguration. If a
  value outside this enumerated range is given, BLIS will abort with an
  error message. If the value is in the valid range but corresponds to a
  subconfiguration that was not activated at configure-time/compile-time,
  BLIS will abort with a (different) error message. Thanks to decandia50
  for suggesting this feature via issue #451.
- Defined a new function bli_gks_lookup_id to return the address of an
  internal data structure within the gks. If this address is NULL, then
  it indicates that the subconfig corresponding to the arch_t id passed
  into the function was not compiled into BLIS. This function is used
  in the second of the two abort scenarios described above.
- Defined the enumerated error code BLIS_UNINITIALIZED_GKS_CNTX, which
  is returned for the latter of the two abort scenarios mentioned above,
  along with a corresponding error message and a function to perform
  the error check.
- Added cpp macro branching to bli_env.c to support compilation of the
  auto-detect.x executable during configure-time. This cpp branch is
  similar to the cpp code already found in bli_arch.c and bli_cpuid.c.
- Cleaned up the auto_detect() function to facilitate easier maintenance
  going forward. Also added a convenient debug switch that outputs the
  compilation command for the auto-detect.x executable and exits.
2020-10-18 18:04:03 -05:00
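Sketch for 2a0682f8e5: the two failure cases described above (id out of range, id in range but not compiled in) can be separated as in the following illustration. Only the environment variable name BLIS_ARCH_TYPE comes from the commit message. The function names, the error codes, and the built[] table are hypothetical, and the sketch returns an error code where the commit says BLIS aborts with a message.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical result codes for this sketch. */
enum { SK_OK = 0, SK_ERR_RANGE = 1, SK_ERR_NOT_BUILT = 2 };

/* Interpret an override string. built[i] is nonzero when subconfiguration
   i was compiled in; detected is the automatically selected id. */
int parse_arch_sketch(const char *s, const int *built, int n_ids,
                      int detected, int *chosen)
{
    char *end;
    long id;

    /* Variable unset or empty: keep the automatic selection. */
    if (s == NULL || *s == '\0') { *chosen = detected; return SK_OK; }

    id = strtol(s, &end, 10);

    /* First failure case: not a number, or outside the enumerated range. */
    if (*end != '\0' || id < 0 || id >= n_ids) return SK_ERR_RANGE;

    /* Second failure case: valid id, but that subconfig was not built. */
    if (!built[id]) return SK_ERR_NOT_BUILT;

    *chosen = (int)id;
    return SK_OK;
}

/* Convenience wrapper that reads the environment variable itself. */
int arch_from_env_sketch(const int *built, int n_ids, int detected,
                         int *chosen)
{
    return parse_arch_sketch(getenv("BLIS_ARCH_TYPE"), built, n_ids,
                             detected, chosen);
}
```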
Field G. Van Zee
a69a4d7e2f Cleaned up bool_t usage and various typecasts.
Details:
- Fixed various typecasts in

    frame/base/bli_cntx.h
    frame/base/bli_mbool.h
    frame/base/bli_rntm.h
    frame/include/bli_misc_macro_defs.h
    frame/include/bli_obj_macro_defs.h
    frame/include/bli_param_macro_defs.h

  that were missing or being done improperly/incompletely. For example,
  many return values were being typecast as
    (bool_t)x && y
  rather than
    (bool_t)(x && y)
  Thankfully, none of these deficiencies had manifested as actual bugs
  at the time of this commit.
- Changed the return type of bli_env_get_var() from dim_t to gint_t.
  This reflects the fact that bli_env_get_var() needs to be able to
  return a signed integer, and even though dim_t is currently defined
  as a signed integer, it does not intuitively appear to necessarily be
  signed by inspection (i.e., an integer named "dim_t" for matrix
  "dimension"). Also, updated use of bli_env_get_var() within
  bli_pack.c to reflect the changed return type.
- Redefined type of thrcomm_t.barrier_sense field from bool_t to gint_t
  and added comments to the bli_thrcomm_*.h files that will explain a
  planned replacement of bool_t with C99's bool type.
- Note: These changes are being made to facilitate the substitution of
  'bool' for 'bool_t', which will eliminate the namespace conflict with
  arm_sve.h as reported in issue #420. This commit implements the first
  phase of that transition. Thanks to RuQing Xu for reporting this
  issue.
- CREDITS file update.
2020-07-22 16:13:09 -05:00
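Sketch for a69a4d7e2f: a cast binds more tightly than &&, so (T)x && y converts only x. The difference becomes observable when T is narrower than the operand. The 8-bit typedef below is a hypothetical stand-in chosen to make the truncation visible; it is not the definition of bool_t, and the commit notes that no actual bugs had been observed.

```c
/* Hypothetical 8-bit stand-in for a boolean typedef (assumes CHAR_BIT == 8). */
typedef unsigned char narrow_bool_sketch;

/* Cast applies to x alone; the truncated x is then combined with y. */
int cast_operand_only(int x, int y)
{
    return (narrow_bool_sketch)x && y;
}

/* Cast applies to the already-evaluated logical result (0 or 1). */
int cast_whole_expr(int x, int y)
{
    return (narrow_bool_sketch)(x && y);
}
```

With x = 256 and y = 1, the first form truncates 256 to 0 before the && and yields 0, while the second form evaluates 256 && 1 first and yields 1.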
Field G. Van Zee
72f6ed0637 Declare/define static functions via BLIS_INLINE.
Details:
- Updated all static function definitions to use the cpp macro
  BLIS_INLINE instead of the static keyword. This allows blis.h to
  use a different keyword (inline) to define these functions when
  compiling with C++, which might otherwise trigger "defined but
  not used" warning messages. Thanks to Giorgos Margaritis for
  reporting this issue and Devin Matthews for suggesting the fix.
- Updated the following files, which are used by configure's
  hardware auto-detection facility, to unconditionally #define
  BLIS_INLINE to the static keyword (since we know BLIS will be
  compiled with C, not C++):
    build/detect/config/config_detect.c
    frame/base/bli_arch.c
    frame/base/bli_cpuid.c
- CREDITS file update.
2020-07-03 17:55:54 -05:00
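Sketch for 72f6ed0637: the keyword switch described above can be expressed with a small macro. The macro name INLINE_SKETCH and the helper min_sketch are hypothetical; the choice of 'inline' under C++ and 'static' under C follows the commit message.

```c
/* Pick the function qualifier based on the language being compiled. */
#ifdef __cplusplus
  #define INLINE_SKETCH inline
#else
  #define INLINE_SKETCH static
#endif

/* A header-style helper defined through the macro. */
INLINE_SKETCH int min_sketch(int a, int b)
{
    return a < b ? a : b;
}
```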
Field G. Van Zee
b3c4201681 CREDITS file update. 2020-06-18 14:00:56 -05:00
Field G. Van Zee
477ce91c52 Moved #include "cpuid.h" to bli_cpuid.c.
Details:
- Relocated the #include "cpuid.h" directive from bli_cpuid.h to
  bli_cpuid.c. This was done because cpuid.h (which is pulled into
  the post-build blis.h developer header) doesn't protect its
  definitions with a preprocessor guard of the form:

    #ifndef FOOBAR_H
    #define FOOBAR_H
    // header contents.
    #endif

  and as a result, applications (previously) could not #include both
  blis.h and cpuid.h (since the former was already including the
  latter). Thanks to Bhaskar Nallani for raising this issue via #393
  and to Devin Matthews for suggesting this fix.
- CREDITS file update.
2020-04-22 14:26:49 -05:00
Satish Balay
da0c086f46 OSX: specify the full path to the location of libblis.dylib (#390)
* OSX: specify the full path to the location of libblis.dylib so that it can be found at runtime

Before this change:

Application gives a runtime error [when linked with blis]
dyld: Library not loaded: libblis.3.dylib

balay@kpro lib % otool -L libblis.dylib
libblis.dylib:
        libblis.3.dylib (compatibility version 0.0.0, current version 0.0.0)
        /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1281.0.0)

After this change:
balay@kpro lib % otool -L libblis.dylib
libblis.dylib:
	/Users/balay/petsc/arch-darwin-c-debug/lib/libblis.3.dylib (compatibility version 0.0.0, current version 0.0.0)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1281.0.0)

* INSTALL_LIBDIR -> libdir as INSTALL_LIBDIR has DESTDIR

Co-Authored-By: Jed Brown <jed@jedbrown.org>

* CREDITS file update.

Co-authored-by: Jed Brown <jed@jedbrown.org>
Co-authored-by: Field G. Van Zee <field@cs.utexas.edu>
2020-03-31 17:09:41 -05:00
Field G. Van Zee
e186d7141a Disabled optimized amaxv kernels.
Details:
- Disabled use of optimized amaxv kernels, which use vector intrinsics
  for both 's' and 'd' datatypes. We disable these kernels because the
  current implementations fail to observe a semantic property of the
  BLAS i?amax_() subroutine, which is to return the index of the
  *first* element containing the maximum absolute value (that is, the
  first element if there exist two or more elements that contain the
  same value). With the optimized kernels disabled, the affected
  subconfigurations (haswell, zen, zen2, knl, and skx) will use the
  default reference implementations. Thanks to Mat Cross for reporting
  this issue via #380.
- CREDITS file update.
2020-03-21 18:40:36 -05:00
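Sketch for e186d7141a: the first-occurrence property described above falls out naturally from a scalar loop that uses a strict comparison. The function below is a hypothetical illustration, not the BLIS reference kernel. It returns a zero-based index (BLAS i?amax_() is one-based) and does not address NaN handling.

```c
#include <stddef.h>

/* Return the zero-based index of the first element with the largest
   absolute value, or -1 when the vector is empty. */
long iamax_first_sketch(size_t n, const double *x)
{
    size_t i, best = 0;
    double best_abs;

    if (n == 0) return -1;

    best_abs = x[0] < 0.0 ? -x[0] : x[0];

    for (i = 1; i < n; ++i)
    {
        double a = x[i] < 0.0 ? -x[i] : x[i];

        /* Strict '>' means a later element that merely ties the current
           maximum never replaces the earlier index. */
        if (a > best_abs) { best_abs = a; best = i; }
    }

    return (long)best;
}
```

A vectorized kernel has to preserve the same tie-breaking when it reduces across lanes, which is the property the commit says the optimized kernels failed to observe.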
Field G. Van Zee
d7a7679182 Fixed int-to-packbuf_t conversion error (C++ only).
Details:
- Fixed an error that manifests only when using C++ (specifically,
  modern versions of g++) to compile drivers in 'test' (and likely most
  other application code that #includes blis.h). Thanks to Ajay Panyala
  for reporting this issue (#374).
2020-02-07 17:37:03 -06:00
Field G. Van Zee
7d3407d468 CREDITS file update. 2020-01-14 15:17:53 -06:00
Field G. Van Zee
5271107378 Fixed bugs in cblas_sdsdot(), sdsdot_().
Details:
- Fixed a bug in sdsdot_sub() that redundantly added the "alpha" scalar,
  named 'sb'. This value was already being added by the underlying
  sdsdot_() function. Thus, we no longer add 'sb' within sdsdot_sub().
  Thanks to Simon Lukas Märtens for reporting this bug via #367.
- Fixed a second bug in order of typecasting intermediate products in
  sdsdot_(). Previously, the "alpha" scalar was being added after the
  "outer" typecast to float. However, the operation is supposed to first
  add the dot product to the (promoted) scalar and THEN downcast the sum
  to float. Thanks to Devin Matthews for catching this bug.
2019-12-16 16:30:26 -06:00
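Sketch for 5271107378: the second fix concerns where the scalar is added relative to the downcast. The two functions below are hypothetical illustrations of the two orderings, not the BLIS code: the first adds the promoted scalar to the double-precision dot product and then downcasts the sum; the second downcasts first and adds afterwards, which is the ordering the commit describes as the bug.

```c
/* Intended ordering: accumulate in double, including sb, then downcast. */
float sdsdot_sketch(int n, float sb, const float *x, const float *y)
{
    double acc = (double)sb;
    int i;

    for (i = 0; i < n; ++i)
        acc += (double)x[i] * (double)y[i];

    return (float)acc;
}

/* Ordering described as buggy: downcast the dot product, then add sb. */
float sdsdot_late_add_sketch(int n, float sb, const float *x, const float *y)
{
    double acc = 0.0;
    int i;

    for (i = 0; i < n; ++i)
        acc += (double)x[i] * (double)y[i];

    return (float)acc + sb;
}
```

The two can differ when the double-precision sum carries low-order bits that are lost by an early downcast, for example when the dot product is 1 + 2^-24 and sb is 2^-24.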
Field G. Van Zee
fe2560a4b1 Annotated missing thread-related symbols for export.
Details:
- Added BLIS_EXPORT_BLIS annotation to function prototypes for

    bli_thrcomm_bcast()
    bli_thrcomm_barrier()
    bli_thread_range_sub()

  so that these functions are exported to shared libraries by default.
  This (hopefully) fixes issue #366. Thanks to Kyungmin Lee for
  reporting this bug.
- CREDITS file update.
2019-12-06 17:12:44 -06:00
Field G. Van Zee
8f399c8940 Tweaked/added notes to docs/Multithreading.md.
Details:
- Added language to docs/Multithreading.md cautioning the reader about
  the nuances of setting multithreading parameters via the manual and
  automatic ways simultaneously, and also about how these parameters
  behave when multithreading is disabled at configure-time. These
  changes are an attempt to address the issues that arose in issue #362.
  Thanks to Jérémie du Boisberranger for his feedback on this topic.
- CREDITS file update.
2019-11-12 15:32:57 -06:00
Field G. Van Zee
bc16ec7d1e Set execute bits of shared library at install-time.
Details:
- Modified the 0644 octal code used during installation of shared
  libraries to 0755 (for Linux/OSX only). Thanks to Adam J. Stewart
  for reporting this issue via #343.
- CREDITS file update.
2019-09-23 15:37:33 -05:00
Field G. Van Zee
fd9bf497cd CREDITS file update. 2019-09-17 15:45:24 -05:00
Devin Matthews
7c78191457 Always use sqsumv to compute normfv. (#334)
* Always use sqsumv to compute normfv on MacOS.

* Unconditionally disable the "dot trick" in normfv.

* Added explanatory comment to normfv definition.

Details:
- Added a comment above the unconditional disabling of the dotv-based
  implementation to normfv. Thanks to Roman Yurchak, Devin Matthews,
  and Isuru Fernando for helping with this improvement.
- CREDITS file update.
2019-08-30 16:52:09 -05:00
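Sketch for 7c78191457: the commit does not show sqsumv itself, so the following is only a generic scaled sum-of-squares loop in the style of LAPACK's lassq, under the assumption that sqsumv follows a similar pattern. It returns scale and sumsq such that scale^2 * sumsq equals the sum of squares; the Frobenius norm is then scale * sqrt(sumsq). The function name is hypothetical.

```c
#include <stddef.h>

/* Compute scale and sumsq with scale*scale*sumsq == sum of x[i]^2,
   without ever squaring a raw element. */
void sqsum_sketch(size_t n, const double *x, double *scale, double *sumsq)
{
    double s = 0.0;  /* largest absolute value seen so far */
    double q = 1.0;  /* sum of squares relative to s       */
    size_t i;

    for (i = 0; i < n; ++i)
    {
        double a = x[i] < 0.0 ? -x[i] : x[i];

        if (a == 0.0) continue;

        if (a > s)
        {
            /* New maximum: rescale the running sum to the new scale. */
            double r = s / a;
            q = 1.0 + q * r * r;
            s = a;
        }
        else
        {
            double r = a / s;
            q += r * r;
        }
    }

    *scale = s;
    *sumsq = q;
}
```

For inputs whose squares would exceed the double range, a plain dot product of the vector with itself overflows, whereas the scaled form keeps every intermediate value representable.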
Field G. Van Zee
0f1b3bf49e ReleaseNotes.md update in advance of next version.
Details:
- Updated ReleaseNotes.md in preparation for next version.
- CREDITS file update.
2019-06-03 18:35:19 -05:00
Field G. Van Zee
89a70cccf8 GNU-like handling of installation prefix et al.
Details:
- Changed the default installation prefix from $HOME/lib to /usr/local.
- Modified the way configure internally handles the prefix, libdir,
  includedir, and sharedir (and also added an --exec-prefix option).
  The defaults to these variables are set as follows:
    prefix:      /usr/local
    exec_prefix: ${prefix}
    libdir:      ${exec_prefix}/lib
    includedir:  ${prefix}/include
    sharedir:    ${prefix}/share
  The key change, aside from the addition of exec_prefix and its use to
  define the default to libdir, is that the variables are substituted
  into config.mk with quoting that delays evaluation, meaning the
  substituted values may contain unevaluated references to other
  variables (namely, ${prefix} and ${exec_prefix}). This more closely
  follows GNU conventions, including those used by GNU autoconf, and
  also allows make to override any one of the variables *after*
  configure has already been run (e.g. during 'make install').
- Updates to build/config.mk.in pursuant to above changes.
- Updates to output of 'configure --help' pursuant to above changes.
- Updated docs/BuildSystem.md to reflect the new default installation
  prefix, as well as mention EXECPREFIX and SHAREDIR.
- Changed the definitions of the UNINSTALL_OLD_* variables in the
  top-level Makefile to use $(wildcard ...) instead of 'find'. This
  was motivated by the new way of handling prefix and friends, which
  leads to the 'find' command being run on /usr/local (by default),
  which can take a while and almost never yields any benefit (since the
  user will very rarely use the uninstall-old targets).
- Removed periods from the end of descriptive output statements (i.e.,
  non-verbose output) since those statements often end with file or
  directory paths, which get confusing to read when punctuated by a
  period.
- Trivial change to 'make showconfig' output.
- Removed my name from 'configure --help'. (Many have contributed to it
  over the years.)
- In configure script, changed the default state of threading_model
  variable from 'no' to 'off' to match that of debug_type, where there
  are similarly more than two valid states. ('no' is still accepted
  if given via the --enable-debug= option, though it will be
  standardized to 'off' prior to config.mk being written out.)
- Minor variable name change in flatten-headers.py that was intended for
  32812ff.
- CREDITS file update.
2019-04-11 18:33:08 -05:00
Field G. Van Zee
2c85e1dd9d Added Eigen results to performance graphs.
Details:
- Updated the Haswell, SkylakeX, and Epyc performance graphs in
  docs/graphs to report on Eigen implementations, where applicable.
  Specifically, Eigen implements all level-3 operations sequentially;
  however, of those operations it only provides multithreaded gemm.
  Thus, mt results for symm/hemm, syrk/herk, trmm, and trsm are
  omitted. Thanks to Sameer Agarwal for his help configuring and
  using Eigen.
- Updated docs/Performance.md to note the new implementation tested.
- CREDITS file update.
2019-03-27 16:29:51 -05:00
Field G. Van Zee
913cf97653 Added docs/Performance.md and docs/graphs subdir.
Details:
- Added a new markdown document, docs/Performance.md, which reports
  performance of a representative set of level-3 operations across a
  variety of hardware architectures, comparing BLIS to OpenBLAS and a
  vendor library (MKL on Intel/AMD, ARMPL on ARM). Performance graphs,
  in pdf and png formats, reside in docs/graphs.
- Updated README.md to link to new Performance.md document.
- Minor updates to CREDITS, docs/Multithreading.md.
- Minor updates to matlab scripts in test/3/matlab.
2019-03-19 16:15:24 -05:00
Field G. Van Zee
5a5f494e42 Removed export macros from all internal prototypes.
Details:
- After merging PR #303, at Isuru's request, I removed the use of
  BLIS_EXPORT_BLIS from all function prototypes *except* those that we
  potentially wish to be exported in shared/dynamic libraries. In other
  words, I removed the use of BLIS_EXPORT_BLIS from all prototypes of
  functions that can be considered private or for internal use only.
  This is likely the last big modification along the path towards
  implementing the functionality spelled out in issue #248. Thanks
  again to Isuru Fernando for his initial efforts of sprinkling the
  export macros throughout BLIS, which made removing them where
  necessary relatively painless. Also, I'd like to thank Tony Kelman,
  Nathaniel Smith, Ian Henriksen, Marat Dukhan, and Matthew Brett for
  participating in the initial discussion in issue #37 that was later
  summarized and restated in issue #248.
- CREDITS file update.
2019-03-12 18:45:09 -05:00
Field G. Van Zee
fffc23bb35 CREDITS file update. 2019-01-25 13:35:31 -06:00
Field G. Van Zee
ad8d9adb09 README.md, CREDITS update.
Details:
- Added "What's New" and "What People Are Saying About BLIS" sections to
  README.md.
- Added missing github handles to various individuals' entries in the
  CREDITS file.
2019-01-03 16:08:24 -06:00
Field G. Van Zee
93d56319f2 Added missing bli_init_once() in bli_thread API.
Details:
- Fixed an issue with specifying threading globally at runtime via
  bli_thread_set_num_threads() (the automatic way) or via
  bli_thread_set_ways() (the manual way), with bli_thread_init_rntm()
  also affected. These functions were not calling bli_init_once() prior
  to acting, and therefore their effects on the global rntm_t structure
  were being wiped out by the eventual call to bli_init_once(), by some
  other BLIS function. Thanks to Ali Emre Gülcü for reporting the
  behavior associated with this bug.
- Added additional content to docs/Multithreading.md covering topics of
  choosing between OpenMP and pthreads, and specifying affinity via
  OpenMP.
- CREDITS file update.
2018-12-17 19:17:30 -06:00
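Sketch for 93d56319f2: the bug pattern described above is a setter that writes global state before lazy initialization has run, so the later initialization overwrites the setting. A minimal single-threaded illustration, with hypothetical names and a plain flag standing in for the real once mechanism:

```c
/* Hypothetical global state and lazy initializer. */
static int initialized_sketch = 0;
static int global_nt_sketch   = 0;

static void init_once_sketch(void)
{
    if (!initialized_sketch)
    {
        global_nt_sketch   = 1;  /* default installed on first use */
        initialized_sketch = 1;
    }
}

/* Setter that initializes first, so its write cannot be wiped out by a
   later first-use initialization. */
void set_num_threads_sketch(int nt)
{
    init_once_sketch();
    global_nt_sketch = nt;
}

int get_num_threads_sketch(void)
{
    init_once_sketch();
    return global_nt_sketch;
}
```

If the call to init_once_sketch() were dropped from the setter, a set followed by a get would report the default value of 1 instead of the requested count.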
Field G. Van Zee
dc18409551 CREDITS file update. 2018-11-28 11:58:40 -06:00
Field G. Van Zee
7b02c72665 CREDITS file update. 2018-11-14 13:49:55 -06:00
Field G. Van Zee
57eab3a4f0 CREDITS file update. 2018-10-17 11:29:20 -05:00
Field G. Van Zee
667d3929ee Added Fortran APIs for some thread functions.
Details:
- Defined Fortran-77 compatible APIs for bli_thread_set_num_threads()
  and bli_thread_set_ways(). These wrappers are defined in
  frame/compat/blis/thread/b77_thread.c. Thanks to Kay Dewhurst for
  suggesting these new interfaces.
- Added missing prototype for bli_thread_set_ways() in bli_thread.h and
  removed prototypes for non-existent functions bli_thread_set_*_nt().
- CREDITS file update.
2018-10-11 11:47:57 -05:00
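Sketch for 667d3929ee: a Fortran-77 compatible wrapper typically differs from the C function in two ways: arguments arrive by reference, and the symbol carries whatever decoration the Fortran compiler expects. The example below assumes the common lowercase-plus-trailing-underscore convention; all names are hypothetical and are not the symbols defined in b77_thread.c.

```c
/* Hypothetical C-side state and API. */
static int nt_state_sketch = 1;

void c_set_num_threads_sketch(int nt) { nt_state_sketch = nt; }
int  c_get_num_threads_sketch(void)   { return nt_state_sketch; }

/* Fortran-callable wrapper: the argument is passed by reference, and the
   trailing underscore matches an assumed name-mangling convention. */
void f77_set_num_threads_sketch_(const int *nt)
{
    c_set_num_threads_sketch(*nt);
}
```

From Fortran, under the assumed convention, this would be invoked as a subroutine call with an integer argument.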
Field G. Van Zee
b952ca8feb CREDITS file update. 2018-09-28 16:12:32 -05:00
Field G. Van Zee
4f6745d68a Fixed link error when building only shared library.
Details:
- Fixed a linker error that occurred when attempting to compile and link
  the testsuite and/or BLAS test drivers after having configured BLIS to
  only generate a shared library (no static library). The chosen
  solution involved
  (1) adding the local library path, $(BASE_LIB_PATH), to the search
      paths for the shared library via the link option
      -Wl,-rpath,$(BASE_LIB_PATH).
  (2) adding a local symlink to $(BASE_LIB_PATH) that uses the .so major
      version number so that ld would find the shared library at
      execution time.
  Thanks to Sajid Ali for reporting this issue, to Devin Matthews for
  pointing out the need for the -rpath option, and to Devangi Parikh for
  helping Sajid isolate the problem.
- Added #include <ctype.h> to bli_system.h to avoid a compiler warning
  resulting from using toupper() from bli_string.c without a prototype.
  Thanks again to Sajid Ali, whose build log revealed this compiler
  warning.
- Added '*.so.*' to .gitignore.
- CREDITS file update.
2018-08-14 16:50:47 -05:00
Field G. Van Zee
29c34c4adb CREDITS file update. 2018-07-27 16:26:19 -05:00
Field G. Van Zee
a8b4084a0e CREDITS file update. 2018-07-27 16:07:26 -05:00
Field G. Van Zee
8e10cac5f3 Updates to CREDITS, RELEASING, config/README.md.
Details:
- Added individuals' github handles to CREDITS file.
- Updated RELEASING, config/README.md files.
2018-07-27 14:45:35 -05:00
Field G. Van Zee
038442add3 Added -lpthread to makefile example in BuildSystem.md.
Details:
- Added missing pthreads library linking to example makefile in
  docs/BuildSystem.md, as well as similar language to build requirements
  at the beginning of the document. Thanks to Stefanos Mavros for
  bringing this to our attention.
- Updated CREDITS file.
2018-07-11 12:24:18 -05:00
Field G. Van Zee
89e178ce38 Merge branch 'master' into dev 2018-07-04 17:51:16 -05:00
Field G. Van Zee
e32b2ef983 Update to CREDITS file. 2018-07-04 17:49:39 -05:00
Field G. Van Zee
195480beb5 Merge branch 'master' into dev 2018-06-25 13:24:21 -05:00
Field G. Van Zee
07c3d0a951 Update to CREDITS file. 2018-06-21 12:35:07 -05:00
Field G. Van Zee
2000cdff59 Update to CREDITS file. 2018-06-18 14:17:28 -05:00
Field G. Van Zee
5df201260f Merge branch 'master' into dev 2018-06-05 16:14:19 -05:00
Field G. Van Zee
2c6d99b99e Fixed names out of alphabetical order in CREDITS. 2018-06-03 18:13:36 -05:00