Commit Graph

379 Commits

Author SHA1 Message Date
Tyler Smith
3e7b0db5b0 Merge branch 'master' of https://github.com/flame/blis 2014-07-23 13:40:44 -05:00
Tyler Smith
2f8a357de5 Some TRSM threading fixes/additions 2014-07-23 13:40:12 -05:00
Field G. Van Zee
ed3e33d548 Tweaked behavior of herk, her2k for BLAS compat.
Details:
- Updated herk_front() and her2k_front() to explicitly set the imaginary
  components of the diagonal entries of C to zero after the computation
  is complete. This is needed in case downstream applications read the
  full diagonal entries (i.e., including imaginary part), which could, in
  the absence of this modification, accumulate numerical error from
  subsequent rank-k/rank-2k updates.
- Updated BLAS compatibility wrappers for herk and her2k to return early
  if:
    n == 0 || ( ( alpha == 0 || k == 0 ) && beta == 1 )
  This also results in the imaginary components of diagonal entries NOT
  being set to zero (see above), which is consistent with BLAS.
- Updated mkherm to use setid instead of an inlined loop over the
  diagonal.
2014-07-22 14:40:43 -05:00
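The early-return test added to the herk and her2k compatibility wrappers can be sketched as follows. This is a minimal illustration, not the BLIS source: the name herk_returns_early is hypothetical, and alpha is treated as real (as it is for herk; her2k takes a complex alpha).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not part of BLIS: evaluates the early-return
   condition quoted in the commit message above. */
bool herk_returns_early(int n, int k, double alpha, double beta)
{
    /* Assumed precondition: dimensions are non-negative. */
    assert(n >= 0 && k >= 0);

    return n == 0 || ((alpha == 0.0 || k == 0) && beta == 1.0);
}
```

When this predicate holds, the wrapper returns before the computation, so the step that zeroes the imaginary parts of the diagonal never runs.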
Field G. Van Zee
ea59a5c93c Added new level-1d operation: setid.
Details:
- Defined a new level-1d operation, setid, which sets the imaginary
  elements of an object's diagonal to a single scalar. This can be
  useful, for example, when trying to make the diagonal of a Hermitian
  matrix real-valued.
2014-07-22 14:36:02 -05:00
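A rough sketch of what an operation like setid does, assuming a column-major matrix of C99 double complex values. The function name set_imag_diag and its parameter list are illustrative assumptions, not the BLIS API, which operates on BLIS objects.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Hypothetical stand-in for setid: overwrite the imaginary part of every
   diagonal element of a column-major m x n matrix (leading dimension lda)
   with imag_value, leaving the real parts untouched. */
void set_imag_diag(size_t m, size_t n, double complex *a, size_t lda,
                   double imag_value)
{
    assert(a != NULL || m == 0 || n == 0);

    size_t min_dim = m < n ? m : n;
    for (size_t i = 0; i < min_dim; ++i) {
        /* C99 guarantees a complex value is laid out as { real, imag }. */
        double *parts = (double *)&a[i + i * lda];
        parts[1] = imag_value;
    }
}
```

Calling set_imag_diag(m, m, a, lda, 0.0) makes the diagonal of a square matrix real-valued, which is the Hermitian use case the commit message mentions.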
Field G. Van Zee
8965a96593 Merge branch 'master' of github.com:flame/blis 2014-07-22 14:34:32 -05:00
Field G. Van Zee
1785efb542 Minor improvements to invertd and setd.
Details:
- Added missing call to invertd_check() from front-end.
- Changed setd front-end call of scald_check() to setd_check().
2014-07-22 14:33:01 -05:00
Field G. Van Zee
5b73e80b71 Merge pull request #16 from Maratyszcza/emscripten
Emscripten port
2014-07-18 12:21:20 -05:00
Field G. Van Zee
a41e68e09e Reimplemented BLIS initialization/finalization.
Details:
- Rewrote bli_init() and bli_finalize() with OpenMP critical sections
  for thread-safety. Also added lots of explanatory comments.
- Renamed bli_init_safe() and bli_finalize_safe() with the _auto()
  suffix, and reimplemented for simplicity. Updated all invocations
  in BLAS compatibility layer to use _auto() suffix.
2014-07-17 13:25:56 -05:00
Field G. Van Zee
36358948ea Retired frame/3/gemm/other directory.
Details:
- Removed frame/3/gemm/other directory, which contained some outdated
  and/or experimental variants.
2014-07-17 10:58:10 -05:00
Field G. Van Zee
c73261f17e More minor cleanups post-copyright update. 2014-07-14 16:23:51 -05:00
Field G. Van Zee
2a09d24463 Reverted power7 symlinks destroyed by sed script.
Details:
- Reverted two symlinks, in kernels/power7/3/test, back to being symlinks
  after recursive-sed.sh mistakenly replaced them with copies of the
  actual files to which they referred. Meant to include this in previous
  commit.
2014-07-14 16:17:09 -05:00
Field G. Van Zee
7ed415824d Updated copyright headers (continued).
Details:
- Inserted "at Austin" into third clause of license declarations.
  Meant to include this change in previous commit.
2014-07-14 16:14:33 -05:00
Field G. Van Zee
5c2c6c8561 Updated copyright headers to contain "at Austin".
Details:
- Updated copyright headers to include "at Austin" in the name of the
  University of Texas.
- Updated the copyright years of a few headers to 2014 (from 2011 and
  2012).
2014-07-14 16:05:03 -05:00
Field G. Van Zee
fcec68cda3 Merge branch 'master' of github.com:flame/blis 2014-07-14 11:35:34 -05:00
Field G. Van Zee
94c0df797e Changed order of zero dim / error checking.
Details:
- Updated level-2 and level-3 internal back-ends so that the operation's
  _check() function is called BEFORE any attempt to return early due to
  the presence of zero dimensions. This ordering makes more sense because
  (for example) object dimensions should match even if one of them is
  zero. Previously, a dimension mismatch could result in an early return
  with no error message.
- Updated bli_check_object_buffer() so that NULL buffers result in an
  error only if the object is dimensionally non-empty (i.e., only if both
  of the object's dimensions are non-zero). This allows BLIS operations
  to be performed on dimensionally empty objects (i.e., where at least one
  dimension is zero).
- Updated the error message associated with bli_check_object_buffer()
  to mention the newly relaxed constraint mentioned above, vis-a-vis
  non-zero dimensions.
2014-07-14 11:24:36 -05:00
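The ordering described above (checks first, zero-dimension early return second) and the relaxed NULL-buffer rule can be sketched as follows. The types and function names here (obj_sketch_t, buffer_is_ok, copy_op_sketch) are hypothetical simplifications; real BLIS objects and _check() functions carry considerably more state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified object: dimensions plus a data buffer. */
typedef struct {
    size_t m;
    size_t n;
    void  *buffer;
} obj_sketch_t;

typedef enum {
    SKETCH_SUCCESS,
    SKETCH_DIM_MISMATCH,
    SKETCH_NULL_BUFFER
} sketch_err_t;

/* A NULL buffer is an error only if both dimensions are non-zero. */
bool buffer_is_ok(const obj_sketch_t *o)
{
    assert(o != NULL);
    bool is_empty = (o->m == 0 || o->n == 0);
    return o->buffer != NULL || is_empty;
}

/* Skeleton of a two-operand operation: run the checks BEFORE any
   early return on zero dimensions. The actual computation is omitted. */
sketch_err_t copy_op_sketch(const obj_sketch_t *a, const obj_sketch_t *b)
{
    if (a->m != b->m || a->n != b->n)
        return SKETCH_DIM_MISMATCH;
    if (!buffer_is_ok(a) || !buffer_is_ok(b))
        return SKETCH_NULL_BUFFER;

    if (a->m == 0 || a->n == 0)
        return SKETCH_SUCCESS;   /* nothing to do */

    /* ... the operation's real work would go here ... */
    return SKETCH_SUCCESS;
}
```

With the checks placed first, a 0 x 3 operand paired with a 0 x 4 operand reports a mismatch instead of returning silently.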
Marat Dukhan
20690fe301 Emscripten port 2014-07-13 22:50:56 -07:00
Field G. Van Zee
4a20ed1a3f Merge pull request #14 from Maratyszcza/master
Support "make test" for PNaCl configuration
2014-07-13 17:45:01 -05:00
Field G. Van Zee
6a515e988f Implemented dsdot() and sdsdot() in compat layer.
Details:
- Replaced "not yet implemented" error messages in dsdot() and sdsdot()
  with actual implementations. (These routines are so rarely used that
  this log message will probably lead to some people learning of their
  existence for the first time.)
2014-07-13 17:38:33 -05:00
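For readers meeting these routines for the first time: both compute a dot product of single-precision vectors while accumulating in double precision. dsdot returns the double result; sdsdot adds a single-precision scalar and rounds the result back to float. The sketch below illustrates that idea only. The _sketch names are hypothetical, and only positive strides are handled, which is a simplifying assumption.

```c
#include <assert.h>

/* Dot product of two float vectors, accumulated and returned in double. */
double dsdot_sketch(int n, const float *x, int incx,
                    const float *y, int incy)
{
    assert(incx > 0 && incy > 0);  /* assumption: positive strides only */

    double acc = 0.0;
    for (int i = 0; i < n; ++i)
        acc += (double)x[i * incx] * (double)y[i * incy];
    return acc;
}

/* Same accumulation, seeded with the scalar sb, rounded to float at the end. */
float sdsdot_sketch(int n, float sb, const float *x, int incx,
                    const float *y, int incy)
{
    assert(incx > 0 && incy > 0);  /* assumption: positive strides only */

    double acc = (double)sb;
    for (int i = 0; i < n; ++i)
        acc += (double)x[i * incx] * (double)y[i * incy];
    return (float)acc;
}
```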
Field G. Van Zee
255668ddd1 Inserted gemv beta-scaling bug into compat layer.
Details:
- BLAS has a peculiar bug (or feature) whereby calling gemv on a vector
  y of non-zero length and a vector x of zero length results in no action.
  Given that the operation is y := beta*y + A*x, many (most?) individuals
  would expect vector y to still be scaled by beta. BLIS, when called
  natively, handles these cases intuitively (with beta scaling).
  Unfortunately, many BLAS test suites actually check for the way this
  situation is handled. Therefore, we have decided to implement this "bug"
  in the compatibility layer so as to provide "bug-for-bug" compatibility
  with BLAS.
2014-07-13 17:30:44 -05:00
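The difference between the two behaviors can be illustrated with a small reference loop for y := beta*y + A*x on a column-major, real-valued matrix. This is only a sketch: gemv_sketch and its blas_compat flag are hypothetical, and BLIS does not select the behavior through a runtime flag like this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reference gemv: y := beta*y + A*x, with A an m x n
   column-major matrix (leading dimension lda). */
void gemv_sketch(size_t m, size_t n, const double *a, size_t lda,
                 const double *x, double beta, double *y, bool blas_compat)
{
    assert(y != NULL || m == 0);

    /* "Bug-for-bug" BLAS behavior: an empty x (n == 0) means no action,
       so y is NOT scaled by beta. */
    if (blas_compat && (m == 0 || n == 0))
        return;

    /* Native-style behavior: y is always scaled, even when n == 0. */
    for (size_t i = 0; i < m; ++i) {
        double acc = 0.0;
        for (size_t j = 0; j < n; ++j)
            acc += a[i + j * lda] * x[j];
        y[i] = beta * y[i] + acc;
    }
}
```

With m = 2, n = 0, and beta = 3, the compatibility path leaves y untouched, while the native path triples each element of y.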
Field G. Van Zee
570a154581 Comment/formatting updates to build scripts.
Details:
- Minor updates to comments and formatting in bump-version.sh and
  update-version-file.sh scripts.
2014-07-12 17:51:05 -05:00
Field G. Van Zee
26cd819906 Added bli_info_*() query functions.
Details:
- Added a new API family, bli_info_*(), which can be used to query
  information about how BLIS was configured. Most of these values are
  returned as gint_t, with the exception of the version string which
  is char*.
- Changed how the testsuite driver queries information about how BLIS
  was configured (from using macro constants directly to using the
  new bli_info API).
- Removed bli_version.c and its header file.
- Added STRINGIFY_INT() macro to bli_macro_defs.h
- Renamed info_t type in bli_type_defs.h to objbits_t (not because of
  an actual naming conflict, but because the name 'info_t' would now be
  somewhat misleading in the presence of the new bli_info API, as the
  two are unrelated).
2014-07-10 13:16:07 -05:00
Field G. Van Zee
970b431416 Minor bugfixes to BLAS compatibility layer.
Details:
- Changed bla_amax.c so that i?amax() routines now correctly return 0
  if ( n < 1 || incx <= 0 ).
- Changed bla_rotg.c and bla_rotmg.c to use bli_fabs() macro instead of
  f2c's abs() macro for float and double cases.
- Thanks to Murtaza Ali for suggesting the two fixes above.
- Updated label of fnormv to normfv in testsuite/input.operations.
2014-07-10 09:30:00 -05:00
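The corrected i?amax() return rule can be sketched for the double-precision case as below. The name idamax_sketch and the local abs_double helper are illustrative assumptions; the result follows the BLAS convention of a 1-based index of the first element with the largest absolute value.

```c
#include <assert.h>
#include <stddef.h>

/* Local absolute-value helper so the sketch needs no math library. */
static double abs_double(double v)
{
    return v < 0.0 ? -v : v;
}

/* Hypothetical idamax: 1-based index of the first element of largest
   magnitude, or 0 if n < 1 or incx <= 0. */
int idamax_sketch(int n, const double *x, int incx)
{
    if (n < 1 || incx <= 0)
        return 0;

    assert(x != NULL);

    int    best_index = 1;
    double best_abs   = abs_double(x[0]);
    for (int i = 1; i < n; ++i) {
        double v = abs_double(x[(size_t)i * (size_t)incx]);
        if (v > best_abs) {      /* strict ">" keeps the first maximum */
            best_abs   = v;
            best_index = i + 1;
        }
    }
    return best_index;
}
```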
Marat Dukhan
8ccdfaef4c Replicated logic from testsuite/Makefile in top-level Makefile to support make test 2014-07-08 23:14:36 -07:00
Field G. Van Zee
caa6507ff3 Minor cleanup to standalone test drivers.
Details:
- Very minor code changes to standalone test drivers in 'test' directory.
- Added *.so files to '.gitignore'.
2014-07-08 10:25:27 -05:00
Field G. Van Zee
6c65e9a58f Merge branch 'master' of github.com:flame/blis 2014-07-08 10:13:49 -05:00
Field G. Van Zee
cb12e456f9 Fixed possible level-3 inf/NaN issue when beta=0.
Details:
- Redefined xpbys_mxn and xpbys_mxn_u/_l macros to employ a copy
  (instead of scaling by beta) when beta is zero. This will stamp out
  any possible infs or NaNs in the output matrix, if it happens to be
  uninitialized. Thanks to Tony Kelman for isolating this bug.
2014-07-08 10:07:46 -05:00
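The idea behind the fix can be sketched for a real-valued, column-major m x n block, where the update is y := x + beta*y. The function xpbys_sketch below is a hypothetical illustration, not the BLIS macro: when beta is zero it copies x into y rather than multiplying, because 0 * inf and 0 * NaN both yield NaN.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch of y := x + beta*y on column-major m x n blocks
   with leading dimensions ldx and ldy. */
void xpbys_sketch(size_t m, size_t n, const double *x, size_t ldx,
                  double beta, double *y, size_t ldy)
{
    assert((x != NULL && y != NULL) || m == 0 || n == 0);

    for (size_t j = 0; j < n; ++j) {
        for (size_t i = 0; i < m; ++i) {
            if (beta == 0.0) {
                /* Copy: ignores whatever y held, including inf or NaN. */
                y[i + j * ldy] = x[i + j * ldx];
            } else {
                y[i + j * ldy] = x[i + j * ldx] + beta * y[i + j * ldy];
            }
        }
    }
}
```

Without the beta == 0 branch, an uninitialized output element holding inf or NaN would survive the update as NaN.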
Tyler Smith
daca500db5 Merge branch 'master' of http://github.com/flame/blis 2014-07-03 12:52:52 -05:00
Field G. Van Zee
4702350278 Defined _ukernel_void() wrappers to micro-kernels.
Details:
- Added wrappers for micro-kernels so that users may invoke the
  micro-kernels without knowing what the function names actually are.
  This is useful when an application wishes to call the micro-kernel
  from a shared library instance of BLIS, where the application may not
  necessarily have the luxury of grabbing the micro-kernel name(s) from
  C preprocessor macros at compile-time. Also, since the wrappers use
  void* pointers, one's environment does not need to be aware of some
  BLIS types such as scomplex and dcomplex. These wrappers now join the
  level-1 and level-1f kernel wrappers, which pre-dated this commit.
- Removed the wrapper definitions and prototypes from the micro-kernel
  test suite modules, and replaced calls to them with calls to the new
  wrappers mentioned above.
2014-07-03 11:48:23 -05:00
Tyler Smith
ab3bc9153b Fixed a bug for TRSM when BLIS_ENABLE_MULTITHREADING is not set but the multithreading environment variables are turned on 2014-07-03 11:19:43 -05:00
Tyler Smith
b8134b720b Quick and dirty multithreading for TRSM
Should work fine for a small number of threads (up to 8 or maybe even 16).
However, performance is as yet untested.
This parallelizes the "JR" loop for the left sided cases
and the "IR" loop for the right sided cases.

Future work is to parallelize the outer loops as well.
2014-07-02 16:02:39 -05:00
Field G. Van Zee
e8ef696928 Added shared library support to build system.
Details:
- Modified top-level Makefile to support building shared (dynamic)
  libraries.
- Updated most configurations' make_defs.mk files to include necessary
  compiler/linker flags needed by top-level Makefile.
- Note that by default, all configurations presently do NOT build
  shared libraries. To enable, one must change the value of
  BLIS_ENABLE_DYNAMIC_BUILD to 'yes'.
2014-07-02 14:59:27 -05:00
Field G. Van Zee
b80df0f2cf Added bump-version.sh script to 'build' directory.
Details:
- Added a bash script, bump-version.sh, to aid in incrementing the BLIS
  version string.
2014-06-23 13:52:39 -05:00
Field G. Van Zee
9ef1f1e21d CHANGELOG update (0.1.3) 2014-06-23 13:48:17 -05:00
Field G. Van Zee
036cc63491 Version file update (0.1.3) 0.1.3 2014-06-23 13:48:17 -05:00
Field G. Van Zee
09d9a3bf67 Reverting version file to test new version script.
Details:
- Changed version file contents to 0.1.2 so that I can test out a new
  version file bumping script.
2014-06-23 13:43:26 -05:00
Field G. Van Zee
ebb3396598 Added 'version' file. 2014-06-23 11:22:50 -05:00
Field G. Van Zee
2cb9a5501a Removed 'version' from .gitignore file. 2014-06-23 10:42:29 -05:00
Field G. Van Zee
b40dcefc5e Merge pull request #11 from Maratyszcza/stable
[sc]axpy kernels for PNaCl
2014-06-23 10:39:05 -05:00
Marat Dukhan
b693b0cddc [SC]AXPY kernels for PNaCl 2014-06-22 13:44:25 -07:00
Field G. Van Zee
7101a8eec0 Merge pull request #10 from Maratyszcza/stable
Portable Native Client port
2014-06-19 21:46:50 -05:00
Marat Dukhan
020a831bc5 Code clean-up in PNaCl port 2014-06-19 00:58:26 -07:00
Marat Dukhan
491be4f91e Optimized dot product kernels for PNaCl 2014-06-19 00:45:44 -07:00
Marat Dukhan
4b8e71aab8 Use AR rcs flags for PNaCl target to avoid warning 2014-06-19 00:43:25 -07:00
Marat Dukhan
031deb2a5c PNaCl configuration: use pnacl-ar instead of ar (fixes build issue on Mac) 2014-06-18 03:11:34 -07:00
Marat Dukhan
68a02976e3 Compile pnacl configuration in GNU11 mode to avoid warning about non-standard features 2014-06-18 03:10:25 -07:00
Marat Dukhan
6f8462eb0e Fix inconsistent VERBOSE macro in Makefile 2014-06-18 03:08:46 -07:00
Marat Dukhan
b2ffb4de8b Reformatted PNaCl GEMM kernels 2014-06-15 18:41:30 -04:00
Marat Dukhan
6de2d472d9 CGEMM and ZGEMM kernels for PNaCl 2014-06-15 08:44:31 -04:00
Marat Dukhan
f064711a5e SGEMM and DGEMM kernels for PNaCl 2014-06-15 06:27:37 -04:00
Field G. Van Zee
ad48dca229 Merge pull request #9 from tkelman/memalign_windows
Use _aligned_malloc instead of posix_memalign on Windows
2014-06-14 15:10:13 -05:00