1537 Commits

Author SHA1 Message Date
Field G. Van Zee
0645f239fb Remove UT-Austin from copyright headers' clause 3.
Details:
- Removed explicit reference to The University of Texas at Austin in the
  third clause of the license comment blocks of all relevant files and
  replaced it with a more all-encompassing "copyright holder(s)".
- Removed duplicate words ("derived") from a few kernels' license
  comment blocks.
- Homogenized license comment block in kernels/zen/3/bli_gemm_small.c
  with format of all other comment blocks.
2018-12-04 14:31:06 -06:00
Field G. Van Zee
9b688a2d69 Refer to color mm algorithm in Multithreading.md. 2018-12-04 13:30:25 -06:00
Field G. Van Zee
22384fd2b7 Minor updates to test_gemm.c in test/mixeddt. 2018-12-04 13:09:04 -06:00
Field G. Van Zee
2ba3b1780c Removed symbols from libblis-symbols.def.
Details:
- Removed bli_gemm_md_front() and bli_gemm_md_zgemm() symbols from
  build/libblis-symbols.def, which will hopefully appease AppVeyor.
2018-12-03 19:40:39 -06:00
Field G. Van Zee
dcb38c4e59 Merge branch 'dev' 2018-12-03 18:06:19 -06:00
Field G. Van Zee
375eb30b0a Added mixed-precision support to 1m method.
Details:
- Lifted the constraint that 1m only be used when all operands' storage
  datatypes (along with the computation datatype) are equal. Now, 1m may
  be used as long as all operands are stored in the complex domain. This
  change largely consisted of adding the ability to pack to 1e and 1r
  formats from one precision to another. It also required adding logic
  for handling complex values of alpha to bli_packm_blk_var1_md()
  (similar to the logic in bli_packm_blk_var1()).
- Fixed a bug in several virtual microkernels (bli_gemm_md_c2r_ref.c,
  bli_gemm1m_ref.c, and bli_gemmtrsm1m_ref.c) that resulted in the wrong
  ukernel output preference field being read. Previously, the preference
  for the native complex ukernel was being read instead of the pref for
  the native real domain ukernel. This bug would not manifest if the
  preference for the native complex ukernel happened to be equal to that
  of the native real ukernel.
- Added support for testing mixed-precision 1m execution via the gemm
  module of the testsuite.
- Tweaked/simplified bli_gemm_front() and bli_gemm_md.c so that pack
  schemas are always read from the context, rather than trying to
  sometimes embed them directly into the A and B objects. (They are still
  embedded, but now uniformly only after reading the schemas from the
  context.)
- Redefined cpp macro bli_l3_ind_recast_1m_params() as a static function
  and renamed to bli_gemm_ind_recast_1m_params() (since gemm is the only
  consumer).
- Added 1m optimization logic (via bli_gemm_ind_recast_1m_params()) to
  bli_gemm_ker_var2_md().
- Added explicit handling for beta == 1 and beta == 0 in the reference
  gemm1m virtual microkernel in ref_kernels/ind/bli_gemm1m_ref.c.
- Rewrote various level-0 macro defs, including axpyris, axpbyris,
  scal2ris, and xpbyris (and their conjugating counterparts) to
  explicitly support three operand types and updated invocations to
  xpbyris in bli_gemmtrsm1m_ref.c.
- Query and use the storage datatype of the packed object instead of the
  storage datatype of the source object in bli_packm_blk_var1().
- Relocated and renamed frame/ind/misc/bli_l3_ind_opt.h to
  frame/3/gemm/ind/bli_gemm_ind_opt.h.
- Various whitespace/comment updates.
2018-12-03 17:49:52 -06:00
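
Illustration for the "beta == 1 and beta == 0" item in the commit above. This is a minimal sketch of the general idea, not the code in ref_kernels/ind/bli_gemm1m_ref.c: the function name, the real-valued double type, and the flat array layout are all assumptions made for the example. It shows an update C := beta * C + AB in which the two special values of beta get their own branches.

```c
#include <stddef.h>

/* Hypothetical helper, not BLIS code: update n elements of C with an
   already-computed product ab, i.e. C := beta * C + ab. */
void example_update_tile(double *c, const double *ab, size_t n, double beta)
{
    if (beta == 0.0) {
        /* Overwrite C without reading it, so whatever C held before
           (including uninitialized or non-finite values) is ignored. */
        for (size_t i = 0; i < n; ++i) c[i] = ab[i];
    } else if (beta == 1.0) {
        /* Plain accumulation; no multiplication by beta is needed. */
        for (size_t i = 0; i < n; ++i) c[i] += ab[i];
    } else {
        /* General case: scale C by beta, then add the product. */
        for (size_t i = 0; i < n; ++i) c[i] = beta * c[i] + ab[i];
    }
}
```

Handling beta == 0 separately is what allows the old contents of C to be skipped entirely, and the beta == 1 branch avoids a redundant multiply.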
Field G. Van Zee
dc18409551 CREDITS file update. 2018-11-28 11:58:40 -06:00
Field G. Van Zee
ee4d271296 Merge pull request #287 from SuperFluffy/fix_configuration_links
Fix configuration links
2018-11-28 11:52:57 -06:00
Richard Janis Goldschmidt
3d7e8bc3b8 Fix configuration links 2018-11-28 15:56:37 +01:00
Field G. Van Zee
6a4885f8be Merge branch 'master' into dev 2018-11-27 13:22:59 -06:00
Field G. Van Zee
e81c4b5666 Merge pull request #285 from isuruf/pthread
Move LDFLAGS to the end
2018-11-21 17:00:49 -06:00
Isuru Fernando
cfbdb58de2 Move LDFLAGS to the end
Otherwise the linker will drop flags like -lpthread
2018-11-21 14:23:39 -06:00
Field G. Van Zee
757043eae8 Merge pull request #283 from isuruf/patch-3
Fix MinGW and Cygwin build failures
2018-11-21 13:07:26 -06:00
Isuru Fernando
7af8fa0137 Fix blis dll path 2018-11-21 02:10:05 -06:00
Isuru Fernando
2acd8dcd23 Fix install path of dll.a 2018-11-21 02:02:18 -06:00
Isuru Fernando
b7b0ad22b1 Test mingw 2018-11-21 01:54:44 -06:00
Isuru Fernando
bafe521ed0 Fixes for mingw 2018-11-21 01:54:36 -06:00
Isuru Fernando
be831879bd test gcc shared 2018-11-21 01:39:32 -06:00
Isuru Fernando
f6b924648c Don't use .def for gcc 2018-11-21 01:39:19 -06:00
Isuru Fernando
ce6e4eae6d test no threading 2018-11-21 01:34:56 -06:00
Isuru Fernando
c9169b4685 Add mingw64 path 2018-11-21 01:17:36 -06:00
Isuru Fernando
0f753090ea Fix PATH 2018-11-21 01:14:52 -06:00
Isuru Fernando
d424470b1f Check openmp and pthreads threading 2018-11-21 01:04:26 -06:00
Isuru Fernando
c73e7601e5 Revert "enable rdp"
This reverts commit 368274bcbd.
2018-11-21 00:50:33 -06:00
Isuru Fernando
6209b2e606 Remove conda 2018-11-21 00:50:22 -06:00
Isuru Fernando
0b1b344447 Fix make name 2018-11-21 00:42:39 -06:00
Isuru Fernando
7a9838983b Use m2w64-make 2018-11-21 00:35:27 -06:00
Isuru Fernando
4c1dedd6a9 No activate on gcc 2018-11-21 00:29:08 -06:00
Isuru Fernando
368274bcbd enable rdp 2018-11-21 00:29:08 -06:00
Isuru Fernando
707a5e7f9b No conda for mingw build 2018-11-21 00:29:08 -06:00
Isuru Fernando
65b0565c0a Check MinGW-w64 2018-11-21 00:29:08 -06:00
Isuru Fernando
9ddffba584 Fix MinGW build failure
Fixes https://github.com/flame/blis/issues/278
2018-11-21 00:23:34 -06:00
Field G. Van Zee
1d8aae220b Track internal scalar datatypes.
Details:
- Added a num_t datatype bitfield to the obj_t in the form of a new
  info2 field in the obj_t. This change was made primarily so that in
  the case of mixed-datatype gemm, the alpha scalar would not need to
  be cast to the storage datatype of B (or A) before then being cast to
  the computation datatype just before the macrokernel is called. This
  double-casting regime could result in loss of precision if the storage
  precision of B (or A) is lower than the computation precision. In
  practice, it was likely not going to be a big deal since most usage of
  alpha is for -1.0, 0.0, and 1.0 (or integer multiples thereof), which
  can all be represented exactly in single or double precision.
- The type of objbits_t was changed to uint32_t, so the new format
  potentially takes up the same space as the previous obj_t definition,
  assuming no padding inserted by the compiler. Shrinking info to 32
  bits and spilling over into a second field was chosen over using the
  high 32 bits of a single 64-bit objbits_t info field because many of
  the bitwise operations are performed with enums such as num_t, dom_t,
  and prec_t, which may take on the type of 32-bit ints. It's easier to
  just keep all of those bitwise operations in 32 bits than perform a
  million typecasts throughout bli_type_defs.h and bli_obj_macro_defs.h
  to ensure that the integers are treated as 64-bit for the purposes of
  the ANDs, ORs, and bitshifts.
- Many comment updates.
- Thanks to Devin Matthews and Devangi Parikh for their feedback and
  involvement during this commit cycle.
2018-11-20 18:42:07 -06:00
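
Illustration of the double-casting concern described in the commit above. This is a minimal sketch under stated assumptions, not BLIS code: the helper names are hypothetical, the storage precision is taken to be single, and the computation precision is taken to be double. It shows that routing alpha through the lower precision first can change its value, while values such as -1.0, 0.0, and 1.0 come through unchanged.

```c
/* Hypothetical helper, not BLIS code: cast alpha to single precision
   (standing in for the storage datatype of B) and then back to double
   (standing in for the computation datatype). The volatile qualifier
   forces the intermediate value to actually be rounded to float. */
double example_cast_via_single(double alpha)
{
    volatile float narrowed = (float)alpha;
    return (double)narrowed;
}

/* Returns 1 if alpha is unchanged by the round trip, 0 otherwise. */
int example_survives_single(double alpha)
{
    return example_cast_via_single(alpha) == alpha;
}
```

With these helpers, example_survives_single(1.0) holds, while example_survives_single(0.1) does not, because 0.1 has no exact single-precision representation. Tracking the scalar's own datatype lets the intermediate cast be skipped.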
Field G. Van Zee
e769bf46b0 Tweak testsuite to issue FAIL for NaN, Inf (#279).
Details:
- Adjusted the definition for libblis_test_get_string_for_result() in
  testsuite/src/test_libblis.c so that the "FAIL" string is returned if
  the computed residual contains either NaN or Inf. Previously, a
  residual containing NaN would result in the selection of the "PASS"
  string. Thanks to Devin Matthews for reporting this issue (#279).
- Expounded on comment for the macro definitions of bli_isnan() and
  bli_isinf() in bli_misc_macro_defs.h to make it more obvious why they
  must remain macros.
2018-11-20 16:16:53 -06:00
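
Illustration of how a NaN residual can slip past a threshold test, as described in the commit above. This is a minimal sketch, not the code in testsuite/src/test_libblis.c: both function names are hypothetical, and the assumption is that the earlier logic only compared the residual against a threshold. Every ordered comparison involving NaN evaluates to false, so a check of the form "fail if resid > thresh" never fires for NaN.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical naive check, not BLIS code: fail only when the residual
   exceeds the threshold. A NaN residual compares false and passes. */
bool example_naive_pass(double resid, double thresh)
{
    if (resid > thresh) return false;
    return true;
}

/* Hypothetical corrected check: reject NaN and Inf explicitly before
   applying the threshold comparison. */
bool example_checked_pass(double resid, double thresh)
{
    if (isnan(resid) || isinf(resid)) return false;
    return !(resid > thresh);
}
```

A testsuite would then map the boolean to a "PASS" or "FAIL" string; the explicit isnan()/isinf() test is the part that corresponds to the fix.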
Field G. Van Zee
279deae18f Added 4x5 matlab plotting scripts to test/3m4m.
Details:
- Added a new directory, test/3m4m/matlab, containing matlab scripts for
  plotting 4x5 panels of performance graphs (using the subplot()
  function) for gemm, hemm, herk, trmm, and trsm across all four
  floating-point datatypes. I expect to further refine these scripts as
  time goes on, but their current state constitutes a good start.
2018-11-16 11:34:19 -06:00
Field G. Van Zee
7b02c72665 CREDITS file update. 2018-11-14 13:49:55 -06:00
Field G. Van Zee
84dd298a27 Patch to fix msys2/Windows build failure (#277).
Details:
- Expanded cpp guard in frame/include/bli_x86_asm_macros.h to also check
  __MINGW32__ in addition to _WIN32, __clang__, and __MIC__. Thanks to
  Isuru Fernando for suggesting this fix, and also to Costas Yamin for
  originally reporting the issue (#277).
2018-11-14 13:47:45 -06:00
Field G. Van Zee
7b5ba7319b Merge branch 'dev' of github.com:flame/blis into dev 2018-11-14 12:32:01 -06:00
Field G. Van Zee
52392932dc Minor fixes to test/3m4m drivers.
Details:
- Cleanups to Makefile to allow all test drivers to be built for
  OpenBLAS and MKL in addition to BLIS.
- Fixed copy-paste typos in test_hemm in calls to ssymm_() and dsymm_().
- Fixed incorrect types for betap in BLAS cpp macro branch of
  test_herk.c.
2018-11-13 22:23:38 +00:00
Field G. Van Zee
4f12e36a0d Fixed number of columns in first output line.
Details:
- In previous commit, forgot to remove output column corresponding to
  the k dimension.
2018-11-13 14:23:12 -06:00
Field G. Van Zee
a2e0cdd7de Added hemm test driver to test/3m4m.
Details:
- Added a new test_hemm.c test driver to test/3m4m, which was modeled
  after the similarly named driver in test. Also updated Makefile
  so that blis-nat-[sm]t would trigger builds for the new driver.
2018-11-13 14:15:11 -06:00
Field G. Van Zee
0f9b53e84b Fixed a bug in high-level mixeddt conditional.
Details:
- Fixed a bug in frame/3/bli_l3_oapi.c in the conditional that divides
  use of induced method (1m) execution from native execution. The former
  was intended to only be used in cases where all storage datatypes are
  complex and the datatype of C is equal to the computation datatype.
  (If mixed datatypes are detected, native execution would be used.)
  However, the code in bli_gemm() was erroneously checking the execution
  datatype instead of the computation datatype, which at that point is
  guaranteed to be equal to the storage datatype even if the computation
  datatype contains a different value. Thanks to Devangi Parikh for
  helping in isolating this bug.
2018-11-13 13:03:15 -06:00
Field G. Van Zee
ce719f816d More edits to mixeddt matlab scripts.
Details:
- Renamed scripts in test/mixeddt/matlab:
    plot_case_all.m -> plot_dom_all.m
    plot_case_md.m  -> plot_dom_case.m
    plot_all_md.m   -> plot_dt_all.m
- Added plot_dt_select.m in order to plot select graphs for the main
  body of the mixeddt paper, and added additional related legend
  handling in plot_gemm_perf.m.
- Added test/mixeddt/matlab/output and a .gitkeep file within it in order
  to force git to recognize the directory.
2018-11-10 14:48:43 -06:00
Field G. Van Zee
bf99e7c14b Minor updates to test/mixeddt driver.
Details:
- Cleaned up test/mixeddt Makefile in preparation for gathering new
  data for mixeddt paper, including renaming implementations to
  "internal" and "ad-hoc" to match the terminology to be used in the
  paper.
- Added new matlab scripts for generating 8 figures, each covering all
  mixed-precision cases for each mixed-domain case.
- Updated the runme.sh script according to changes to Makefile.
- Fixed a minor bug in test_gemm.c that may have given incorrect
  performance in complex, homogeneous storage datatype cases where
  the computation precision was equal to the storage precisions.
  (Examples: zzzd, cccs.)
2018-11-08 18:47:17 -06:00
Field G. Van Zee
4bbb454bf3 Testsuite docs update for mixed-datatype gemm.
Details:
- Updated docs/Testsuite.md to include mention of the new mixed-domain
  and mixed-precision settings, including descriptions.
- Updated docs/MixedDatatypes.md to include a brief section on running
  the testsuite to exercise mixed-datatype functionality, which mostly
  amounts to a link to the Testsuite.md document.
- Minor verbiage change to testsuite output to correct a misleading
  label associated with the value returned by the query function
  bli_info_get_simd_num_registers(). (The function does not return the
  number of SIMD registers present in the hardware, but rather a maximum
  assumed value for the purposes of allocating temporary microtile
  workspace on the function stack.)
2018-11-03 19:11:01 -05:00
Field G. Van Zee
16401ae922 Merge branch 'dev' 2018-11-03 19:09:43 -05:00
Field G. Van Zee
2d403a1535 Merge pull request #275 from RhysU/patch-1
Spelling in FAQ
2018-11-01 20:18:53 -05:00
Rhys Ulerich
4a12979f65 Spelling in FAQ 2018-11-01 20:20:59 -04:00
Field G. Van Zee
f19c33af4c Disallow 64b BLAS integers + 32b BLIS integers.
Details:
- Print an error message from configure if the user attempts to
  explicitly configure BLIS for simultaneous use of 64-bit integers in
  the BLAS API with 32-bit integers in the BLIS API.
- Added cpp macro conditional to bli_type_defs.h to mandate that BLIS
  integers be 64 bits if the BLAS integers are 64 bits. This and the
  above item take care of issue #274. Thanks to Devin Matthews and
  Jeff Hammond for suggesting these safeguards.
- Slight reorganization and relabeling (for clarity) of BLAS/CBLAS
  sections and BLIS integer size line of the testsuite configuration
  output.
- Very minor edits to docs/MixedDatatypes.md.
2018-10-26 17:07:15 -05:00
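
Illustration of a preprocessor safeguard of the kind described in the commit above. This is a minimal sketch, not the conditional in bli_type_defs.h: the macro names, the type names, and the choice to stop with #error are all assumptions for the example. It rejects, at compile time, a configuration that pairs 64-bit BLAS integers with 32-bit native integers.

```c
#include <stdint.h>

/* Hypothetical configuration macros; a real build system would define
   these. The defaults below are chosen only for illustration. */
#ifndef EXAMPLE_INT_SIZE
#define EXAMPLE_INT_SIZE 64
#endif
#ifndef EXAMPLE_BLAS_INT_SIZE
#define EXAMPLE_BLAS_INT_SIZE 32
#endif

/* Refuse the unsupported combination before any types are defined. */
#if EXAMPLE_BLAS_INT_SIZE == 64 && EXAMPLE_INT_SIZE != 64
#error "64-bit BLAS integers require 64-bit native integers."
#endif

/* Select the native integer type from the configured size. */
#if EXAMPLE_INT_SIZE == 64
typedef int64_t example_int_t;
#else
typedef int32_t example_int_t;
#endif

/* Select the BLAS-layer integer type from the configured size. */
#if EXAMPLE_BLAS_INT_SIZE == 64
typedef int64_t example_blas_int_t;
#else
typedef int32_t example_blas_int_t;
#endif
```

With such a guard in place, the native integer type is never narrower than the BLAS integer type, so values passed in through the BLAS layer cannot be truncated on the way into the library.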
Field G. Van Zee
e90e7f309b CHANGELOG update (0.5.0) 2018-10-25 14:09:43 -05:00