amd/blis - blis - Public git mirror

mirror of https://github.com/amd/blis.git synced 2026-05-11 09:39:59 +00:00

Author	SHA1	Message	Date
Field G. Van Zee	0645f239fb	Remove UT-Austin from copyright headers' clause 3. Details: - Removed explicit reference to The University of Texas at Austin in the third clause of the license comment blocks of all relevant files and replaced it with a more all-encompassing "copyright holder(s)". - Removed duplicate words ("derived") from a few kernels' license comment blocks. - Homogenized license comment block in kernels/zen/3/bli_gemm_small.c with format of all other comment blocks.	2018-12-04 14:31:06 -06:00
Field G. Van Zee	9b688a2d69	Refer to color mm algorithm in Multithreading.md.	2018-12-04 13:30:25 -06:00
Field G. Van Zee	22384fd2b7	Minor updates to test_gemm.c in test/mixeddt.	2018-12-04 13:09:04 -06:00
Field G. Van Zee	2ba3b1780c	Removed symbols from libblis-symbols.def. Details: - Removed bli_gemm_md_front() and bli_gemm_md_zgemm() symbols from build/libblis-symbols.def, which will hopefully appease AppVeyor.	2018-12-03 19:40:39 -06:00
Field G. Van Zee	dcb38c4e59	Merge branch 'dev'	2018-12-03 18:06:19 -06:00
Field G. Van Zee	375eb30b0a	Added mixed-precision support to 1m method. Details: - Lifted the constraint that 1m only be used when all operands' storage datatypes (along with the computation datatype) are equal. Now, 1m may be used as long as all operands are stored in the complex domain. This change largely consisted of adding the ability to pack to 1e and 1r formats from one precision to another. It also required adding logic for handling complex values of alpha to bli_packm_blk_var1_md() (similar to the logic in bli_packm_blk_var1()). - Fixed a bug in several virtual microkernels (bli_gemm_md_c2r_ref.c, bli_gemm1m_ref.c, and bli_gemmtrsm1m_ref.c) that resulted in the wrong ukernel output preference field being read. Previously, the preference for the native complex ukernel was being read instead of the pref for the native real domain ukernel. This bug would not manifest if the preference for the native complex ukernel happened to be equal to that of the native real ukernel. - Added support for testing mixed-precision 1m execution via the gemm module of the testsuite. - Tweaked/simplified bli_gemm_front() and bli_gemm_md.c so that pack schemas are always read from the context, rather than trying to sometimes embed them directly to the A and B objects. (They are still embedded, but now uniformly only after reading the schemas from the context.) - Redefined cpp macro bli_l3_ind_recast_1m_params() as a static function and renamed to bli_gemm_ind_recast_1m_params() (since gemm is the only consumer). - Added 1m optimization logic (via bli_gemm_ind_recast_1m_params()) to bli_gemm_ker_var2_md(). - Added explicit handling for beta == 1 and beta == 0 in the reference gemm1m virtual microkernel in ref_kernels/ind/bli_gemm1m_ref.c. - Rewrote various level-0 macro defs, including axpyris, axpbyris, scal2ris, and xpbyris (and their conjugating counterparts) to explicitly support three operand types and updated invocations to xpbyris in bli_gemmtrsm1m_ref.c. - Query and use the storage datatype of the packed object instead of the storage datatype of the source object in bli_packm_blk_var1(). - Relocated and renamed frame/ind/misc/bli_l3_ind_opt.h to frame/3/gemm/ind/bli_gemm_ind_opt.h. - Various whitespace/comment updates.	2018-12-03 17:49:52 -06:00
Field G. Van Zee	dc18409551	CREDITS file update.	2018-11-28 11:58:40 -06:00
Field G. Van Zee	ee4d271296	Merge pull request #287 from SuperFluffy/fix_configuration_links Fix configuration links	2018-11-28 11:52:57 -06:00
Richard Janis Goldschmidt	3d7e8bc3b8	Fix configuration links	2018-11-28 15:56:37 +01:00
Field G. Van Zee	6a4885f8be	Merge branch 'master' into dev	2018-11-27 13:22:59 -06:00
Field G. Van Zee	e81c4b5666	Merge pull request #285 from isuruf/pthread Move LDFLAGS to the end	2018-11-21 17:00:49 -06:00
Isuru Fernando	cfbdb58de2	Move LDFLAGS to the end Otherwise the linker will drop flags like -lpthread	2018-11-21 14:23:39 -06:00
Field G. Van Zee	757043eae8	Merge pull request #283 from isuruf/patch-3 Fix MinGW and Cygwin build failures	2018-11-21 13:07:26 -06:00
Isuru Fernando	7af8fa0137	Fix blis dll path	2018-11-21 02:10:05 -06:00
Isuru Fernando	2acd8dcd23	Fix install path of dll.a	2018-11-21 02:02:18 -06:00
Isuru Fernando	b7b0ad22b1	Test mingw	2018-11-21 01:54:44 -06:00
Isuru Fernando	bafe521ed0	Fixes for mingw	2018-11-21 01:54:36 -06:00
Isuru Fernando	be831879bd	test gcc shared	2018-11-21 01:39:32 -06:00
Isuru Fernando	f6b924648c	Don't use .def for gcc	2018-11-21 01:39:19 -06:00
Isuru Fernando	ce6e4eae6d	test no threading	2018-11-21 01:34:56 -06:00
Isuru Fernando	c9169b4685	Add mingw64 path	2018-11-21 01:17:36 -06:00
Isuru Fernando	0f753090ea	Fix PATH	2018-11-21 01:14:52 -06:00
Isuru Fernando	d424470b1f	Check openmp and pthreads threading	2018-11-21 01:04:26 -06:00
Isuru Fernando	c73e7601e5	Revert "enable rdp" This reverts commit `368274bcbd`.	2018-11-21 00:50:33 -06:00
Isuru Fernando	6209b2e606	Remove conda	2018-11-21 00:50:22 -06:00
Isuru Fernando	0b1b344447	Fix make name	2018-11-21 00:42:39 -06:00
Isuru Fernando	7a9838983b	Use m2w64-make	2018-11-21 00:35:27 -06:00
Isuru Fernando	4c1dedd6a9	No activate on gcc	2018-11-21 00:29:08 -06:00
Isuru Fernando	368274bcbd	enable rdp	2018-11-21 00:29:08 -06:00
Isuru Fernando	707a5e7f9b	No conda for mingw build	2018-11-21 00:29:08 -06:00
Isuru Fernando	65b0565c0a	Check MinGW-w64	2018-11-21 00:29:08 -06:00
Isuru Fernando	9ddffba584	Fix MinGW build failure Fixes https://github.com/flame/blis/issues/278	2018-11-21 00:23:34 -06:00
Field G. Van Zee	1d8aae220b	Track internal scalar datatypes. Details: - Added a num_t datatype bitfield to the obj_t in the form of a new info2 field in the obj_t. This change was made primarily so that in the case of mixed-datatype gemm, the alpha scalar would not need to be cast to the storage datatype of B (or A) before then being cast to the computation datatype just before the macrokernel is called. This double-casting regime could result in loss of precision if the storage datatype of B (or A) is less than the computation precision. In practice, it was likely not going to be a big deal since most usage of alpha is for -1.0, 0.0, and 1.0 (or integer multiples thereof), which can all be represented exactly in single or double precision. - The type of objbits_t was changed to uint32_t, so the new format potentially takes up the same space as the previous obj_t definition, assuming no padding inserted by the compiler. Shrinking info to 32 bits and spilling over into a second field was chosen over using the high 32 bits of a single 64-bit objbits_t info field because many of the bitwise operations are performed with enums such as num_t, dom_t, and prec_t, which may take on the type of 32-bit ints. It's easier to just keep all of those bitwise operations in 32 bits than perform a million typecasts throughout bli_type_defs.h and bli_obj_macro_defs.h to ensure that the integers are treated as 64-bit for the purposes of the ANDs, ORs, and bitshifts. - Many comment updates. - Thanks to Devin Matthews and Devangi Parikh for their feedback and involvement during this commit cycle.	2018-11-20 18:42:07 -06:00
Field G. Van Zee	e769bf46b0	Tweak testsuite to issue FAIL for Nan, Inf (#279). Details: - Adjusted the definition for libblis_test_get_string_for_result() in testsuite/src/test_libblis.c so that the "FAIL" string is returned if the computed residual contains either NaN or Inf. Previously, a residual containing NaN would result in the selection of the "PASS" string. Thanks to Devin Matthews for reporting this issue (#279). - Expounded on comment for the macro definitions of bli_isnan() and bli_isinf() in bli_misc_macro_defs.h to make it more obvious why they must remain macros.	2018-11-20 16:16:53 -06:00
Field G. Van Zee	279deae18f	Added 4x5 matlab plotting scripts to test/3m4m. Details: - Added a new directory, test/3m4m/matlab, containing matlab scripts for plotting 4x5 panels of performance graphs (using the subplot() function) for gemm, hemm, herk, trmm, and trsm across all four floating-point datatypes. I expect to further refine these scripts as time goes on, but their current state constitutes a good start.	2018-11-16 11:34:19 -06:00
Field G. Van Zee	7b02c72665	CREDITS file update.	2018-11-14 13:49:55 -06:00
Field G. Van Zee	84dd298a27	Patch to fix msys2/Windows build failure (#277). Details: - Expanded cpp guard in frame/include/bli_x86_asm_macros.h to also check __MINGW32__ in addition to _WIN32, __clang__, and __MIC__. Thanks to Isuru Fernando for suggesting this fix, and also to Costas Yamin for originally reporting the issue (#277).	2018-11-14 13:47:45 -06:00
Field G. Van Zee	7b5ba7319b	Merge branch 'dev' of github.com:flame/blis into dev	2018-11-14 12:32:01 -06:00
Field G. Van Zee	52392932dc	Minor fixes to test/3m4m drivers. Details: - Cleanups to Makefile to allow all test drivers to be built for OpenBLAS and MKL in addition to BLIS. - Fixed copy-paste typos in test_hemm in calls to ssymm_() and dsymm_(). - Fixed incorrect types for betap in BLAS cpp macro branch of test_herk.c.	2018-11-13 22:23:38 +00:00
Field G. Van Zee	4f12e36a0d	Fixed number of columns in first output line. Details: - In previous commit, forgot to remove output column corresponding to the k dimension.	2018-11-13 14:23:12 -06:00
Field G. Van Zee	a2e0cdd7de	Added hemm test driver to test/3m4m. Details: - Added a new test_hemm.c test driver to test/3m4m, which was modeled after the driver by the similar name in test. Also updated Makefile so that blis-nat-[sm]t would trigger builds for the new driver.	2018-11-13 14:15:11 -06:00
Field G. Van Zee	0f9b53e84b	Fixed a bug in high-level mixeddt conditional. Details: - Fixed a bug in frame/3/bli_l3_oapi.c in the conditional that divides use of induced method (1m) execution from native execution. The former was intended to only be used in cases where all storage datatypes are complex and the datatype of C is equal to the computation datatype. (If mixed datatypes are detected, native execution would be used.) However, the code in bli_gemm() was erroneously checking the execution datatype instead of the computation datatype, which at that point is guaranteed to be equal to the storage datatype even if the computation datatype contains a different value. Thanks to Devangi Parikh for helping in isolating this bug.	2018-11-13 13:03:15 -06:00
Field G. Van Zee	ce719f816d	More edits to mixeddt matlab scripts. Details: - Renamed scripts in test/mixeddt/matlab: plot_case_all.m -> plot_dom_all.m plot_case_md.m -> plot_dom_case.m plot_all_md.m -> plot_dt_all.m - Added plot_dt_select.m in order to plot select graphs for the main body of the mixeddt paper, and added additional related legend handling in plot_gemm_perf.m. - Added test/mixeddt/matlab/output and a .gitkeep file within in order to force git to recognize the directory.	2018-11-10 14:48:43 -06:00
Field G. Van Zee	bf99e7c14b	Minor updates to test/mixeddt driver. Details: - Cleaned up test/mixeddt Makefile in preparation for gathering new data for mixeddt paper, including renaming implementations to "internal" and "ad-hoc" to match the terminology to be used in the paper. - Added new matlab scripts for generating 8 figures, each covering all mixed-precision cases for each mixed-domain case. - Updated the runme.sh script according to changes to Makefile. - Fixed a minor bug in test_gemm.c that may have given incorrect performance in complex, homogeneous storage datatype cases where the computation precision was equal to the storage precisions. (Examples: zzzd, cccs.)	2018-11-08 18:47:17 -06:00
Field G. Van Zee	4bbb454bf3	Testsuite docs update for mixed-datatype gemm. Details: - Updated docs/Testsuite.md to include mention of the new mixed-domain and mixed-precision settings, including descriptions. - Updated docs/MixedDatatypes.md to include a brief section on running the testsuite to exercise mixed-datatype functionality, which mostly amounts to a link to the Testsuite.md document. - Minor verbiage change to testsuite output to correct a misleading label associated with the value returned by the query function bli_info_get_simd_num_registers(). (The function does not return the number of SIMD registers present in the hardware, but rather a maximum assumed value for the purposes of allocating temporary microtile workspace on the function stack.)	2018-11-03 19:11:01 -05:00
Field G. Van Zee	16401ae922	Merge branch 'dev'	2018-11-03 19:09:43 -05:00
Field G. Van Zee	2d403a1535	Merge pull request #275 from RhysU/patch-1 Spelling in FAQ	2018-11-01 20:18:53 -05:00
Rhys Ulerich	4a12979f65	Spelling in FAQ	2018-11-01 20:20:59 -04:00
Field G. Van Zee	f19c33af4c	Disallow 64b BLAS integers + 32b BLIS integers. Details: - Print an error message from configure if the user attempts to explicitly configure BLIS for simultaneous use of 64-bit integers in the BLAS API with 32-bit integers in the BLIS API. - Added cpp macro conditional to bli_type_defs.h to mandate that BLIS integers be 64 bits if the BLAS integers are 64 bits. This and the above item take care of issue #274. Thanks to Devin Matthews and Jeff Hammond for suggesting these safeguards. - Slight reorganization and relabeling (for clarity) of BLAS/CBLAS sections and BLIS integer size line of the testsuite configuration output. - Very minor edits to docs/MixedDatatypes.md.	2018-10-26 17:07:15 -05:00
Field G. Van Zee	e90e7f309b	CHANGELOG update (0.5.0)	2018-10-25 14:09:43 -05:00
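
Commit e769bf46b0 in the log above notes that a residual containing NaN previously selected the "PASS" string. The following is a minimal Python sketch of how that kind of bug can arise, assuming the result is chosen by a threshold comparison of the form `resid > thresh`. The actual logic in testsuite/src/test_libblis.c is not shown in this log, so the function names, the threshold, and the structure here are hypothetical and are not the BLIS implementation.

```python
import math

def result_naive(resid, thresh):
    # Hypothetical pre-fix logic: every ordered comparison with NaN is
    # False, so a NaN residual skips the FAIL branch and reports PASS.
    if resid > thresh:
        return "FAIL"
    return "PASS"

def result_checked(resid, thresh):
    # Hypothetical post-fix logic: reject NaN and Inf explicitly
    # before doing the threshold comparison.
    if math.isnan(resid) or math.isinf(resid):
        return "FAIL"
    if resid > thresh:
        return "FAIL"
    return "PASS"
```

With a NaN residual, `result_naive` reports "PASS" while `result_checked` reports "FAIL". Both report "PASS" for a small finite residual.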

1537 Commits