amd/blis - blis - Public git mirror

mirror of https://github.com/amd/blis.git synced 2026-07-01 03:37:27 +00:00

Author	SHA1	Message	Date
jagar	89b143db07	GTestsuite: Search library in user-specified path. In Gtestsuite CMakeLists.txt, find_library() searches for the user-mentioned library in default system paths first, then in user-specified paths. To avoid this, CMake is updated to search for the user-mentioned library only in the user-specified path and to skip the default paths. AMD-Internal: [CPUPL-4284] Change-Id: Ia99cf59eb39deac4110d3d733f17548d432dde64	2023-12-11 15:40:53 +05:30
Edward Smyth	f44c649bb7	Code cleanup: AMD copyright notice Standardize format of AMD copyright notice. AMD-Internal: [CPUPL-3519] Change-Id: I98530e58138765e5cd5bc0c97500506801eb0bf0 (cherry picked from commit `ed5010d65b`)	2023-11-24 17:22:45 -05:00
Eleni Vlachopoulou	841c1067ed	GTestSuite: Clean-up on build system. - and a small bugfix so that it works again on Windows. Change-Id: I986b81d74d0f00c55eee497712aed5b268211d5f	2023-11-24 13:33:50 -05:00
mangala v	6ab76f52df	Gtestsuite: Updated sgemm testcase for sup. Updated sgemm testcase to handle multiple values of alpha and beta for different input sizes. Added sgemm testcases covering m, n, k dimensions up to at least size 20 in steps of 1. Change-Id: Id10ba3d7a05154b171511ef11ea76297494672cd	2023-11-24 12:42:31 -05:00
Edward Smyth	5a88182c1e	Code cleanup: No newline at end of file Some text files were missing a newline at the end of the file. One has been added. AMD-Internal: [CPUPL-3519] Change-Id: I4b00876b1230b036723d6b56755c6ca844a7ffce (cherry picked from commit `f471615c66`)	2023-11-23 10:09:53 -05:00
mangala v	f02769e0ca	BugFix: Re-designed SGEMM SUP kernel to use mask load/store instructions. A segfault was reported through the nightly jenkins job. The issue was observed when running in MT mode and was due to an extra broadcast being used, which would access out-of-bounds memory on the input buffer. Cleaned up the clobber list by removing unused registers. AMD_Internal: [CPUPL-4180] Change-Id: I1c8715b2850ef855328f2ef12f215987299bdb2b	2023-11-22 04:55:30 -05:00
Edward Smyth	9500cbee63	Code cleanup: spelling corrections Corrections for some spelling mistakes in comments. AMD-Internal: [CPUPL-3519] Change-Id: I9a82518cde6476bc77fc3861a4b9f8729c6380ba	2023-11-09 00:16:30 -05:00
Arnav Sharma	dd1cf23090	Gtestsuite Update for Pack and Compute Extension APIs - Pack and compute are now compared against GEMM operation of reference library when MKL is not used as a reference. - For the case where both A and B are unpacked, the reference GEMM is invoked with a unit-alpha scalar. - If MKL is used as reference, then these APIs are compared against pack and compute operations of MKL. - Updated description in ref_gemm_compute.cpp to reflect this behavior. AMD-Internal: [CPUPL-4084] Change-Id: Id0521c9cad8743a7ae471a7f3c547ceb67191f86	2023-11-03 09:45:42 -04:00
Arnav Sharma	44dfc7a515	Fix for gemm_compute BLAS Check - BLAS compute checks updated to properly check for rs_c and cs_c. - Updated BLAS compute checks to skip validity check if m==1 or n==1. For the same reason, added a check just before to validate rs_c and cs_c are greater than or equal to 1. - Added tiny size tests to gtestsuite as a sanity check. - Also updated the Invalid Input Tests to test for the updated checks. AMD-Internal: [CPUPL-4140] Change-Id: I984339ec7909778b58409ffcdbeed4ee33f28cfb	2023-11-03 09:41:16 -04:00
Vignesh Balasubramanian	84faccdd7d	Enabling the vectorized path for SNRM2_ - Enabled the vectorized AVX-2 code-path for SNRM2_. The framework queries the architecture ID and calls the vectorized kernel based on the architecture support. - In case of not having the architecture support, we use the default path based on the sumsqv method. AMD-Internal: [CPUPL-3277] Change-Id: Ic60c0782dec0b7eb09fac21818eb625e57b1d14f	2023-11-03 17:45:56 +05:30
Arnav Sharma	c1612f6838	Gtestsuite Framework and Unit Tests for Pack and Compute Extension APIs - Added framework for unit testing of BLAS and CBLAS interfaces for the Pack and Compute Extension APIs. - These test the integrated functionality of the trio of ?gemm_pack_get_size(), ?gemm_pack() and ?gemm_compute() APIs. - Note: Only MKL can be used as reference for now. AMD-Internal: [CPUPL-3560] Change-Id: I801654447a716da06c9ccf9db01d553817871571	2023-10-16 09:35:42 -04:00
Vignesh Balasubramanian	81161066e5	Multithreading the DNRM2 and DZNRM2 API - Updated the bli_dnormfv_unb_var1( ... ) and bli_znormfv_unb_var1( ... ) function to support multithreaded calls to the respective computational kernels, if and when the OpenMP support is enabled. - Added the logic to distribute the job among the threads such that only one thread has to deal with fringe case(if required). The remaining threads will execute only the AVX-2 code section of the computational kernel. - Added reduction logic post parallel region, to handle overflow and/or underflow conditions as per the mandate. The reduction for both the APIs involve calling the vectorized kernel of dnormfv operation. - Added changes to the kernel to have the scaling factors and thresholds prebroadcasted onto the registers, instead of broadcasting every time on a need basis. - Non-unit stride cases are packed to be redirected to the vectorized implementation. In case the packing fails, the input is handled by the fringe case loop in the kernel. - Added the SSE implementation in bli_dnorm2fv_unb_var1_avx2( ... ) and bli_dznorm2fv_unb_var1_avx2( ... ) kernels, to handle fringe cases of size = 2 ( and ) size = 1 or non-unit strides respectively. AMD-Internal: [CPUPL-3916][CPUPL-3633] Change-Id: Ib9131568d4c048b7e5f2b82526145622a5e8f93d	2023-10-16 07:26:27 -04:00
Harsh Dave	7a4f84fbac	Optimized dgemm for tiny input sizes. - This commit focused on enhancing the performance of dgemm for matrices with very small dimensions. - blis_dgemm_tiny function re-uses dgemm sup kernels, bypassing the conventional SUP framework code path. The SUP framework code path requires the creation and initialization of blis objects, accessing all the needed meta-information from objects, and querying contexts, which adds a performance penalty when computing on matrices with very small dimensions. - To avoid this performance penalty, blis_dgemm_tiny function implements lightweight support code so that it can re-use dgemm SUP kernels in such a way that it operates directly on input buffers. It avoids the framework overhead of creating and initializing blis objects, context initialization, and accessing other large framework data structures. - blis_dgemm_tiny function checks that a threshold condition is met before picking the kernel. For zen, zen2, zen3 architecture tiny kernel is invoked for any shape as long as m < 8 and k <= 1500 or m < 1000 and n <= 24 and k <= 1500. For zen4, the tiny kernel is invoked as long as m, n, k are all less than 1500. - blis_dgemm_tiny function supports single threaded computation as of now. AMD-Internal: [CPUPL-3574] Change-Id: Ife66d35b51add4fccbeebd29911e0c957e59a05f	2023-10-16 05:52:49 -04:00
Vignesh Balasubramanian	a6a67fea2d	ZAXPBYV optimizations for handling unit and non-unit strides - Updated the bli_zaxpbyv_zen_int( ... ) kernel's computational logic. The kernel performs two different sets of compute based on the value of alpha, for both unit and non-unit strides. There are no constraints on beta scaling of the 'y' vector. - Updated the logic to support 'x' conjugate in the computation. The kernel supports conjugate/no conjugate operation through the usage of _mm256_fmsubadd_pd( ... ) and _mm256_addsub_pd( ... ) intrinsics. - Updated the early return condition in the kernel to adhere to the standard compliance. - Updated the scalar computation with vector computation(using 128 bit registers), in case of dealing with a single element(fringe case) in unit-stride or vectors with non-unit strides. A single dcomplex element occupies 128 bits in memory, thereby providing scope for this optimization. - Added accuracy and extreme value testing with sufficient sizes and initializations, to test the required main and fringe cases of the computation. AMD-Internal: [CPUPL-3623] Change-Id: I7ae918856e7aba49424162290f3e3d592c244826	2023-10-12 06:31:08 -04:00
jagar	5d578684ea	GtestSuite: Update in source code to make it compatible on MSVC(windows) AMD-Internal: [CPUPL-2732] Change-Id: Ifd9372bf9b0f00c2bf24442ea8519bfcf4e5db5b	2023-10-09 04:43:29 -04:00
jagar	712a84d50f	Gtestsuite: Update in cmake to search reflib in given path AMD-Internal: [CPUPL-2732] Change-Id: Ide2b98a95f81f394c7c01cc3a3b5ae6fa0403a82	2023-10-05 05:39:27 -04:00
jagar	29711dd5a3	Gtestsuite: Updated testings_basics.* to print matrix/vector name AMD-Internal: [CPUPL-2732] Change-Id: I89b4ffc97ea852e66f42b82058af67c16144fbf6	2023-09-26 08:27:19 -04:00
Vignesh Balasubramanian	32104c400c	GTestSuite : Designing test cases for ZGEMM - Designed test cases for unit testing of ZGEMM compute kernel for handling inputs when k == 1. The design uses value-parameterized testing for checking accuracy, and verifying the mandate in case of exception values on the inputs/output. - The design uses type-parameterized testing for verifying BLAS standard for invalid input cases, and also for early return scenarios. - Added the function template set_ev_mat( ... ) as part of testinghelpers. This function is used as a helper for inducing exception values onto indices specified as arguments to the test_gemm( ... ) interface. - Abstracted the function definition of getValueString( ... ) from the NRM2 testing interface to testinghelpers(renamed as get_value_string( ... ) for naming consistency), in order to use it as a helper function across all APIs in case of exception value testing. AMD-Internal: [CPUPL-3823] Change-Id: I0fea21f9c8759bbbdc88ba0a016202753e28f2a7	2023-09-08 17:36:57 +05:30
Eleni Vlachopoulou	a6641dec0b	Updating GTestSuite CMake system to enable testing BLIS libraries on Windows. - Renaming ELEMENT_TYPE to BLIS_ELEMENT_TYPE, since the first is defined on a Windows header. - Updating refCBLAS object to have different implementation depending on the platform. - Removing dlfcn.h from all reference headers since it's linux specific and adding it conditionally on a higher level. - Changes on all CMakeLists.txt files to enable building on Windows. AMD-Internal: [CPUPL-2732] Change-Id: I6e35656a3779b35dc815a2409cf84c22dd27f3e7	2023-08-29 16:11:22 +05:30
Eleni Vlachopoulou	fa77d0415a	Updating nrm2 GTestSuite testing - Adding default template parameter for the type of the returned value from nrm2. - Bugfix on NaN/Inf comparator for scalars. - Tuning sizes of vector x to exercise the different paths for vectorized and scalar code. - Adding wrong parameters and extreme value testing. - Adding tests for overflow and underflow using max and min representable numbers for vectorized and scalar code. AMD-Internal: [CPUPL-2732] Change-Id: Ice8ee65095ecaa7b30ebd5f90ed2a890178533db	2023-07-28 05:03:00 -04:00
jagar	fb6f1380b2	Gtestsuite:Added util functions - Functions to print matrix and vector elements. - Functions to convert matrix to symmetric, hermitian triangular matrix and set diagonal elements in matrix. AMD-Internal: [CPUPL-2732] Change-Id: I1ffa5289329cbb8a9581bf545bdd157801cf5baa	2023-06-27 16:33:57 +05:30
jagar	003d1e9ae6	GTestSuite: Using ELEMENT_TYPE to specify generation of random numbers in tests. Since random numbers are specified from ELEMENT_TYPE and we never generate tests for both integer and floating point numbers at the same time, we update code as described below: - random vector/matrix generators are updated to use ELEMENT_TYPE as a default parameter. - ::testing::Values(ELEMENT_TYPE) is removed from all test generators. AMD-Internal: [CPUPL-2732] Change-Id: Ibc6b05044502f541c9e8a7687931b1ca2903fb0c	2023-06-21 11:30:15 -04:00
Edward Smyth	7e50ba669b	Code cleanup: No newline at end of file Some text files were missing a newline at the end of the file. One has been added. Also correct file format of windows/tests/inputs.yaml, which was missed in commit `0f0277e104` AMD-Internal: [CPUPL-2870] Change-Id: Icb83a4a27033dc0ff325cb84a1cf399e953ec549	2023-04-21 10:02:48 -04:00
Edward Smyth	0f0277e104	Code cleanup: dos2unix file conversion Source and other files in some directories were a mixture of Unix and DOS file formats. Convert all relevant files to Unix format for consistency. Some Windows-specific files remain in DOS format. AMD-Internal: [CPUPL-2870] Change-Id: Ic9a0fddb2dba6dc8bcf0ad9b3cc93774a46caeeb	2023-04-21 08:41:16 -04:00
Eleni Vlachopoulou	ea484f38e6	BLIS GTestSuite fixes for ILP64. - Adding doc regarding option setting for INT64 in README. - Bugfix on template instantiation on helper function. Updated to use gtint_t instead of int. AMD-Internal: [CPUPL-2732] Change-Id: Ia52407a1ef3fdd06e905c2e3d4aa5befb80e82d6	2023-04-19 03:41:55 -04:00
jagar	a77402968c	GTestsuite: Updates in CmakeLists.txt to check libraries Updated the CmakeLists.txt to check whether the specified libraries are present or abort cmake building AMD-Internal: [CPUPL-2732] Change-Id: I90115217c228430095aa53a82dc26d16935b320f	2023-04-14 08:56:41 -04:00
jagar	f164c7fe70	Added GTestSuite helper functions - Functions to convert to cblas enums from char. - Functions to print matrix and vector elements. - Functions to set matrix and vector elements with the given value. AMD-Internal: [CPUPL-2732] Change-Id: I1046b9578c8456e89eddba4a4e8577016b9361ca	2023-04-12 09:03:08 -04:00
Eleni Vlachopoulou	e8392fedb8	GTestSuite fix on trsm tests. - Fixing thresholds to be more appropriate. - Updating the way random entries of A and B are generated so that A is diagonally dominant and the algorithm doesn't diverge. AMD-Internal: [CPUPL-2732] Change-Id: I6d5691d744ecc623f66c45e94461bd88625d7179	2023-04-11 20:01:21 +05:30
jagar	1d5c1e5803	Code coverage support in gtestsuite framework - Tools used for code coverage are: Gcov and Lcov. - We need to use macros specified by gcov during compilation of blis and gtestsuite. - Lcov will generate coverage reports in html format. AMD-Internal: [CPUPL-2732] Change-Id: I17b30b4a322b8771f2d6a4ba28986cf0ccf3fba6	2023-04-10 07:48:15 -04:00
Eleni Vlachopoulou	fa024b82ad	Adding helper functionality for wrong input testing in GTestSuite. - Added a header with correct default values to be used in tests. - Updated README to include information on how to test for wrong parameters and some explanation on how lda increments work. AMD-Internal: [CPUPL-2732] Change-Id: I4f540d46013ffe91b4acb30da2b437251c09d3bc	2023-04-06 13:32:29 -04:00
Eleni Vlachopoulou	bf3f5cafa8	BLIS GTestSuite Updates: - Fix in README.md. - Updating abs overload for scomplex and dcomplex to avoid overflow by using std::abs. - Updating comparators to take into account NaNs and Infs when measuring error. AMD-Internal: [CPUPL-2732] Change-Id: I8c12bacd9d63b2e914d0a79f337f7525dc16b733	2023-04-05 06:11:34 -05:00
jagar	f9adfa8ee4	Updated CmakeLists.txt to remove cmake generated files cmake generated files and executables are cleaned within build directory by "make distclean" command. Change-Id: I4fd5193e92958122ff10ecc634b42096f3b3716e	2023-04-05 06:11:16 -04:00
Eleni Vlachopoulou	58f85bb8f1	Adding copyright notice in gtestsuite files. Change-Id: I5097831eb7a46c56a4a2a32da4d3ee69c8b36cb5	2023-03-29 09:01:48 -04:00
Eleni Vlachopoulou	04e091fdca	BLIS GTestSuite: Link OpenMP if we test serial BLIS, but MKL is used as a reference. Change-Id: Iacafa5ecf74622fa5e1180a81305cf7a23d79055	2023-03-28 04:43:58 -04:00
Eleni Vlachopoulou	155a64e734	Introducing upgraded BLIS GTestSuite. Key features: - able to test both static and dynamic libraries - able to test BLAS, CBLAS and BLIS-typed interface - can use any CBLAS library for reference results - can build and/or run tests depending on the BLAS level or a specific API AMD-Internal: [CPUPL-2732] Change-Id: Ibe0d7938e06081526bbc54d3182ac7d17affdaf6	2023-03-21 03:17:51 +05:30
Eleni Vlachopoulou	88e549e7bd	Using CMake as the build system for both Linux and Windows: - GoogleTest headers removed. GoogleTest gets fetched at configuration time. - BLIS headers removed. A BLIS installation path is required at configuration time. - Windows has been temporarily disabled. AMD-Internal: [CPUPL-2732] Change-Id: I9e55c8e43b2733f96cd8b6e5449d79623decad5c	2022-12-13 19:09:23 +05:30
jagar	cff29bde76	Added gtestsuite folder into blis repo Moved blis gtestsuite from lib-confscript to blis repo (branch: amd-main) Change-Id: If7ad391eef66bac6d26cf5223e6043d52b746072	2022-12-07 23:57:13 -05:00
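
The threshold rule quoted in commit 7a4f84fbac can be sketched as a small predicate. This is only an illustration of the numbers in the commit message, not the code in the BLIS sources: the `Arch` enum and the name `use_tiny_dgemm` are hypothetical, and the grouping of the "and"/"or" conditions for zen, zen2 and zen3 is an assumption, since the message does not parenthesize them.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical architecture tag; BLIS uses its own architecture IDs.
enum class Arch { Zen, Zen2, Zen3, Zen4 };

// Returns true when the thresholds stated in the commit message would
// select the tiny dgemm path. Assumed grouping for zen/zen2/zen3:
//   (m < 8 && k <= 1500) || (m < 1000 && n <= 24 && k <= 1500)
bool use_tiny_dgemm(Arch arch, std::int64_t m, std::int64_t n, std::int64_t k)
{
    // Matrix dimensions are expected to be non-negative.
    assert(m >= 0 && n >= 0 && k >= 0);

    if (arch == Arch::Zen4) {
        // zen4: all three dimensions must be below 1500.
        return m < 1500 && n < 1500 && k < 1500;
    }

    const bool very_small_m = (m < 8) && (k <= 1500);
    const bool small_n = (m < 1000) && (n <= 24) && (k <= 1500);
    return very_small_m || small_n;
}
```

Under this reading, a call such as `use_tiny_dgemm(Arch::Zen3, 4, 2000, 1500)` selects the tiny path because m is below 8, regardless of n.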

37 Commits
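
Commit e8392fedb8 mentions generating the trsm input A so that it is diagonally dominant. One common way to do that is sketched below, as an assumption about the general technique rather than the gtestsuite's actual generator: fill the matrix with random values, then overwrite each diagonal entry with a value larger than the sum of the absolute off-diagonal entries in its row. The function names, the column-major layout and the fixed seed are illustrative choices.

```cpp
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Builds an n x n column-major matrix with random entries in [-1, 1) and
// then sets each diagonal entry to (sum of |off-diagonal entries in that row|) + 1,
// which makes the matrix strictly row diagonally dominant.
std::vector<double> make_diag_dominant(std::size_t n, unsigned seed = 42)
{
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> dist(-1.0, 1.0);

    std::vector<double> a(n * n);
    for (double& v : a) v = dist(gen);

    for (std::size_t i = 0; i < n; ++i) {
        double off_diag_sum = 0.0;
        for (std::size_t j = 0; j < n; ++j) {
            if (j != i) off_diag_sum += std::abs(a[i + j * n]);
        }
        a[i + i * n] = off_diag_sum + 1.0;
    }
    return a;
}

// Checks strict row diagonal dominance of an n x n column-major matrix.
bool is_diag_dominant(const std::vector<double>& a, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i) {
        double off_diag_sum = 0.0;
        for (std::size_t j = 0; j < n; ++j) {
            if (j != i) off_diag_sum += std::abs(a[i + j * n]);
        }
        if (std::abs(a[i + i * n]) <= off_diag_sum) return false;
    }
    return true;
}
```

A diagonally dominant A keeps the triangular solve well conditioned, which is why the random test inputs no longer make the computed solution blow up relative to the reference.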