amd/blis - blis - Public git mirror


mirror of https://github.com/amd/blis.git synced 2026-04-19 15:18:52 +00:00

Author	SHA1	Message	Date
Edward Smyth	0d5c09d042	GTestSuite: Fix builds testing against MKL Correction to CMakeLists.txt to fix problem building executables when testing against MKL. AMD-Internal: [CPUPL-5928] Change-Id: Ie427fff0afb48be6ce6d940b1db2c9d1c7a40e5b	2024-11-14 14:57:44 +00:00
Edward Smyth	e9919761f7	GTestSuite: ILP64 build fix Cast literal 0 to match integer size in std::max tests. AMD-Internal: [CPUPL-4500] Change-Id: I330aafd8669884c5e1900b95742b5d1e4ce8ddfa	2024-11-14 14:57:44 +00:00
Eleni Vlachopoulou	d6a411d6b6	GTestSuite: Reorganizing some tests - Breaking tests to smaller executables. - Removing some redundant tests. AMD-Internal: [CPUPL-4500] Change-Id: I6288c3fcf5194ccb5de3485ca1ad95a20414208c	2024-10-02 11:48:18 -04:00
Eleni Vlachopoulou	72536e56ba	GTestSuite: Reducing gemm tests. Since there is thorough kernel testing, we reduce the number of "Black Box" test cases so that CI is faster. AMD-Internal: [CPUPL-4500] Change-Id: Ie57eeccff8103c0051eb1904162d6447da0ef102	2024-09-19 12:17:20 -04:00
Edward Smyth	6330ac6a52	GTestSuite: Misc changes - Correct matsize and NumericalComparison functions for tests with first matrix dimension <= 0. - BLAS1: - Fix for BLAS vs CBLAS differences in amaxv IIT_ERS tests. - Threshold adjustments in ddotxf and zaxpy. - Break axpyv and scalv into separate executables for each data type. - BLAS2: - Threshold adjustments in symv and hemv. - Break ger into separate executables for each data type. - UKR: - Break gemm and trsm ukr test into separate executables for each data type. - Threshold adjustments in daxpyf - Disable {z,c}trsm ukr tests when BLIS_INT_ELEMENT_TYPE is used, as matrix generator is not currently suitable for this. AMD-Internal: [CPUPL-4500] Change-Id: I1d9e7acc11025f1478b8b511c14def5517ef0ae6	2024-09-19 10:17:36 -04:00
Eleni Vlachopoulou	c7a5d04d4d	GTestSuite: Disabling failing tests. Those can be run if --gtest_also_run_disabled_tests is used. Bugs will be addressed and resolved in the future. AMD-Internal: [CPUPL-4500] Change-Id: I7a5443606ea8ef20f18ff8beec14bece5f6ee661	2024-09-18 13:12:35 +01:00
Edward Smyth	54f8fb951e	GTestSuite: BLAS2 test case selection Various changes to BLAS2 test cases: - GEMV: Reduce number of tests to make runtime more reasonable. - TRSV: - Standardize tests across different data types, including adding memory testing for all variants. - Improve scaling when making matrix A diagonally dominant and avoid singular matrix when BLIS_INT_ELEMENT_TYPE is used. - TRMV: Copy TRSV generic tests. - Expand set of tests for HEMV, HER, HER2, SYMV, SYR, SYR2 and make lda contribution to test names consistent with other routines. - Various adjustments to thresholds added. Update gtestsuite documentation to describe using GTEST_FILTER environment variable to select tests to run or exclude. This works particularly well when using ctest, as we do not enumerate all the tests at this level and so need to pass the selection down to the individual executables. AMD-Internal: [CPUPL-4500] Change-Id: Ifcb6410455b7f91e58b555f94b9fd7920d7ad9d9	2024-09-17 09:35:29 -04:00
Edward Smyth	61c6f1ad78	GTestSuite: Fix alpha and beta input argument tests Check if alpha and beta are null before testing values. This avoids possible seg faults if alpha or beta have not been defined in IIT tests. AMD-Internal: [CPUPL-4500] Change-Id: Ibbf2d6a8fb38d9a95033f3fec3d06c3441e98689	2024-09-17 09:00:09 -04:00
Edward Smyth	8d4881c4fd	GTestSuite: add option to test blis_impl layer Add BLAS_TEST_IMPL option for TEST_INTERFACE to test the wrapper layer underneath BLAS and CBLAS interfaces. This is particularly useful if building a BLIS library with these interfaces disabled, e.g. ./configure --disable-blas amdzen or cmake . -DENABLE_BLAS=OFF -DBLIS_CONFIG_FAMILY=amdzen The ?_blis_impl wrappers should have the same arguments as the BLAS interfaces, thus we define TEST_BLAS_LIKE as an additional definition for convenience when selecting tests and options in the C++ files. AMD-Internal: [CPUPL-5650] Change-Id: I0275a387563f3efc2b40029950c8569956f2df7b	2024-09-16 09:53:56 -04:00
Edward Smyth	a07e041b1f	SCALV alpha=zero BLAS compliance SCALV is used directly by BLAS, CBLAS and BLIS scal{v} APIs but also within many other APIs to handle special cases. In general it is preferred to use SETV when alpha=0, but BLAS and CBLAS continue to multiply all vector elements by alpha. This has different behaviour for propagating NaNs or Infs. Changes in this commit: - Standardize early returns from SCALV reference and optimized kernels. - User supplied N<0 is handled at the top level API layer. Use negative values of N in kernel calls to signify that SETV should _not_ be used when alpha=0. This should only be required in SCALV. - Include serial threshold in zdscal (as in dscal) to reduce overhead for small problem sizes. - Code tidying to make different variants more consistent. - More standardization of tests in SCALV gtestsuite programs. - Remove scalv_extreme_cases.cpp as it is now redundant. AMD-Internal: [CPUPL-4415] Change-Id: I42e98875ceaea224cc98d0cdfe0133c9abc3edae	2024-09-16 07:10:28 -04:00
Edward Smyth	3a6d367f9c	GTestSuite: Fix TRSM ukr tests in non-zen builds Add guards around bli_trsm_small kernel tests to only call them if BLIS_ENABLE_SMALL_MATRIX_TRSM is defined. This fixes missing symbol errors in tests of non-zen builds, e.g. generic or skx. AMD-Internal: [CPUPL-4500] Change-Id: I7a822a41b5f686b5e38b0c63dd1871963e990407	2024-08-21 07:45:06 -04:00
Chandrashekara K R	545f9ee44e	CMake: Updated the minimum supported cmake version to 3.22.0 to maintain uniformity across all AOCL libraries. AMD Internal : [CPUPL-5616] Change-Id: Ic53532ff9883b1bba39e859ea2523c20c1ac383b	2024-08-21 12:09:24 +05:30
Vignesh Balasubramanian	93631410a3	Bugfix : Fixed memory accesses in AVX512 SGEMMSUP RD kernels - Bug: Among the list of AVX512 SGEMMSUP RD kernels, the ones handling m_fringe = 3 had incorrect usage of ZMM on a vector-load instruction that strictly needed YMMs. - Further updated the existing micro-kernel test cases to simulate these issues and validate the fix. AMD-Internal: [CPUPL-5353] Change-Id: Id86e60ce36bb9f8433a1a203cfe0b8c6347df2c1	2024-08-19 17:18:31 +05:30
Arnav Sharma	a67c8f05fb	Gtestsuite: Fix for GEMM_COMPUTE IIT_ERS Test - The IIT_ERS test for GEMM_COMPUTE where alpha = 0 and beta = 0 was failing since neither of the matrices was being packed and thus, missing the scaling by alpha resulting in a non-zero output for C matrix (C := A * B). - Enabled packing of A matrix for the ZeroAlpha_ZeroBeta IIT_ERS test which handles the alpha scaling. AMD-Internal: [CPUPL-5598] Change-Id: Id9179ec6150d1bc5a0274edce727ce6cc4172213	2024-08-13 17:24:27 +05:30
Edward Smyth	7fff7b4026	Code cleanup: Miscellaneous fixes - Delete unused cmake files. - Add guards around call to bli_cpuid_is_avx2fma3_supported in frame/3/bli_l3_sup.c, currently assumes that non-x86 platforms will not use bli_gemmtsup. - Correct variable in frame/base/bli_arch.c on non-x86 builds. - Add guards around omp pragma to avoid possible gcc compiler warning in kernels/zen/2/bli_gemv_zen_int_4.c. - Add missing registers in clobber list in kernels/zen4/1/bli_dotv_zen_int_avx512.c. - Add gtestsuite ERS_IIT tests for TRMV, copied from TRSV. - Correct calls to cblas_{c,z}swap in gtestsuite. - Correct test name in ddotxf gtestsuite program. AMD-Internal: [CPUPL-4415] Change-Id: I69ad56390017676cc609b4d3aba3244a2df6a6b5	2024-08-06 06:56:01 -04:00
Edward Smyth	89f52a6df5	Code cleanup: spelling corrections Corrections for spelling and other mistakes in code comments and doc files. AMD-Internal: [CPUPL-4500] Change-Id: I33e28932b0e26bbed850c55602dee12fd002da7f	2024-08-05 16:18:51 -04:00
Edward Smyth	b964308e50	GTestSuite: option to check input arguments Add tests to check input arguments have not been modified by BLIS routine. These tests add a large runtime overhead, so they are disabled by default. To enable them, configure gtestsuite with: cmake -DTEST_INPUT_ARGS=ON ... and run desired tests as normal. Also: - Correct testinghelpers::chktrans to handle upper case values of argument trns. - Change testinghelpers::matsize to return size 0 if m, n or leading dimension are 0, or if leading dimension is too small. AMD-Internal: [CPUPL-4379] Change-Id: I9494af800f9383195272ce99f622104a38fd0ed8	2024-08-05 09:58:17 -04:00
Edward Smyth	6393cb9d7c	GTestSuite: misc corrections 3 - Set threshold to epsilon for early return cases where we are just scaling a matrix. - Add this threshold to IIT_ERS files for appropriate tests. - In IIT_ERS for gemm_compute, remove tests on null A and B when we are expecting to set or scale C. More thought is required in gemm_compute tests to handle these cases and look at cases where A or B has been packed. AMD-Internal: [CPUPL-4500] Change-Id: Ia649cc340ca1df6511388f9c43a31e53296cb2bf	2024-08-05 09:31:18 -04:00
Arnav Sharma	0a5c057475	DGEMV Optimizations for Tiny Sizes - Added reference kernel for dgemv that handles computation for tiny sizes (m < 8 && n < 8). - The reference kernel, bli_dgemv_zen_ref( ... ), supports both row/column storage schemes as well as transpose and no transpose cases. - Added additional unit-tests for functional verification. AMD-Internal: [CPUPL-5098] Change-Id: I66fdf0a40e90bdb3fed40152c45ab28a17a87ada	2024-08-05 12:19:42 +05:30
Ruchika Ashtankar	bdb94fb218	GTestSuite: Added tests for DGEMM SUP kernel - Added dgemmGenericSUP test for the new 24x8 DGEMM SUP kernel for zen5. AMD-Internal: [CPUPL-4404] Change-Id: I150ca310655a495bdcf5ea9d5a16746483a17b68	2024-08-02 11:37:29 -04:00
Edward Smyth	75f21182bd	GTestSuite: IIT and ERS test improvements Various improvements: - Where appropriate, test both: - with nullptr for suitable arguments that should never be touched. - with all arguments correct except the one we want to test, to check we are not returning early because another argument is a nullptr. - Test incorrect values for order argument in CBLAS calls. - Test early exits with limited data changes, e.g. set C to 0 or scale C in GEMM when alpha = 0. - Bugfix in gemmt test when alpha is 0 and beta is 1. - Use reference library gemmt for comparison when library is not netlib BLAS. AMD-Internal: [CPUPL-4500] Change-Id: Ibde7eaba5a484a87674044ca44855c6f6ee4ff4b	2024-07-31 15:36:01 -04:00
Edward Smyth	b90e12dfa4	GTestSuite: copyright notice Standardize format of copyright notice. AMD-Internal: [CPUPL-4500] Change-Id: I6bde64c15ff639492dd0de95423c660112a37e2c	2024-07-26 15:34:41 -04:00
Edward Smyth	ea286cf6f6	GTestSuite: whitespace at end of lines Unnecessary whitespace (spaces, tabs) at the end of lines has been removed. AMD-Internal: [CPUPL-4500] Change-Id: Ice5f5504232cb22460c14ac47e6a3a43309cba22	2024-07-26 12:12:56 -04:00
Edward Smyth	4183efa722	GTestSuite: No newline at end of file Add missing newline at the end of these files. AMD-Internal: [CPUPL-4500] Change-Id: I835cc73de0008b66ae3cf77fbb3daa1c8fcaaa7f	2024-07-26 11:42:57 -04:00
Edward Smyth	46fe3f3dcb	GTestSuite: dos2unix file conversion Source and other files in some directories were a mixture of Unix and DOS file formats. Convert all relevant files to Unix format for consistency. AMD-Internal: [CPUPL-4500] Change-Id: Ia3e479643b0bed4ae8a9107bde6e2cddf32d5bd8	2024-07-26 11:09:06 -04:00
Arnav Sharma	9583ee2e23	DGEMV Optimizations for NO_TRANSPOSE cases - Enabled AVX512 DAXPYF kernels for DGEMV var2 for NO_TRANSPOSE cases. - Added DAXPYF kernels with fuse factors of 2, 4, 6 and 16. - Added a wrapper for DAXPYF kernels for redirection to kernels with a smaller fuse factor than 32. - Also added UKR tests for the new fused kernels. AMD-Internal: [CPUPL-5098] Change-Id: I0b102b67c6c068873393bac0494284f379c253f2	2024-07-24 15:59:36 +05:30
Vignesh Balasubramanian	b48e864e82	AVX512 optimizations for DAXPBYV API - Implemented AVX512 computational kernel for DAXPBYV with optimal unrolling. Further implemented the other missing kernels that would be required to decompose the computation in special cases, namely the AVX512 DADDV and DSCAL2V kernels. - Updated the zen4 and zen5 contexts to ensure any query to acquire the kernel pointer for DAXPBYV returns the address of the new kernel. - Added micro-kernel unit tests to GTestsuite to check for functionality and out-of-bounds reads and writes. AMD-Internal: [CPUPL-5406][CPUPL-5421] Change-Id: I127ab21174ddd9e6de2c30a320e62a8b042cbde6	2024-07-22 11:32:19 +05:30
Arnav Sharma	4aa66f108e	Added CSCALV AVX512 Kernel - Added CSCALV kernel utilizing the AVX512 ISA. - Added function pointers for the same to zen4 and zen5 contexts. - Updated the BLAS interface to invoke respective CSCALV kernels based on the architecture. - Added UKR tests for bli_cscalv_zen_int_avx512( ... ). AMD-Internal: [CPUPL-5299] Change-Id: I189d87a1ec1a6e30c16e05582dcb57a8510a27f3	2024-07-15 07:17:43 -04:00
vignbala	236d092656	AVX512 optimizations for ZGEMM to handle k = 1 cases - Implemented bli_zgemm_16x4_avx512_k1_nn( ... ) AVX512 kernel to be used as part of BLAS/CBLAS calls to ZGEMM. The kernel is built for handling the GEMM computation with inputs having k = 1, with the transpose values being N(for column-major) and T(for row-major). - Updated the zgemm_blis_impl( ... ) layer to query the architecture ID and invoke the AVX2 or AVX512 kernel accordingly. - Added API level tests for accuracy and code-coverage, as well as micro-kernel tests for verifying functionality and out-of-bounds memory accesses. AMD-Internal: [CPUPL-5249] Change-Id: Id1f8bebff3e0da83c7febe86299564fd658b2e84	2024-07-09 07:07:24 -04:00
Vignesh Balasubramanian	02da190560	AVX512 optimizations for DNRM2 - Implemented bli_dnorm2fv_unb_var1_avx512( ... ) AVX512 computational kernel for DNRM2 API. - Updated the header to include this kernel signature, as well as the framework layer to use this function in case of ZEN4 and ZEN5 configurations. - Updated the tipping points for ideal thread setting in DNRM2 for ZEN5 micro-architecture. These thresholds are specific to the library's linkage to LLVM's OpenMP or GNU's OpenMP. - Further abstracted the AOCL-DYNAMIC logic to separate functions for ?NRM2 APIs that currently support it (namely, DNRM2 and ZNRM2). - Further updated the ?NRM2 framework to accommodate the necessary changes to invoke the newer AOCL-DYNAMIC functions and the AVX512 kernel, when needed. - Added micro-kernel and memory tests for this kernel in GTestsuite, to validate accuracy and out-of-bounds read and write. AMD-Internal: [CPUPL-5265] Change-Id: I4fc0d0f1e6906bf27d46562ca387c338cc4d2049	2024-06-24 08:50:36 -04:00
Vignesh Balasubramanian	6165001658	Bugfix and optimizations for ?AXPBYV API - Updated the existing code-path for ?AXPBYV to reroute the inputs to the appropriate L1 kernel, based on the alpha and beta value. This is done in order to utilize sensible optimizations with regards to the compute and memory operations. - Updated the typed API interface for ?AXPBYV to include an early exit condition (when n is 0, or when alpha is 0 and beta is 1). Further updated this layer to query the right kernel from context, based on the input values of alpha and beta. - Added the necessary L1 vector kernels (i.e., ?SETV, ?ADDV, ?SCALV, ?SCAL2V and ?COPYV) to be used as part of special case handling in ?AXPBYV. - Moved the early return with negative increments from ?SCAL2V kernels to its typed API interface. - Updated the zen, zen2 and zen3 context to include function pointers for all these vector kernels. - Updated the existing ?AXPBYV vector kernels to handle only the required computation. Additional cleanup was done to these kernels. - Added accuracy and memory tests for AVX2 kernels of ?SETV, ?COPYV, ?ADDV, ?SCALV, ?SCAL2V, ?AXPYV and ?AXPBYV APIs - Updated the existing thresholds in ?AXPBYV tests for complex types. This is due to the fact that every complex multiplication involves two mul ops and one add op. Further added test-cases for API level accuracy check, that includes special cases of alpha and beta. - Decomposed the reference call to ?AXPBYV with several other L1 BLAS APIs (in case of the reference not supporting its own ?AXPBYV API). The decomposition is done to match the exact operations that are done in BLIS based on alpha and/or beta values. This ensures that we test for our own compliance. AMD-Internal: [CPUPL-4861] Change-Id: Ia6d48f12f059f52b31c0bef6c75f47fd364952c6	2024-06-20 16:22:07 +05:30
Mangala V	90fe795c46	Gtestsuite: Enabled memory test for ZGEMM for k=0 AMD_Internal: [CPUPL-4657] Change-Id: Ic5f4d24184f05e0f57634845b4fb3312b3a416f6	2024-06-20 02:51:47 -04:00
Arnav Sharma	91bdf9a3eb	Gtestsuite: Bugfix for DOTXF, Changes to AXPYF - Fixed bug in ddotxf generic tests where the parameters lda_inc and inca were being read incorrectly. - Fixed bug in dotxf test wherein the y vector was being generated with length m instead of b. - Corrected function signatures to use type gtint_t instead of gint_t. - Updated the tests to use conjugate values of type char and convert to conj_t type only while invoking BLIS tests for both DOTXF and AXPYF. AMD-Internal: [CPUPL-5117] Change-Id: I0ef7af429057583a1cbf34827802e72401181caf	2024-06-07 15:05:10 +05:30
Edward Smyth	7829a7cf85	GTestSuite: test name consistency changes 6 Improve consistency in test names across different APIs: - Improve consistency of TEST_P part of test names. - Rename _evt_testing.cpp and nrm2_extreme.cpp files to _evt.cpp to match other APIs. - Standardize naming of IIT_ERS files. Also: - Restore trsv IIT_ERS file which was misnamed in commit `a2beef3255` - Tidy ukr gemm tests to be more consistent with each other and move threshold setting to individual TEST_P functions to allow different adjustments to be made. - Similarly make trsm tests more consistent. - Tidy naming of is_memory_test variable. AMD-Internal: [CPUPL-4500] Change-Id: I0af1fc9973b02187b19a7c2488eed1b829cfdc2f	2024-06-05 11:26:16 -04:00
Edward Smyth	d9c269786a	GTestSuite: bli_zdscalv isn't created by BLIS BLIS includes the BLAS and CBLAS interfaces for zdscal but not the BLIS typed interface bli_zdscalv. Thus, when TEST_INTERFACE=BLIS_TYPED is defined, disable tests for zdscal. AMD-Internal: [CPUPL-4671] Change-Id: I397454c83e272f9e775e37e00533002576041a93	2024-05-21 15:24:02 -04:00
Eleni Vlachopoulou	25bfd0a982	GTestSuite: Fix so that std::max works properly on Windows. AMD-Internal: [CPUPL-4500] Change-Id: I73d55dd3040daf6f8aec94799cf7f3f0cc2bddc0	2024-05-20 15:59:16 +01:00
Edward Smyth	bc7d2df832	GTestSuite: misc corrections 2 - Correct value of alpha in ger ERS test. - rename ERS_IIT.cpp files to match naming convention used for other APIs. - Change all cases of gint_t to gtint_t except for dotxf, which is fixed in another commit. - Add TEST_UPPERCASE_ARGS to imatcopy and omatcopy{2} headers. - Corrected typo. AMD-Internal: [CPUPL-4500] Change-Id: I8844bb8c5941785e64daa9df5569092c19f91838	2024-05-20 03:51:21 -04:00
Eleni Vlachopoulou	e98d58b657	GTestSuite: Adjusting thresholds. -Adding multiplier for complex APIs. -Updating for trmv and trsv to reflect multiplication with alpha. AMD-Internal: [CPUPL-4500] Change-Id: I17361da5afa5d1e219b4c8a14542e2b216a7ea58	2024-05-17 09:11:59 -04:00
Edward Smyth	a69dc3669e	GTestSuite: test name consistency changes 5 Improve consistency in test names across different APIs. In this commit, standardize leading dimensions (lda, ldb, ldc) in test names. Also some misc tidying changes. AMD-Internal: [CPUPL-4500] Change-Id: Icbc82d0b9a3420ddfdb4f418396f9e56ab1765ab	2024-05-16 08:51:01 -04:00
Edward Smyth	782e009b66	GTestSuite: check data that should just be set is not read 2 Correction to commit `8657e661fc` to allocate matrix or vector correctly when special read-only case occurs. Also define a set_matrix generator for symmetric matrices to only set upper or lower triangle to the supplied value, while setting the unused elements to a large value to help catch incorrect access to those elements. AMD-Internal: [CPUPL-4548] Change-Id: I22b3a20e2ce8be70eb27179247cd47fdb2d87b9d	2024-05-15 11:56:16 -04:00
Edward Smyth	b2ed1000b3	GTestSuite: test name consistency changes 4 Improve consistency in test names across different APIs. Various changes in this patch: - Explicitly cast char variables to std::string when adding to test name. Adding the char directly was causing errors in name generation. - Use template version of print function in zdscalv and remove print function zdscalvGenericTestPrint. - Remove unused print function ztrsvPrint. - Eliminate some differences in gemm ukr print functions. - Remove extraneous API name labels in ukr axpyf and setv. - Make ukr/trsm/test_trsm_ukr.h more consistent with other files. AMD-Internal: [CPUPL-4500] Change-Id: Ib8092de216712586fe4ec0ae91698d0c1aaffd54	2024-05-13 11:11:02 -04:00
Edward Smyth	a94d2ddf44	GTestSuite: test name consistency changes 3 Improve consistency in test names across different APIs. In this commit, standardize storage, side, uplo, trans, diag and conj in test names. AMD-Internal: [CPUPL-4500] Change-Id: Ifcdb6e9f684b134841d86087218d7aefd9cabe63	2024-05-10 08:35:19 -04:00
Edward Smyth	8657e661fc	GTestSuite: check data that should just be set is not read Some BLAS routines do not require matrices or vectors to be initialized in certain use cases. For example, in GEMM when beta=zero, C is set rather than updated, thus input values of C should not be used. In these cases set the initial values of such matrices or vectors to an extreme value, to help detect if these are incorrectly being read. The extreme value can be NaN or Inf. The default is Inf; change it by running cmake ... -DEXT_VALUE=NaN AMD-Internal: [CPUPL-4548] Change-Id: I4a665363779d2496b8247f6357e970b7f23cd1eb	2024-05-10 06:29:03 -04:00
Hari Govind S	92847ae912	Gtestsuite: Memory testing for SCOPYV, DCOPYV and ZCOPYV APIs - Utilized the memory testing feature in GTestsuite to update the testing interfaces for micro-kernel testing of SCOPY, DCOPY and ZCOPY APIs. Change-Id: I3d6905f33b000b8d5e60727aa896bd869f4f441f	2024-05-09 12:10:17 -04:00
vignbala	ca6276d52b	Accuracy and memory testing of AVX512 ?SETV, ZAXPYV and ZAXPYF kernels - Added accuracy and memory tests for AVX2 and AVX512 ?SETV kernels, AVX512 ZAXPYV kernel and AVX512 ZAXPYF kernels, with fuse-factors 2, 4 and 8. - Cleanup of the code-section that declares and defines the reference compute for AXPYF operation. Corrected the type mismatch with the arguments that reference AXPYV would expect (this is used to decompose AXPYF as part of reference). Ensured usage of GTestSuite's internal alias for integer types. - Updated the API level testsuite and testing interface for AXPYF, based on the cleanup done to the reference code. AMD-Internal: [CPUPL-4974] Change-Id: I71de6c09d3877cd3dd1eaa20ab4f90e7c33eb1e1	2024-05-09 00:24:02 -04:00
Edward Smyth	a2beef3255	GTestSuite: break up long running tests Test programs for key APIs like GEMM take a long time to run, and even to generate the list of test cases. Break into separate test programs for different data types to enable these to run in parallel (at gtest level). In this patch we break up GEMM, TRSM, GEMV and TRSV. AMD-Internal: [CPUPL-4500] Change-Id: I21363b050d30e0402d5a1e8cbeaed2ebcc87aaeb	2024-05-08 13:36:38 -04:00
Arnav Sharma	cb27fad49c	ZSCALV AVX512 Kernel - Implemented ZSCALV kernel utilizing AVX512 intrinsics. - Gtestsuite: Added ukr tests for the new kernel. AMD-Internal: [CPUPL-5012] Change-Id: I75c7f4448ddd60b0f9afa53936eed37f5f99eeb2	2024-05-08 11:55:13 -04:00
Arnav Sharma	89a06cf252	Gtestsuite: Unit Tests for ZDOTV AVX512 Kernel - Updated DOTV Gtestsuite interface to invoke C/ZDOTC when conjx='c' and testing interface is either BLAS or CBLAS. - Added ukr tests for bli_zdotv_zen4_asm_avx512( ... ) and bli_zdotv_zen_int_avx512( ... ) kernels. AMD-Internal: [CPUPL-5011] Change-Id: I32fb69027a35d9ea92f997a095d412c8242a4b68	2024-05-08 09:20:31 -04:00
eseswari	e0b172174e	Added testcases for axpyv api * Functional tests are covered for saxpyv and zaxpyv. * As part of functional large size of m, stride greater than m, scalar combinations(including special cases), Zero increment tests are added for saxpyv and zaxpyv. Signed-off-by: eseswari <sangadala.eswari@amd.com> AMD-Internal: CPUPL-4413 Change-Id: I61473357680cb0f394e6e653796ec31110895fa4	2024-05-08 08:44:45 -04:00
eseswari	dd10c6dc5b	Added testcases for copyv API * As part of functional test cases, large size of m, stride greater than m,scalar combinations, Zero increment tests are added for ?copyv. Signed-off-by: eseswari <sangadala.eswari@amd.com> AMD-Internal: CPUPL-4412 Change-Id: I9fa74c147975bbe21263aaf48190170c6ed0a8fd	2024-05-08 04:41:43 -04:00
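
The ILP64 build fix in commit e9919761f7 ("Cast literal 0 to match integer size in std::max tests") can be illustrated with a minimal sketch. It assumes that the test suite's integer alias gtint_t is a 64-bit type in ILP64 builds; the alias definition and the helper names below are hypothetical stand-ins, not code from the repository. Because std::max is a template taking two arguments of the same type, mixing an int literal with a 64-bit value fails template deduction, and casting the literal resolves it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the test suite's integer alias.
// Assumption: 64-bit in ILP64 builds, 32-bit otherwise.
#ifdef EXAMPLE_LP64
using gtint_t = std::int32_t;
#else
using gtint_t = std::int64_t;
#endif

// std::max(0, n) does not compile when gtint_t is 64-bit: the literal 0
// is an int, so the two arguments deduce conflicting template types.
// Casting the literal to gtint_t makes both arguments the same type.
inline gtint_t clamp_to_nonnegative(gtint_t n) {
    return std::max(static_cast<gtint_t>(0), n);
}

// Small self-check showing the intended behaviour (hypothetical helper).
inline void self_check_max() {
    assert(clamp_to_nonnegative(-5) == 0);
    assert(clamp_to_nonnegative(7) == 7);
}
```

Calling self_check_max() from a test or from main exercises both branches. An explicit template argument, std::max<gtint_t>(0, n), would be another way to get the same effect.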


168 Commits
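
Commit a07e041b1f notes that multiplying by alpha=0 and using SETV behave differently when the vector holds NaNs or Infs. The sketch below shows that difference in plain C++ under IEEE 754 arithmetic. The function names are hypothetical and the loops are illustrative only; they are not the BLIS kernels.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// BLAS-style behaviour: every element is multiplied by alpha,
// even when alpha is zero, so 0 * NaN and 0 * Inf yield NaN.
inline void scale_by_multiply(std::vector<double>& x, double alpha) {
    for (double& v : x) v *= alpha;
}

// SETV-style shortcut: when alpha is zero the vector is overwritten
// with zeros, so NaN and Inf inputs do not propagate.
inline void scale_with_set(std::vector<double>& x, double alpha) {
    if (alpha == 0.0) {
        for (double& v : x) v = 0.0;
        return;
    }
    for (double& v : x) v *= alpha;
}

// Small self-check contrasting the two behaviours (hypothetical helper).
inline void self_check_scal() {
    const double nan = std::numeric_limits<double>::quiet_NaN();
    const double inf = std::numeric_limits<double>::infinity();

    std::vector<double> a{1.0, nan, inf};
    scale_by_multiply(a, 0.0);
    assert(a[0] == 0.0);
    assert(std::isnan(a[1]));  // 0 * NaN is NaN
    assert(std::isnan(a[2]));  // 0 * Inf is NaN

    std::vector<double> b{1.0, nan, inf};
    scale_with_set(b, 0.0);
    assert(b[0] == 0.0 && b[1] == 0.0 && b[2] == 0.0);
}
```

The checks assume the code is built without value-unsafe floating-point options such as -ffast-math, which can change NaN handling.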