Commit Graph

3291 Commits

Author SHA1 Message Date
Edward Smyth
a2beef3255 GTestSuite: break up long running tests
Test programs for key APIs like GEMM take a long time to run,
and even to generate the list of test cases. Break into
separate test programs for different data types to enable
these to run in parallel (at gtest level). In this patch
we break up GEMM, TRSM, GEMV and TRSV.

AMD-Internal: [CPUPL-4500]
Change-Id: I21363b050d30e0402d5a1e8cbeaed2ebcc87aaeb
2024-05-08 13:36:38 -04:00
Edward Smyth
62c886feee Export some BLIS internal symbols
AOCL libFLAME optimizations directly call some internal
BLIS symbols. Export them to enable this to work with
the BLIS shared library.

AMD-Internal: [CPUPL-5044]
Change-Id: Icb62dcb51e12d72dde8434593ab17de3c227c93d
2024-05-08 12:51:32 -04:00
Arnav Sharma
cb27fad49c ZSCALV AVX512 Kernel
- Implemented ZSCALV kernel utilizing AVX512 intrinsics.

- Gtestsuite: Added ukr tests for the new kernel.

AMD-Internal: [CPUPL-5012]
Change-Id: I75c7f4448ddd60b0f9afa53936eed37f5f99eeb2
2024-05-08 11:55:13 -04:00
Arnav Sharma
89a06cf252 Gtestsuite: Unit Tests for ZDOTV AVX512 Kernel
- Updated DOTV Gtestsuite interface to invoke C/ZDOTC when conjx='c'
  and testing interface is either BLAS or CBLAS.

- Added ukr tests for bli_zdotv_zen4_asm_avx512( ... ) and
  bli_zdotv_zen_int_avx512( ... ) kernels.

AMD-Internal: [CPUPL-5011]
Change-Id: I32fb69027a35d9ea92f997a095d412c8242a4b68
2024-05-08 09:20:31 -04:00
eseswari
e0b172174e Added testcases for axpyv api
* Functional tests are covered for saxpyv and zaxpyv.
* The functional tests cover large sizes of m, strides greater than m,
  scalar combinations (including special cases), and zero increments
  for saxpyv and zaxpyv.

Signed-off-by: eseswari <sangadala.eswari@amd.com>
AMD-Internal: CPUPL-4413
Change-Id: I61473357680cb0f394e6e653796ec31110895fa4
2024-05-08 08:44:45 -04:00
Arnav Sharma
1dbeee4d19 ZDOTV AVX512 Kernel with MT Support
- Added AVX512 kernel for ZDOTV.

- Multithreaded both ZDOTC and ZDOTU with AOCL_DYNAMIC support.

AMD-Internal: [CPUPL-5011]
Change-Id: I56df9c07ab3b8df06267a99835b088dcada81bd8
2024-05-08 04:54:05 -04:00
eseswari
dd10c6dc5b Added testcases for copyv API
* The functional test cases cover large sizes of m, strides greater than
  m, scalar combinations, and zero increments for ?copyv.

Signed-off-by: eseswari <sangadala.eswari@amd.com>
AMD-Internal: CPUPL-4412
Change-Id: I9fa74c147975bbe21263aaf48190170c6ed0a8fd
2024-05-08 04:41:43 -04:00
Eleni Vlachopoulou
7787d5af1a GTestSuite: Updating CMake system to create executables depending on the directory structure.
- Previously, the system assumed 3 levels in the directory structure and
  created corresponding targets.
- Now the system looks into the subdirectories of testsuite and creates
  a target for each subdirectory that has at least one cpp file.
- Also deleted a directory that appeared to be a duplicate and was breaking builds.

AMD-Internal: [CPUPL-4500]
Change-Id: I03ca362b09783f1c7c5f37ab420d8ca2c2b45e2e
2024-05-08 03:46:14 -04:00
Arnav Sharma
b1d69180f9 Updated DOTV DTL in bla_dot.c
- Updated DOTV DTL entry to include conjugate parameter.

AMD-Internal: [CPUPL-5059]
Change-Id: Id66be02fc06ff2faa18325dffe76559af2c6a5cf
2024-05-08 01:46:17 -04:00
Mangala V
e6cc2a3e22 ZGEMMT SUP Optimizations for AVX512
Existing Design:
 - GEMM AVX2 kernel performs computation and updates temporary C buffer
 - Portion of temporary C buffer is copied to output C buffer
   based on UPLO parameter
 - For diagonal blocks, using GEMM kernels is not efficient

New Design: Implemented in current patch when UPLO='L'
 - GEMMT kernel used for computation, temporary buffer is not required.
 - Only required elements are computed using mask load store for all
   fringe cases
 - Exception: AVX2 code path is used when storage format is RRC, CRR, CRC

- AOCL-Dynamic is added based on dimension
- A check for AVX support is added in the SUP interface. It falls back to
  the native implementation if the hardware does not support AVX.
- SUP ref_var2m is expanded for dcomplex datatype to avoid condition
  check which exists for double datatype
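
To make the "only required elements are computed" idea concrete, here is a minimal reference sketch of the lower-triangular case. It is not the AVX512 kernel from this patch: the function name `ref_dgemmt_lower` is hypothetical, it uses real doubles rather than dcomplex for brevity, and it assumes column-major storage with no transposes.

```c
/* Hypothetical reference routine, not a BLIS symbol.
 * Computes C := alpha * A * B + beta * C for the lower triangle only.
 * A is n x k, B is k x n, C is n x n, all column-major. */
void ref_dgemmt_lower(int n, int k, double alpha,
                      const double *a, int lda,
                      const double *b, int ldb,
                      double beta, double *c, int ldc)
{
    for (int j = 0; j < n; ++j)
    {
        /* Start at the diagonal: elements with i < j are never touched,
         * so no temporary C buffer is needed. */
        for (int i = j; i < n; ++i)
        {
            double acc = 0.0;
            for (int p = 0; p < k; ++p)
                acc += a[i + p * lda] * b[p + j * ldb];
            c[i + j * ldc] = alpha * acc + beta * c[i + j * ldc];
        }
    }
}
```

The strictly upper part of C keeps whatever values it had on entry, which is the behaviour the temporary-buffer design had to reproduce with an extra copy.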

AMD-Internal: [CPUPL-5006]

Change-Id: I3e21404b732b8f2df9cbdba394303752fdf36286
2024-05-07 23:00:29 +05:30
Meghana Vankadari
1072770c63 Implemented LPGEMV for bf16 datatype
1. The 5-loop LPGEMM path is inefficient when A or B is a vector
   (i.e., m == 1 or n == 1).

2. An efficient implementation is developed, taking into account the B
   matrix reorder in the m = 1 case and post-ops fusion.

3. When m = 1, the algorithm divides the GEMM workload along the n
   dimension at a granularity of NR. Each thread works on A: 1xk and
   B: kx(>=NR) and produces C: 1x(>=NR). K is unrolled by 4, with a
   remainder loop.

4. When n = 1, the algorithm divides the GEMM workload along the m
   dimension at a granularity of MR. Each thread works on A: (>=MR)xk and
   B: kx1 and produces C: (>=MR)x1. When n = 1, reordering of B is avoided
   so that it can be processed efficiently in the n = 1 kernel.
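
The NR-granular split in item 3 can be sketched as follows. This is a plain even split, not the partitioning logic from the patch: the function name `partition_by_nr` and its signature are hypothetical, and the last non-empty range may be shorter than NR when n is not a multiple of NR.

```c
/* Hypothetical helper, not a BLIS symbol.
 * Splits the range [0, n) across nt threads in multiples of nr and
 * returns the half-open range [*start, *end) owned by thread tid. */
void partition_by_nr(long n, long nr, int nt, int tid,
                     long *start, long *end)
{
    long blocks     = (n + nr - 1) / nr;       /* nr-wide panels, rounded up */
    long per_thread = (blocks + nt - 1) / nt;  /* panels per thread, rounded up */
    long s = (long)tid * per_thread * nr;
    long e = s + per_thread * nr;

    /* Clamp to n: trailing threads may get a short or empty range. */
    if (s > n) s = n;
    if (e > n) e = n;

    *start = s;
    *end   = e;
}
```

The same helper applies to item 4 by passing m and MR in place of n and NR.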

AMD-Internal: [SWLCSG-2355]
Change-Id: I7497dad4c293587cbc171a5998b9f2817a4db880
2024-05-06 23:55:15 +05:30
Kiran Varaganti
fd61c69778 Fixed bug in omatcopy for when trans="t"
Thanks to Zhenyu Zhu (ajz34) for pointing out this bug.
When trans="t" (or "conjugate transpose" in the case of complex data types),
ldb must be greater than or equal to cols.
The buggy code checked ldb against rows instead. This is now fixed.
Some minor code formatting changes are also included.
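
A minimal sketch of the corrected check, assuming column-major storage and an input matrix A of size rows x cols. The helper name `omatcopy_ldb_is_valid` is hypothetical, and the non-transpose branch (ldb >= rows) is an assumption based on the usual column-major convention rather than something stated in this commit.

```c
#include <ctype.h>

/* Hypothetical validation helper, not the BLIS implementation.
 * Returns nonzero when ldb is large enough for the output matrix B. */
int omatcopy_ldb_is_valid(char trans, long rows, long cols, long ldb)
{
    char t = (char)tolower((unsigned char)trans);

    if (t == 't' || t == 'c')
        return ldb >= cols;  /* B is cols x rows, so ldb must cover cols */

    return ldb >= rows;      /* assumed: B is rows x cols otherwise */
}
```

With rows = 3 and cols = 5, a transposed copy with ldb = 3 is rejected here, whereas a check against rows would have accepted it.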

[CPUPL-4810][SWLCSG-2706]

Change-Id: Ie796d25a361b2ba72eda80e8c5867d6352af901f
2024-05-06 12:57:38 -04:00
Shubham Sharma
be34169001 Fixed Matlab Failure in ZTRSM
- In the AVX512 ZTRSM kernel, the vectorized division code
  is causing failures in MATLAB.
- The logic is identical in the reference C code and the intrinsics code,
  but the intrinsics code is causing the failure.
- Replaced the optimized intrinsics code with C code.

AMD-Internal: [CPUPL-5052]
Change-Id: Iea184330b22c46d979867b870486066ef980eb84
2024-05-06 06:56:45 -04:00
mkadavil
118e955a22 SWISH post-op support for all LPGEMM APIs.
SWISH post-op computes swish(x) = x / (1 + exp(-1 * alpha * x)).
SiLU = SWISH with alpha = 1.

AMD-Internal: [SWLCSG-2387]
Change-Id: I55f50c74a8583a515f7ea58fa0878ccbcdd6cc26
2024-05-06 06:05:11 -04:00
Meghana Vankadari
75b9d46a40 Fix in LPGEMM for variable BLIS-int size
- Modified all structs that are passed to JIT-generated code to use
  integer of type uint64_t rather than dim_t so that functionality
  is not affected when size of BLIS-internal integer is modified
  during configure time.
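
A minimal sketch of the idea, assuming a 32-bit BLIS integer. The type and function names (`dim_t_example`, `jit_params_example`, `make_jit_params`) are hypothetical stand-ins, not the structs changed in this commit.

```c
#include <stdint.h>

/* Stand-in for dim_t, whose width is chosen at configure time.
 * Assumed to be 32 bits here to illustrate the hazard. */
typedef int32_t dim_t_example;

/* Hypothetical parameter block read by generated code at fixed offsets.
 * Fixed-width fields keep the layout independent of dim_t's size. */
typedef struct
{
    uint64_t m;
    uint64_t n;
    uint64_t k;
} jit_params_example;

/* Widen each dimension explicitly when filling the block. */
jit_params_example make_jit_params(dim_t_example m,
                                   dim_t_example n,
                                   dim_t_example k)
{
    jit_params_example p;
    p.m = (uint64_t)m;
    p.n = (uint64_t)n;
    p.k = (uint64_t)k;
    return p;
}
```

Had the fields been declared as `dim_t_example`, the struct would shrink from 24 to 12 bytes and every offset baked into the generated code would point at the wrong field.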

Change-Id: Ib81c088072badf13da4ca73be2d4af4551b713d8
2024-05-06 02:56:47 -04:00
Shubham Sharma
7553abad8e Fixed compilation error with AOCC in TRSV
- Added {} around the zen4 switch case to avoid an AOCC error.
- The error occurs because in C a declaration is not a statement and
  therefore cannot be labeled, so the compiler is not able to create a
  label for the jump.
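
The pattern looks roughly like this. The enum, the function name `pick_value_example` and the value 8 are hypothetical and are not taken from the TRSV code. C standards before C23 require a label to be followed by a statement, although some compilers accept a declaration there as an extension.

```c
/* Hypothetical arch ids for illustration, not the BLIS arch_t enum. */
enum arch_id_example { ARCH_OTHER_EXAMPLE = 0, ARCH_ZEN4_EXAMPLE = 1 };

int pick_value_example(enum arch_id_example id)
{
    int result;

    switch (id)
    {
        case ARCH_ZEN4_EXAMPLE:
        {
            /* The braces form a compound statement, so the case label
             * is attached to a statement and this declaration is legal. */
            int value = 8;
            result = value;
            break;
        }
        default:
            result = 1;
            break;
    }

    return result;
}
```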

AMD-Internal: [CPUPL-4880]
Change-Id: Icfeedafd80bf9a955e430ca967b6a93dcbbf075e
2024-05-03 21:08:38 +05:30
vignbala
f8218bb9f2 Compiler warnings when using masked loads
- Updated the AVX512 DOTXF kernels to use MASKZ loads
  instead of MASK loads when loading X vector in fringe
  case. This avoids compiler warnings of uninitialized
  vector as input to the intrinsic.

- The functionality will not change when using either MASK
  or MASKZ loads on X, since A matrix is loaded using MASKZ
  loads.

AMD-Internal: [CPUPL-4974]
Change-Id: I1ef98a1292352d0e905cc09cd5667acd883df827
2024-05-03 09:53:36 -04:00
Edward Smyth
0a830626b2 GTestSuite: check stored value of INFO
Check internal value of INFO for BLAS2 and BLAS3 routines
using the bli_info_get_info_value() function added in AOCL 4.2.
If testing a BLIS library that does not have this, use

cmake ... -DCAN_TEST_INFO_VALUE=OFF

AMD-Internal: [CPUPL-4993]
Change-Id: Ida5d252b0f6727793ebfb74bb160e8cb96b61b74
2024-05-03 09:08:21 -04:00
Shubham Sharma
b70347d0d4 DGEMMT SUP Optimizations for AVX512
- In the DGEMMT SUP AVX2 code path, triangular kernels
  are added in order to avoid a temporary C buffer.
- Since these kernels did not exist for AVX512,
  AVX2 kernels were being used in GEMMT.
- An AVX512 triangular GEMM kernel has been added
  to make sure that AVX512 kernels can be used without
  creating a temporary buffer.
- This kernel is added only for the Lower variant of GEMMT.
  For the Upper variant of DGEMMT, a temporary C buffer is
  created, the full GEMM kernel is called on temporary C and
  the triangular region from temporary C is copied to the C
  buffer.

AMD-Internal: [CPUPL-4881]
Change-Id: Id70645f79ae078ab9a7006e83d328505f1fae8a9
2024-05-03 05:11:11 -04:00
Shubham Sharma
b9e21e8701 Added ZTRSM AVX512 small code path
- Kernel dimensions are 4x4.
  - Two kernels are implemented, Right Upper and
    Right lower.
  - In case of Left variants of TRSM, transpose is
    induced so that Right variant kernels can be used.
  - No packing is performed in these kernels.
  - Changes are made in the threshold to pick ZTRSM small
    code path.
  - BLIS_INLINE is removed from signature of
    "TRSMSMALL_KER_PROT".
  - These kernels do not support "ENABLE_TRSM_PREINVERSION".
  - Newly added kernels do not support conjugate
    transpose.
  - Added multithreading to ZTRSM small code path.

AMD-Internal: [CPUPL-4324]
Change-Id: I683b1d5239593e54f433e7f27497d72dfbd9141c
2024-05-03 05:10:41 -04:00
Shubham Sharma
1d983e6124 Added AVX512 kernels for DAXPYF and DDOTXF
- Added DAXPYF and DDOTXF AVX512 kernels.
- Fuse factor for ddotxf kernel is 8.
- 2 DAXPYF kernels are added, with fuse
  factor 8 and 32.
- Multithreading is also added to the DAXPYF
  kernel with fuse factor 32.
- These kernels are internally used by TRSM.
- Added changes in TRSV to call these kernels
  in ZEN4

AMD-Internal: [CPUPL-4880]
Change-Id: I12850de974b437bbca07677b68bc3d6a35858770
2024-05-03 05:10:22 -04:00
Vignesh Balasubramanian
4e2966f9b0 AVX512 optimizations for ZGEMV API with transpose case
- Implemented AVX512 kernels for handling the calls to ZGEMV
  with transpose to A matrix.

- This includes the set of ZDOTXF and ZDOTXV kernels. ZDOTXF
  kernels include those with fuse-factor 8 (main kernel), 4
  and 2 (fringe kernels).

- Updated the bli_zgemv_unf_var1( ... ) function to update
  the function pointers to these kernels, based on the
  configuration.

AMD-Internal: [CPUPL-4974]
Change-Id: I313ae0abe9dc119de849da42f9825b71f11b1fda
2024-05-03 04:38:52 -04:00
Vignesh Balasubramanian
53cb83d0cc AVX512 optimizations for ZGEMV API with no-transpose case
- Implemented AVX512 kernels for handling the calls to ZGEMV
  with no-transpose to A matrix.

- This includes the ZAXPYF, ZAXPYV and ZSETV kernels.
  The set of ZAXPYF kernels includes those with fuse-factor 8
  (main kernel), 4 and 2 (fringe kernels).

- Updated the bli_zgemv_unf_var2( ... ) function to set
  the function pointers to these kernels, based on the
  configuration. Further added the call to ZSETV at this
  layer in case beta is 0.
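
A scalar sketch of why a set operation is used for beta = 0 instead of a scale. The function name `apply_beta` is hypothetical and works on real doubles for brevity. The rationale in the comment (avoiding propagation of NaN or Inf already present in y) follows the usual BLAS convention and is an assumption about this commit, which does not state its reason.

```c
/* Hypothetical helper, not a BLIS symbol.
 * Prepares y for y := beta * y + alpha * A * x. */
void apply_beta(int n, double beta, double *y, int incy)
{
    for (int i = 0; i < n; ++i)
    {
        if (beta == 0.0)
            y[i * incy] = 0.0;    /* set, do not multiply: 0 * NaN is NaN */
        else
            y[i * incy] *= beta;  /* ordinary scaling */
    }
}
```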

AMD-Internal: [CPUPL-4974]
Change-Id: Iee4b724719e49023138bb16479765be44d677cd9
2024-05-03 07:04:47 +00:00
Eleni Vlachopoulou
edbbbe8791 GTestSuite: Templatizing printing function for test name.
- Using a template class for the printing operator that depends
  on the type.
- Use a macro to denote which interface is being tested.

AMD-Internal: [CPUPL-4500]

Change-Id: I453c4ef4842c354064f49ff32ec4bf42920cc17c
2024-05-02 12:00:17 -04:00
Edward Smyth
82e628b833 GTestSuite: seg faults in data generator
Following a recent change to the data generators to allow a stride
to be specified (60cc23f3d3), seg faults can occur if m<=0 for column
storage or n<=0 for row storage. Prevent this by having separate code
paths to handle these scenarios.

AMD-Internal: [CPUPL-4500]
Change-Id: I23ed8b2dccaaca140e2ddfda45bcdb4c888d5708
2024-05-01 05:46:52 -04:00
Edward Smyth
25c15bb471 GTestSuite: test name consistency changes 2
Improve consistency in test names across different APIs.

In this commit, standardize m, n, k and b in test names.

AMD-Internal: [CPUPL-4500]
Change-Id: I53e7dd83cbf426ab1ebe8aa4af1da01594f4af23
2024-05-01 04:55:58 -04:00
Hari Govind
9c26de1a18 Optimisation of COPYV APIs
- Implemented AVX512 kernels for scopyv_, dcopyv_ and zcopyv_
  using respective AVX512 intrinsics including masked
  load and store operations.

- Implemented AVX512 kernels for scopy_, dcopy_ and
  zcopy_ using assembly language to prevent loss of
  performance during the translation of intrinsics.

- Updated the dcopy_blis_impl( ... ) and
  zcopy_blis_impl( ... ) function to support
  multithreaded calls to the respective computational
  kernels, if and when the OpenMP support is enabled.

- Implemented OpenMP parallelization for dcopyv_ and
  zcopyv_ APIs, while scopyv_ and ccopyv_ only support
  single thread.

AMD-Internal: [CPUPL-4854]
Change-Id: I5fbd0bcca4e59001fbe2b1168b624d0c33242b3e
2024-05-01 00:23:01 +05:30
vignbala
b55c86cce7 GTestSuite: Cleanups to ensure proper build of GTestSuite
- Updated the IIT_ERS tests for SUBV to avoid using undefined
  variables. These tests are enabled only when GTestSuite is
  configured for BLIS_TYPED interface testing.

- Updated an instantiator in the DAXPBY accuracy tests to avoid a
  parsing error (extra comma). These tests are enabled only when
  GTestSuite is configured for the BLIS_TYPED interface.

AMD-Internal: [CPUPL-4500]
Change-Id: If6894daadbbc353dd66968649642ff07fa663782
2024-04-30 09:15:43 +00:00
srigovin
2c838dadfb Updated return type of xerbla and xerbla_array APIs to void
The return type of the xerbla and xerbla_array APIs is defined as int in BLIS, but according to netlib it should be void. Updated the definition and declaration accordingly.

Signed-off-by: Sridhar Govindaswamy <Sridhar.Govindaswamy@amd.com>
Change-Id: I3072ba76111189de5c5cf08df83ea154163dd34d
2024-04-29 00:51:10 -04:00
Edward Smyth
f4612238b4 GTestSuite: test name consistency changes 1
First in a series of commits to improve consistency in test names
across different APIs. This will help with gtest filtering.

In this commit, standardize alpha, beta, incx and incy.

AMD-Internal: [CPUPL-4500]
Change-Id: I0cde85f9a4cf969c0b12ac589b232786ad011f09
2024-04-26 07:13:59 -04:00
Meghana Vankadari
ceee4b7818 Fix in DGEMMSUP for cases where C matrix is row-major.
Details:
- Variable m0 was being loaded into a register without typecasting
  it to uint64_t. This resulted in a seg fault when the int size is set
  to 32 bits at configure time.
- Any variable that is loaded using mov in assembly needs to be
  typecast to uint64_t before begin_asm, so that a change in the size
  of the integer doesn't affect the functionality.
- Modified all instances using variable m0 to use variable 'm', where
  m = (uint64_t)m0;
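
The hazard can be illustrated with a small sketch. A 64-bit mov from memory reads 8 bytes, so if the source variable is only 4 bytes wide the upper half of the register holds whatever sits next to it in memory. The function name `load_dim_as_u64` is hypothetical. The code assumes GCC or Clang extended asm on x86-64 and falls back to plain C elsewhere. It does not use the BLIS begin_asm macros.

```c
#include <stdint.h>

/* Hypothetical helper: widens a 32-bit dimension before a 64-bit load. */
uint64_t load_dim_as_u64(int32_t m0)
{
    /* Widen first, so the 64-bit load below reads 8 valid bytes. */
    uint64_t m = (uint64_t)m0;
    uint64_t out;

#if defined(__x86_64__) && defined(__GNUC__)
    /* 64-bit load from the memory operand m into a register. */
    __asm__ volatile ("movq %1, %0" : "=r"(out) : "m"(m));
#else
    out = m;  /* portable fallback for other targets */
#endif

    return out;
}
```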

AMD-Internal: [CPUPL-4971]
Change-Id: I49b66d2cacf19ace40ab44c9f85904644e8921f4
2024-04-25 13:07:23 -04:00
Vignesh Balasubramanian
29ae28dd8f GTestSuite: Additional fix for GEMV
- Updated test_gemv.h to pass the right boolean
  to computediff( ... ), based on whether we run
  it for exception value tests or not.

AMD-Internal: [CPUPL-4500]
Change-Id: I1ad2cde4f9b4bb1dadc32d1f7d02a90a457e218f
2024-04-25 06:35:31 -04:00
eseswari
34422757fa Added testcases for GER API
* Covered large sizes, scalar combinations and strides greater than the
  size for cger, dger, sger and zger.
Signed-off-by: Sangadala Eswari <Sangadala.Eswari@amd.com>
AMD-Internal: CPUPL-4414
Change-Id: I6fba26a35903d1f6dbd713f19eac6bb537b3d8d2
2024-04-25 03:17:21 -04:00
Vignesh Balasubramanian
7bd87e3057 GTestSuite: Fixes for IMATCOPY and GEMV
- Changed the macro guard for accuracy tests of SIMATCOPY,
  to ensure that tests are enabled/disabled based on the reference.

- Updated test_gemv.h to make sure the contents of the y vector are copied
  to y_ref after inducing exception values.

AMD-Internal: [CPUPL-4500]
Change-Id: I7249e643677e7e493eba5d072567615bc913a532
2024-04-24 19:36:59 +05:30
Edward Smyth
1ef7fb428a GTestSuite: print name of variable in error messages
Add name of variable being tested in error output from
computediff functions. First step to adding (optional)
tests on input arguments.

AMD-Internal: [CPUPL-4379]
Change-Id: I9553b660bcf5ecf1dd675cb837655078933455ac
2024-04-23 11:14:08 -04:00
Edward Smyth
7bb82eee6e GTestSuite: BLAS1 thresholds
Modify thresholds to reflect number of operations that
accumulate results into each output element. Different
limits are set for early return and special cases.

Constants are still subject to experimentation and change.

AMD-Internal: [CPUPL-4378]
Change-Id: I81f63a36c161ff1866f2d404b9e3cbb9a2948d3a
2024-04-23 05:57:31 -04:00
Edward Smyth
bcae225517 GTestSuite: BLAS3 thresholds
Modify thresholds to reflect number of operations that
accumulate results into each output element. Different
limits are set for early return and special cases.

Constants are still subject to experimentation and change.
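
One way to picture a threshold that scales with the number of accumulated operations is sketched below. This is not the GTestSuite formula: the function name `gemm_threshold_example`, the `safety` multiplier, the choice of 2 * k operations per output element, and the zero threshold for the early-return case are all assumptions made for illustration.

```c
#include <float.h>

/* Hypothetical threshold rule for a GEMM-like operation where each
 * output element accumulates k multiply-add pairs. */
double gemm_threshold_example(long k, double safety)
{
    if (k <= 0)
        return 0.0;  /* assumed early-return case: expect an exact match */

    /* Scale machine epsilon by the operation count per output element. */
    return safety * (double)(2 * k) * DBL_EPSILON;
}
```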

AMD-Internal: [CPUPL-4378]
Change-Id: I03cd8901e574f2e44e85ce8b0bc234e36edb4819
2024-04-22 13:25:23 -04:00
Edward Smyth
ccf3910209 BLIS: bli_cpuid.c incorrectly selecting zen5 on zen4 hardware
Correct the order of tests in bli_cpuid.c to test all known zen
AVX512 platforms before considering fallback tests on AVX512
support. This avoids builds with "configure auto" or
"cmake -DBLIS_CONFIG_FAMILY=auto" incorrectly selecting zen5
sub-configuration on zen4 systems.

AMD-Internal: [CPUPL-4966]
Change-Id: I8706382e2df7c9ae4bb456e3a7f465053e15beea
2024-04-22 03:26:06 -04:00
jagar
e52de030a6 Gtestsuite: Fixing issues on Windows OS
1. Fixed issue related to linking reference library.
2. Clean-up of how reference library variables are set.
3. Fixed compilation error related to std::max() and std::min().

AMD-Internal: [CPUPL-4879]
Change-Id: I427a4a4c0ea56a340a8bbd1a6649252e9680b937
2024-04-19 11:20:25 +00:00
Arnav Sharma
b293a29fb4 Gtestsuite: Memory and Extreme Value Tests for GEMV
- Added Memory Access Test support for GEMV.

- Added Extreme Value Tests for various combinations of NaN, Inf and
  -Inf for ?GEMV.

- Also fixed some invalid IIT_ERS tests.

AMD-Internal: [CPUPL-4825]
Change-Id: Iee77b305f6c6b9427153fbbc5191176dae9fbfea
2024-04-16 09:57:12 -04:00
Shubham Sharma
14bab0eb17 Fixed out of bounds read in CTRSM small kernel
- In 2x1 fringe case in [RUN/RLT] kernel, 3 scomplex
  precision numbers are being read instead of 1 scomplex.

- Fixed the code to read only one scomplex.

AMD-Internal: [CPUPL-4403]
Change-Id: If3ac03ed864618382d3a382a8cdff7ff8a94eb7d
2024-04-16 02:42:34 -04:00
Shubham Sharma
632c32767b Avoid alpha scaling in ZTRSV/ZTRSM when alpha = 1
- Scaling vector X is skipped when alpha is 1 in ZTRSV.
- Scaling matrix A is skipped when alpha is 1 in ZTRSM.

AMD-Internal: [CPUPL-4324]
Change-Id: I03c5a454ed1f5be36dac0f121408749bfc9cfc81
2024-04-16 02:24:02 -04:00
Shubham Sharma
ea010c5dc2 Improve perf of bli_obj_equals for 1x1 matrices
- Comparison using bli_eqsc is slower than direct comparison.
- Changed comparison logic for 1x1 matrices
   from bli_eqsc to direct comparison.

AMD-Internal: [CPUPL-4324]
Change-Id: Ifb2d0ad7a97c8bf33b66d624a7ecc53e38c1c803
2024-04-16 00:43:28 -04:00
Edward Smyth
c51b4628b4 BLIS: Implement zen5 sub-configuration in cmake
Correction to commit 2450a1813b
to add -DBLIS_CONFIG_FAMILY=zen5 support in cmake.

AMD-Internal: [CPUPL-3518]
Change-Id: Iecff2b64d5d95960cecbbf98d5269133747b122e
2024-04-15 07:40:50 -04:00
Edward Smyth
2450a1813b BLIS: Implement zen5 sub-configuration
Implement full support for zen5 as a separate BLIS sub-configuration
and code path within amdzen configuration family.

AMD-Internal: [CPUPL-3518]
Change-Id: Iaa5096e0b83bf0f0c3fd1c41e601ccd29bda3c09
2024-04-12 07:26:31 -04:00
Harish
13211119e4 Level2 GEMV gtests implemented for all data types, covering:
1. Different matrix sizes
2. Different stride values and scalar values
3. Early return tests, added in a new file
Signed-off-by: Harish Kumar <harish.kumar@amd.com>
AMD-Internal: [CPUPL-4417]

Change-Id: I5e645612808336e11da0c5ed8da9fe17a5543fbd
2024-04-08 14:46:59 -04:00
Vignesh Balasubramanian
1b7980a38d Added support to benchmark AXPYV APIs
- Implemented the feature to benchmark ?AXPYV APIs
  for the supported datatypes. The feature allows
  benchmarking the BLAS, CBLAS or native BLIS API, based
  on the macro definition.

- Added a sample input file to provide examples to benchmark
  AXPYV for all of its supported datatypes.

- Updated the sample input file for SCALV to provide examples
  to benchmark all of its supported datatypes.

AMD-Internal: [CPUPL-4805]
Change-Id: I550920e3a57fcc2e4900e9e698330d8b8595bdee
2024-04-08 00:06:54 -04:00
Edward Smyth
c2d4f1d7a5 GTestSuite: Avoid infinite recursion in generators
Previous commit introduced an infinite recursion problem in
generators for symmetric matrices. This was reported as a
compiler warning by gcc 12.2 but not by gcc 11.4.

AMD-Internal: [CPUPL-4862]
Change-Id: I8642b81a62f0643b5a9ebedb4fcc83b25542de1b
2024-04-04 19:46:18 +05:30
Vignesh Balasubramanian
60cc23f3d3 Test-case development for ?IMATCOPY and ?OMATCOPY2 APIs
- Added test-cases to verify the functional behaviour
  of the BLAS-extension API ?imatcopy_() and ?omatcopy2_().
  The test-cases cover the following categories for the
  supported datatypes :
  - Functional and memory testing.
  - Negative parameter testing with invalid inputs.
  - Early return scenarios.
  - Exception value testing.

- Updated functions in testinghelpers to include strides
  in addition to leading-dimension, when initializing
  a matrix. The default value for stride is set as 1.

- Implemented functions to load the reference symbol, based
  on the choice of the reference library. The function definition
  is overloaded due to different API standards being exposed by
  different libraries.

- Code cleanup of files for ?OMATCOPY API.

AMD-Internal: [CPUPL-4862]
Change-Id: If63b348f517e2cde1fe48f3a195808b33a91c312
2024-04-04 16:26:20 +05:30
Arnav Sharma
f71495a135 Support for DOTC in DOTV Bench and DTL updates
- Added support for ?DOTC in bench.

- Updated DTL to accept conjx as a parameter:
    - 'N', i.e., no conjugate for DOTU
    - 'C', i.e., conjugate for DOTC

- Updated DTL calls in the interface with respective values of
  conjx.
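
The difference between the two conjx values can be shown with a small reference sketch. The function name `ref_zdotv` is hypothetical and is neither the BLIS kernel nor the DTL code. It handles positive increments only.

```c
#include <complex.h>

/* Hypothetical reference routine.
 * conjx = 'N': DOTU, rho = sum of x[i] * y[i]
 * conjx = 'C': DOTC, rho = sum of conj(x[i]) * y[i] */
double complex ref_zdotv(char conjx, int n,
                         const double complex *x, int incx,
                         const double complex *y, int incy)
{
    double complex rho = 0.0;

    for (int i = 0; i < n; ++i)
    {
        double complex xi = x[i * incx];
        if (conjx == 'C' || conjx == 'c')
            xi = conj(xi);  /* conjugate x for DOTC */
        rho += xi * y[i * incy];
    }

    return rho;
}
```

For x = 1 + 2i and y = 3 + 4i, the 'N' path gives -5 + 10i and the 'C' path gives 11 - 2i, so logging conjx is enough to tell the two calls apart in a trace.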

AMD-Internal: [CPUPL-4804]
Change-Id: I447b19a6273566c6021c1721ce173bac4a59142c
2024-04-04 12:27:53 +05:30