Unnecessary whitespace (spaces, tabs) at the end of lines
has been removed.
AMD-Internal: [CPUPL-4500]
Change-Id: Ice5f5504232cb22460c14ac47e6a3a43309cba22
Source and other files in some directories were a mixture of
Unix and DOS file formats. Convert all relevant files to Unix
format for consistency.
AMD-Internal: [CPUPL-4500]
Change-Id: Ia3e479643b0bed4ae8a9107bde6e2cddf32d5bd8
- Enabled AVX512 DAXPYF kernels for DGEMV var2 for NO_TRANSPOSE cases.
- Added DAXPYF kernels with fuse factors of 2, 4, 6 and 16.
- Added a wrapper for DAXPYF kernels for redirection to kernels with a
smaller fuse factor than 32.
- Also added UKR tests for the new fused kernels.
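For illustration only, the sketch below shows what an AXPYF operation computes and one way a wrapper could cover a column count with smaller fuse factors. The function names, the list of factors and the redirection policy are assumptions made for this example; they are not the signatures or the logic of the kernels added in this commit.

```cpp
#include <cassert>
#include <cstddef>

// Reference AXPYF: y := y + alpha * A * x, where A is an m x b column-major
// matrix with leading dimension lda and b is the fuse factor.
// Each column j contributes one AXPYV: y += (alpha * x[j]) * A(:, j).
void axpyf_fixed(std::size_t m, std::size_t b, double alpha,
                 const double* a, std::size_t lda,
                 const double* x, double* y)
{
    for (std::size_t i = 0; i < m; ++i) {
        double acc = 0.0;
        for (std::size_t j = 0; j < b; ++j)
            acc += a[i + j * lda] * x[j];
        y[i] += alpha * acc;
    }
}

// Hypothetical wrapper: cover b columns by repeatedly picking the largest
// fuse factor that still fits. The factor list is illustrative only.
void axpyf_dispatch(std::size_t m, std::size_t b, double alpha,
                    const double* a, std::size_t lda,
                    const double* x, double* y)
{
    assert(lda >= m);
    static const std::size_t factors[] = {16, 8, 4, 2, 1};
    std::size_t j = 0;
    while (j < b) {
        for (std::size_t f : factors) {
            if (j + f <= b) {  // factor 1 always fits, so the loop terminates
                axpyf_fixed(m, f, alpha, a + j * lda, lda, x + j, y);
                j += f;
                break;
            }
        }
    }
}
```

A real kernel would specialise each fuse factor with vector intrinsics; here a single scalar routine stands in for all of them.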
AMD-Internal: [CPUPL-5098]
Change-Id: I0b102b67c6c068873393bac0494284f379c253f2
- Implemented AVX512 computational kernel for DAXPBYV
with optimal unrolling. Further implemented the other
missing kernels that would be required to decompose
the computation in special cases, namely the AVX512
DADDV and DSCAL2V kernels.
- Updated the zen4 and zen5 contexts to ensure any query
to acquire the kernel pointer for DAXPBYV returns the
address of the new kernel.
- Added micro-kernel unit tests to GTestsuite to check
for functionality and out-of-bounds reads and writes.
AMD-Internal: [CPUPL-5406][CPUPL-5421]
Change-Id: I127ab21174ddd9e6de2c30a320e62a8b042cbde6
- Added CSCALV kernel utilizing the AVX512 ISA.
- Added function pointers for the same to zen4 and zen5 contexts.
- Updated the BLAS interface to invoke respective CSCALV kernels based
on the architecture.
- Added UKR tests for bli_cscalv_zen_int_avx512( ... ).
AMD-Internal: [CPUPL-5299]
Change-Id: I189d87a1ec1a6e30c16e05582dcb57a8510a27f3
- Implemented bli_zgemm_16x4_avx512_k1_nn( ... ) AVX512 kernel to
be used as part of BLAS/CBLAS calls to ZGEMM. The kernel is built
for handling the GEMM computation with inputs having k = 1,
with the transpose values being N (for column-major) and T (for
row-major).
- Updated the zgemm_blis_impl( ... ) layer to query the architecture
ID and invoke the AVX2 or AVX512 kernel accordingly.
- Added API level tests for accuracy and code-coverage, as well as
micro-kernel tests for verifying functionality and out-of-bounds
memory accesses.
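As a rough illustration of why k = 1 is a useful special case: with no transposes, A is m x 1 and B is 1 x n, so the GEMM reduces to a rank-1 update of C. The scalar reference below is only a sketch of that arithmetic; the function name and argument order are assumptions and do not mirror bli_zgemm_16x4_avx512_k1_nn( ... ).

```cpp
#include <cassert>
#include <complex>
#include <cstddef>

using dcomplex = std::complex<double>;

// GEMM with k = 1 (no transposes, column-major): A is m x 1, B is 1 x n,
// so C := beta*C + alpha*A*B is a rank-1 update of C.
void zgemm_k1_ref(std::size_t m, std::size_t n,
                  dcomplex alpha, const dcomplex* a,
                  const dcomplex* b, std::size_t ldb,
                  dcomplex beta, dcomplex* c, std::size_t ldc)
{
    assert(ldc >= m && ldb >= 1);
    for (std::size_t j = 0; j < n; ++j) {
        const dcomplex ab = alpha * b[j * ldb];  // alpha * B(0, j), once per column
        for (std::size_t i = 0; i < m; ++i) {
            dcomplex& cij = c[i + j * ldc];
            // When beta is zero, C is overwritten and its old value is not read.
            cij = (beta == dcomplex(0.0)) ? ab * a[i] : beta * cij + ab * a[i];
        }
    }
}
```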
AMD-Internal: [CPUPL-5249]
Change-Id: Id1f8bebff3e0da83c7febe86299564fd658b2e84
- Implemented bli_dnorm2fv_unb_var1_avx512( ... ) AVX512
computational kernel for DNRM2 API.
- Updated the header to include this kernel signature, as well
as the framework layer to use this function in case of ZEN4
and ZEN5 configurations.
- Updated the tipping points for ideal thread setting in DNRM2
for ZEN5 micro-architecture. These thresholds are specific
to the library's linkage to LLVM's OpenMP or GNU's OpenMP.
- Further abstracted the AOCL-DYNAMIC logic to separate functions
for ?NRM2 APIs that currently support it (namely, DNRM2 and ZNRM2).
- Further updated the ?NRM2 framework to accommodate the necessary
changes to invoke the newer AOCL-DYNAMIC functions and the AVX512
kernel, when needed.
- Added micro-kernel and memory tests for this kernel in GTestsuite,
to validate accuracy and out-of-bounds read and write.
AMD-Internal: [CPUPL-5265]
Change-Id: I4fc0d0f1e6906bf27d46562ca387c338cc4d2049
- Updated the existing code-path for ?AXPBYV to
reroute the inputs to the appropriate L1 kernel,
based on the alpha and beta value. This is done
in order to utilize sensible optimizations with
regard to the compute and memory operations.
- Updated the typed API interface for ?AXPBYV to include
an early exit condition (when n is 0, or when alpha is
0 and beta is 1). Further updated this layer to query
the right kernel from context, based on the input values
of alpha and beta.
- Added the necessary L1 vector kernels (i.e., ?SETV, ?ADDV,
?SCALV, ?SCAL2V and ?COPYV) to be used as part of special
case handling in ?AXPBYV.
- Moved the early return with negative increments from ?SCAL2V
kernels to its typed API interface.
- Updated the zen, zen2 and zen3 context to include function
pointers for all these vector kernels.
- Updated the existing ?AXPBYV vector kernels to handle only
the required computation. Additional cleanup was done to
these kernels.
- Added accuracy and memory tests for AVX2 kernels of ?SETV,
?COPYV, ?ADDV, ?SCALV, ?SCAL2V, ?AXPYV and ?AXPBYV APIs.
- Updated the existing thresholds in ?AXPBYV tests for complex
types. This is because each component (real or imaginary) of a
complex product involves two mul ops and one add op. Further added
test-cases for API level accuracy checks, including special cases
of alpha and beta.
- Decomposed the reference call to ?AXPBYV into several other
L1 BLAS APIs (in case the reference does not support its own
?AXPBYV API). The decomposition is done to match the exact
operations that are done in BLIS based on alpha and/or beta
values. This ensures that we test for our own compliance.
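The rerouting described above can be sketched roughly as follows, for real unit-stride vectors. This is an assumed mapping from (alpha, beta) to the cheapest equivalent L1 operation, written for illustration; the enum, the function name and the exact case order are hypothetical and are not the BLIS implementation.

```cpp
#include <cassert>
#include <cstddef>

enum class Path { EarlyExit, Setv, Scalv, Copyv, Scal2v, Addv, Axpyv, Axpbyv };

// y := alpha*x + beta*y for unit-stride vectors, choosing the cheapest
// equivalent level-1 operation. Returns the path taken so it can be tested.
Path axpbyv_dispatch(std::size_t n, double alpha, const double* x,
                     double beta, double* y)
{
    // Early exit: nothing to do when n is 0, or when alpha is 0 and beta is 1.
    if (n == 0 || (alpha == 0.0 && beta == 1.0)) return Path::EarlyExit;
    assert(x != nullptr && y != nullptr);

    if (alpha == 0.0) {
        if (beta == 0.0) {                                   // y := 0
            for (std::size_t i = 0; i < n; ++i) y[i] = 0.0;
            return Path::Setv;
        }
        for (std::size_t i = 0; i < n; ++i) y[i] *= beta;    // y := beta*y
        return Path::Scalv;
    }
    if (beta == 0.0) {
        if (alpha == 1.0) {                                  // y := x
            for (std::size_t i = 0; i < n; ++i) y[i] = x[i];
            return Path::Copyv;
        }
        for (std::size_t i = 0; i < n; ++i) y[i] = alpha * x[i];  // y := alpha*x
        return Path::Scal2v;
    }
    if (beta == 1.0) {
        if (alpha == 1.0) {                                  // y := y + x
            for (std::size_t i = 0; i < n; ++i) y[i] += x[i];
            return Path::Addv;
        }
        for (std::size_t i = 0; i < n; ++i) y[i] += alpha * x[i]; // y := y + alpha*x
        return Path::Axpyv;
    }
    for (std::size_t i = 0; i < n; ++i) y[i] = alpha * x[i] + beta * y[i];
    return Path::Axpbyv;
}
```

Note that the beta = 0 branches never read y, which is what allows tests to place extreme values there.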
AMD-Internal: [CPUPL-4861]
Change-Id: Ia6d48f12f059f52b31c0bef6c75f47fd364952c6
- Fixed bug in ddotxf generic tests where the parameters lda_inc and
inca were being read incorrectly.
- Fixed bug in dotxf test wherein the y vector was being generated with
length m instead of b.
- Corrected function signatures to use type gtint_t instead of gint_t.
- Updated the tests to use conjugate values of type char and convert to
conj_t type only while invoking BLIS tests for both DOTXF and AXPYF.
AMD-Internal: [CPUPL-5117]
Change-Id: I0ef7af429057583a1cbf34827802e72401181caf
Improve consistency in test names across different APIs:
- Improve consistency of TEST_P part of test names.
- Rename *_evt_testing.cpp and nrm2_extreme.cpp files to
*_evt.cpp to match other APIs.
- Standardize naming of IIT_ERS files.
Also:
- Restore trsv IIT_ERS file which was misnamed in commit
a2beef3255
- Tidy ukr gemm tests to be more consistent with each other
and move threshold setting to individual TEST_P functions
to allow different adjustments to be made.
- Similarly make trsm tests more consistent.
- Tidy naming of is_memory_test variable.
AMD-Internal: [CPUPL-4500]
Change-Id: I0af1fc9973b02187b19a7c2488eed1b829cfdc2f
BLIS includes the BLAS and CBLAS interfaces for zdscal
but not the BLIS typed interface bli_zdscalv. Thus, when
TEST_INTERFACE=BLIS_TYPED is defined, disable tests
for zdscal.
AMD-Internal: [CPUPL-4671]
Change-Id: I397454c83e272f9e775e37e00533002576041a93
- Correct value of alpha in ger ERS test.
- Rename ERS_IIT.cpp files to match naming convention
used for other APIs.
- Change all cases of gint_t to gtint_t except for
dotxf, which is fixed in another commit.
- Add TEST_UPPERCASE_ARGS to imatcopy and omatcopy{2}
headers.
- Corrected typo.
AMD-Internal: [CPUPL-4500]
Change-Id: I8844bb8c5941785e64daa9df5569092c19f91838
- Adding multiplier for complex APIs.
- Updating for trmv and trsv to reflect multiplication with alpha.
AMD-Internal: [CPUPL-4500]
Change-Id: I17361da5afa5d1e219b4c8a14542e2b216a7ea58
Improve consistency in test names across different APIs.
In this commit, standardize leading dimensions (lda, ldb,
ldc) in test names. Also some misc tidying changes.
AMD-Internal: [CPUPL-4500]
Change-Id: Icbc82d0b9a3420ddfdb4f418396f9e56ab1765ab
Correction to commit 8657e661fc
to allocate matrix or vector correctly when special read-only
case occurs.
Also define a set_matrix generator for symmetric matrices
to only set upper or lower triangle to the supplied value,
while setting the unused elements to a large value to help
catch incorrect access to those elements.
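A minimal sketch of such a generator is shown below, assuming column-major storage. The function name, the argument list and the poison value of 1.0e100 are assumptions for illustration, not the testinghelpers API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fill the stored triangle ('u' or 'l') of an n x n column-major matrix with
// `value` and poison everything else with `unused`, so that a routine which
// wrongly reads the unreferenced triangle produces visibly wrong results.
std::vector<double> set_symm_matrix(char uplo, std::size_t n, std::size_t lda,
                                    double value, double unused = 1.0e100)
{
    assert(uplo == 'u' || uplo == 'l');
    assert(lda >= n);
    std::vector<double> a(lda * n, unused);
    for (std::size_t j = 0; j < n; ++j) {
        for (std::size_t i = 0; i < n; ++i) {
            const bool stored = (uplo == 'u') ? (i <= j) : (i >= j);
            if (stored) a[i + j * lda] = value;
        }
    }
    return a;
}
```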
AMD-Internal: [CPUPL-4548]
Change-Id: I22b3a20e2ce8be70eb27179247cd47fdb2d87b9d
Improve consistency in test names across different APIs.
Various changes in this patch:
- Explicitly cast char variables to std::string when
adding to test name. Adding the char directly was
causing errors in name generation.
- Use template version of print function in zdscalv
and remove print function zdscalvGenericTestPrint.
- Remove unused print function ztrsvPrint.
- Eliminate some differences in gemm ukr print
functions.
- Remove extraneous API name labels in ukr axpyf and
setv.
- Make ukr/trsm/test_trsm_ukr.h more consistent with
other files.
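As an illustration of the explicit cast, the helper below builds a name fragment from a label and a char. It shows one common way that adding a char directly can go wrong; whether this is the exact failure seen in the testsuite is an assumption, and the helper name is hypothetical.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Build a gtest-style name fragment such as "transa_n".
// Pitfall avoided here: `"transa_" + c` with a string literal performs
// pointer arithmetic on the literal instead of concatenation, so the char
// is converted to std::string explicitly.
std::string name_fragment(const std::string& label, char c)
{
    // gtest test names may only contain alphanumeric characters and '_'.
    assert(std::isalnum(static_cast<unsigned char>(c)));
    return label + "_" + std::string(1, c);
}
```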
AMD-Internal: [CPUPL-4500]
Change-Id: Ib8092de216712586fe4ec0ae91698d0c1aaffd54
Improve consistency in test names across different APIs.
In this commit, standardize storage, side, uplo, trans,
diag and conj in test names.
AMD-Internal: [CPUPL-4500]
Change-Id: Ifcdb6e9f684b134841d86087218d7aefd9cabe63
Some BLAS routines do not require matrices or vectors to be
initialized in certain use cases. For example, in GEMM when
beta=zero, C is set rather than updated, thus input values of
C should not be used. In these cases set the initial values of
such matrices or vectors to an extreme value, to help detect
if these are incorrectly being read.
The extreme value can be NaN or Inf. The default is Inf;
change it by running
cmake ... -DEXT_VALUE=NaN
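The idea can be sketched as follows: C is pre-filled with Inf, GEMM is run with beta = 0, and any non-finite output means C was read when it should not have been. The reference GEMM and the read_c_when_beta_zero switch (which mimics the bug) are hypothetical helpers written for this example, not testsuite code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Column-major C := alpha*A*B + beta*C with A (m x k), B (k x n) and tight
// leading dimensions. read_c_when_beta_zero = true mimics a faulty kernel.
void gemm_ref(std::size_t m, std::size_t n, std::size_t k, double alpha,
              const std::vector<double>& a, const std::vector<double>& b,
              double beta, std::vector<double>& c, bool read_c_when_beta_zero)
{
    assert(a.size() == m * k && b.size() == k * n && c.size() == m * n);
    for (std::size_t j = 0; j < n; ++j) {
        for (std::size_t i = 0; i < m; ++i) {
            double acc = 0.0;
            for (std::size_t p = 0; p < k; ++p)
                acc += a[i + p * m] * b[p + j * k];
            double& cij = c[i + j * m];
            if (beta == 0.0 && !read_c_when_beta_zero)
                cij = alpha * acc;               // C is set, never read
            else
                cij = alpha * acc + beta * cij;  // 0 * Inf = NaN leaks out
        }
    }
}

// Returns true when no extreme value leaked from the initial C into the result.
bool beta_zero_output_is_clean(bool read_c_when_beta_zero)
{
    const std::size_t m = 2, n = 2, k = 2;
    std::vector<double> a(m * k, 1.0), b(k * n, 1.0);
    std::vector<double> c(m * n, std::numeric_limits<double>::infinity());
    gemm_ref(m, n, k, 1.0, a, b, 0.0, c, read_c_when_beta_zero);
    for (double v : c)
        if (!std::isfinite(v)) return false;
    return true;
}
```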
AMD-Internal: [CPUPL-4548]
Change-Id: I4a665363779d2496b8247f6357e970b7f23cd1eb
- Utilized the memory testing feature in GTestsuite
to update the testing interfaces for micro-kernel
testing of SCOPY, DCOPY and ZCOPY APIs.
Change-Id: I3d6905f33b000b8d5e60727aa896bd869f4f441f
- Added accuracy and memory tests for AVX2 and AVX512 ?SETV kernels,
AVX512 ZAXPYV kernel and AVX512 ZAXPYF kernels, with fuse-factors
2, 4 and 8.
- Cleanup of the code-section that declares and defines the reference
compute for AXPYF operation. Corrected the type mismatch with the
arguments that reference AXPYV would expect (this is used to decompose
AXPYF as part of reference). Ensured usage of GTestSuite's internal
alias for integer types.
- Updated the API level testsuite and testing interface for AXPYF,
based on the cleanup done to the reference code.
AMD-Internal: [CPUPL-4974]
Change-Id: I71de6c09d3877cd3dd1eaa20ab4f90e7c33eb1e1
Test programs for key APIs like GEMM take a long time to run,
and even to generate the list of test cases. Break into
separate test programs for different data types to enable
these to run in parallel (at gtest level). In this patch
we break up GEMM, TRSM, GEMV and TRSV.
AMD-Internal: [CPUPL-4500]
Change-Id: I21363b050d30e0402d5a1e8cbeaed2ebcc87aaeb
- Updated DOTV Gtestsuite interface to invoke C/ZDOTC when conjx='c'
and testing interface is either BLAS or CBLAS.
- Added ukr tests for bli_zdotv_zen4_asm_avx512( ... ) and
bli_zdotv_zen_int_avx512( ... ) kernels.
AMD-Internal: [CPUPL-5011]
Change-Id: I32fb69027a35d9ea92f997a095d412c8242a4b68
* Functional tests are covered for saxpyv and zaxpyv.
* As part of the functional tests, cases with large m, stride greater
than m, scalar combinations (including special cases) and zero
increments are added for saxpyv and zaxpyv.
Signed-off-by: eseswari <sangadala.eswari@amd.com>
AMD-Internal: CPUPL-4413
Change-Id: I61473357680cb0f394e6e653796ec31110895fa4
* As part of the functional tests, cases with large m, stride greater
than m, scalar combinations and zero increments are added for ?copyv.
Signed-off-by: eseswari <sangadala.eswari@amd.com>
AMD-Internal: CPUPL-4412
Change-Id: I9fa74c147975bbe21263aaf48190170c6ed0a8fd
- Previously, the system assumed 3 levels in the directory structure and
created corresponding targets.
- Now the system looks into the subdirectories of testsuite and creates
a target for each subdirectory that has at least one cpp file.
- Also deleted a directory that appeared to be a duplicate and was
breaking builds.
AMD-Internal: [CPUPL-4500]
Change-Id: I03ca362b09783f1c7c5f37ab420d8ca2c2b45e2e
Check internal value of INFO for BLAS2 and BLAS3 routines
using the bli_info_get_info_value() function added in AOCL 4.2.
If testing a BLIS library that does not have this, use
cmake ... -DCAN_TEST_INFO_VALUE=OFF
AMD-Internal: [CPUPL-4993]
Change-Id: Ida5d252b0f6727793ebfb74bb160e8cb96b61b74
- Kernel dimensions are 4x4.
- Two kernels are implemented, Right Upper and
Right Lower.
- In case of Left variants of TRSM, transpose is
induced so that Right variant kernels can be used.
- No packing is performed in these kernels.
- Changes are made in the threshold to pick ZTRSM small
code path.
- BLIS_INLINE is removed from signature of
"TRSMSMALL_KER_PROT".
- These kernels do not support "ENABLE_TRSM_PREINVERSION".
- Newly added kernels do not support conjugate
transpose.
- Added multithreading to ZTRSM small code path.
AMD-Internal: [CPUPL-4324]
Change-Id: I683b1d5239593e54f433e7f27497d72dfbd9141c
- Using a template class for the printing operator that depends
on the type.
- Use a macro to denote which interface is being tested.
AMD-Internal: [CPUPL-4500]
Change-Id: I453c4ef4842c354064f49ff32ec4bf42920cc17c
Following a recent change to the data generators to allow a stride
to be specified (60cc23f3d3), seg faults can occur if m<=0 for
column storage or n<=0 for row storage.
Prevent this by having separate code paths to handle these
scenarios.
AMD-Internal: [CPUPL-4500]
Change-Id: I23ed8b2dccaaca140e2ddfda45bcdb4c888d5708
Improve consistency in test names across different APIs.
In this commit, standardize m, n, k and b in test names.
AMD-Internal: [CPUPL-4500]
Change-Id: I53e7dd83cbf426ab1ebe8aa4af1da01594f4af23
- Updated the IIT_ERS tests for SUBV to avoid using undefined
variables. These tests are enabled only when GTestSuite is
configured for BLIS_TYPED interface testing.
- Updated an instantiator in DAXPBY accuracy tests, to avoid
a parsing error (extra comma). These tests are enabled only when
GTestSuite is configured for BLIS_TYPED interface.
AMD-Internal: [CPUPL-4500]
Change-Id: If6894daadbbc353dd66968649642ff07fa663782
First in a series of commits to improve consistency in test names
across different APIs. This will help with gtest filtering.
In this commit, standardize alpha, beta, incx and incy.
AMD-Internal: [CPUPL-4500]
Change-Id: I0cde85f9a4cf969c0b12ac589b232786ad011f09
- Updated test_gemv.h to pass the right boolean
to computediff( ... ), based on whether we run
it for exception value tests or not.
AMD-Internal: [CPUPL-4500]
Change-Id: I1ad2cde4f9b4bb1dadc32d1f7d02a90a457e218f
* Covered large sizes, scalar combinations and strides greater than the
size for cger, dger, sger and zger.
Signed-off-by: Sangadala Eswari <Sangadala.Eswari@amd.com>
AMD-Internal: CPUPL-4414
Change-Id: I6fba26a35903d1f6dbd713f19eac6bb537b3d8d2
- Changed the macro guard for accuracy tests of SIMATCOPY,
to ensure that tests are enabled/disabled based on the reference.
- Updated test_gemv.h to make sure the contents of y vector is copied
to y_ref post inducing exception values.
AMD-Internal: [CPUPL-4500]
Change-Id: I7249e643677e7e493eba5d072567615bc913a532
Add name of variable being tested in error output from
computediff functions. First step to adding (optional)
tests on input arguments.
AMD-Internal: [CPUPL-4379]
Change-Id: I9553b660bcf5ecf1dd675cb837655078933455ac
Modify thresholds to reflect number of operations that
accumulate results into each output element. Different
limits are set for early return and special cases.
Constants are still subject to experimentation and change.
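To make the idea concrete, the sketch below derives a GEMM threshold from a rough count of the floating-point operations that feed each element of C. The operation counts, the adj factor and the function name are assumptions chosen for illustration; they are not the constants used in the testsuite.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Threshold for comparing one element of C := alpha*A*B + beta*C against a
// reference, scaled by the number of operations accumulated into it.
// T is a real floating-point type; adj is a tunable safety factor.
template <typename T>
double gemm_threshold(std::size_t m, std::size_t n, std::size_t k,
                      T alpha, T beta, double adj = 1.0)
{
    assert(adj >= 1.0);
    const double eps = std::numeric_limits<T>::epsilon();

    // Early return: C must be left untouched, so results must match exactly.
    if (m == 0 || n == 0) return 0.0;

    // Special case: no product term, only C := beta*C.
    if (alpha == T(0) || k == 0) {
        if (beta == T(0) || beta == T(1)) return 0.0;  // zeroed or unchanged
        return adj * eps;                               // one multiply by beta
    }

    // General case: k multiplies and k-1 adds for the dot product, plus
    // scaling by alpha, scaling C by beta, and the final add.
    const double ops = 2.0 * static_cast<double>(k) + 2.0;
    return adj * ops * eps;
}
```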
AMD-Internal: [CPUPL-4378]
Change-Id: I81f63a36c161ff1866f2d404b9e3cbb9a2948d3a
Modify thresholds to reflect number of operations that
accumulate results into each output element. Different
limits are set for early return and special cases.
Constants are still subject to experimentation and change.
AMD-Internal: [CPUPL-4378]
Change-Id: I03cd8901e574f2e44e85ce8b0bc234e36edb4819
1. Fixed issue related to linking reference library.
2. Clean-up of how reference library variables are set.
3. Fixed compilation error related to std::max() and std::min().
AMD-Internal: [CPUPL-4879]
Change-Id: I427a4a4c0ea56a340a8bbd1a6649252e9680b937
- Added Memory Access Test support for GEMV.
- Added Extreme Value Tests for various combinations of NaN, Inf and
-Inf for ?GEMV.
- Also fixed some invalid IIT_ERS tests.
AMD-Internal: [CPUPL-4825]
Change-Id: Iee77b305f6c6b9427153fbbc5191176dae9fbfea
1. Different matrix sizes
2. Different Stride values and Scalar values
3. Added Early Return tests in new file
Signed-off-by: Harish Kumar <harish.kumar@amd.com>
AMD-Internal: [CPUPL-4417]
Change-Id: I5e645612808336e11da0c5ed8da9fe17a5543fbd
Previous commit introduced an infinite recursion problem in
generators for symmetric matrices. This was reported as a
compiler warning by gcc 12.2 but not by gcc 11.4.
AMD-Internal: [CPUPL-4862]
Change-Id: I8642b81a62f0643b5a9ebedb4fcc83b25542de1b
- Added test-cases to verify the functional behaviour
of the BLAS-extension APIs ?imatcopy_() and ?omatcopy2_().
The test-cases cover the following categories for the
supported datatypes:
- Functional and memory testing.
- Negative parameter testing with invalid inputs.
- Early return scenarios.
- Exception value testing.
- Updated functions in testinghelpers to include strides
in addition to leading-dimension, when initializing
a matrix. The default value for stride is set as 1.
- Implemented functions to load the reference symbol, based
on the choice of the reference library. The function definition
is overloaded due to different API standards being exposed by
different libraries.
- Code cleanup of files for ?OMATCOPY API.
AMD-Internal: [CPUPL-4862]
Change-Id: If63b348f517e2cde1fe48f3a195808b33a91c312
- Added overflow and underflow tests for dgemm.
These tests cause floating-point overflow and underflow by feeding
values close to DBL_MAX and DBL_MIN to the matrices:
DBL_MAX = 1.7976931348623158e+308
DBL_MIN = 2.2250738585072014e-308
When computations result in values beyond the range [DBL_MIN, DBL_MAX],
an overflow or underflow condition occurs.
Two new arguments are added to the test_gemm routine, over_under and input_range:
over_under = 0 indicates overflow
over_under = 1 indicates underflow
input_range = -1 indicates values within overflow or underflow limits
input_range = 0 indicates values very close to DBL_MIN or DBL_MAX
input_range = 1 indicates values beyond DBL_MIN or DBL_MAX
- New file: dgemm_ovr_undr.cpp
The overflow and underflow tests, dgemm_overflow and dgemm_underflow,
are called from this file. It uses the cfloat header for the
DBL_MIN and DBL_MAX values.
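The two conditions can be demonstrated on the accumulation behind a single GEMM output element. The helpers below are a hypothetical sketch and do not reproduce the test_gemm interface or its over_under and input_range arguments.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>

// Dot product of length k where every element of x is xv and every element
// of y is yv; this is what a single GEMM output element accumulates.
double dot_const(int k, double xv, double yv)
{
    assert(k > 0);
    double acc = 0.0;
    for (int p = 0; p < k; ++p) acc += xv * yv;
    return acc;
}

// Overflow: accumulating values near DBL_MAX leaves the representable range
// and the result becomes Inf.
bool overflows(int k, double v)
{
    return std::isinf(dot_const(k, v, 1.0));
}

// Underflow: a product smaller than DBL_MIN is either subnormal or flushed
// to zero, depending on the floating-point environment.
bool underflows(double v, double scale)
{
    const double p = v * scale;
    return p == 0.0 || std::fpclassify(p) == FP_SUBNORMAL;
}
```

For example, overflows(2, DBL_MAX) holds because DBL_MAX + DBL_MAX rounds to Inf, while underflows(DBL_MIN, 0.5) holds because DBL_MIN / 2 is below the smallest normal double.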
Signed-off-by: Nimmy Krishnan <nimmy.krishnan@amd.com>
AMD-Internal: [CPUPL-4492]
Change-Id: I4bbd519abacc56f322c73d6c0187ed6e1abbbf2b
- Added test-cases to verify the functional behaviour
of the BLAS-extension API ?omatcopy_(). The test-cases
cover the following categories for the supported datatypes:
- Functional and memory testing.
- Negative parameter testing with invalid inputs.
- Early return scenarios.
- Exception value testing.
- Implemented a function to load the reference symbol, based
on the choice of the reference library. The function definition
is overloaded due to different API standards being exposed by
different libraries.
AMD-Internal: [CPUPL-4810][SWLCSG-2706]
Change-Id: I8dcaeeaa36d392b752eb0685e32583a12ddc4220