- Updated generate_NAN_INF() in test_trsm.h to properly induce NaNs
and Infs for complex types.
AMD-Internal: [CPUPL-4639]
Change-Id: I4226e5c5b5f7de85eb89271551f897f87755f4f5
- Handle -0.0 separately in get_value_string()
- Avoid unused variable warning when not TEST_BLIS_TYPED in
subv_evt_testing.cpp
- Remove unused variables in dgemm_ukernel.cpp
- Remove unnecessary local copies of greenzone1 in test
programs now that greenzone_1 and greenzone_2 will
not overlap.
- Protect tests of haswell kernels by ifdef on
BLIS_KERNELS_HASWELL rather than BLIS_KERNELS_ZEN.
- Added GTEST_ALLOW_UNINSTANTIATED_PARAMETERIZED_TEST
statements in TRSM kernel tests.
- Correct descriptions of trsm and trmm operations.
- Correct typos.
AMD-Internal: [CPUPL-4500]
Change-Id: If8520347e417785e6aa953a0c8a65d4f5f3c1591
- Added API tests
- Added Invalid input test cases (IIT).
- Added memory testing for SWAPV API.
- Added micro kernel testing for single and double precision
- Added reference swapv functionality in testinghelpers
- Added binary comparison method for two vectors with different
increments in check_error.h
AMD-Internal: [CPUPL-4814]
Change-Id: I32bcca51b4e998d51ede70869035da76a7f6dbca
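The reference swapv and the binary comparison of two vectors with different increments could look roughly like the sketch below. It is not the code in testinghelpers or check_error.h: the names `ref_swapv` and `binary_equal` are hypothetical, and only positive increments are handled for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

// Hypothetical reference swapv: exchanges the n logical elements of x and y.
// Assumption: incx > 0 and incy > 0.
template <typename T>
void ref_swapv(std::size_t n, T* x, std::size_t incx, T* y, std::size_t incy) {
    for (std::size_t i = 0; i < n; ++i)
        std::swap(x[i * incx], y[i * incy]);
}

// Hypothetical bitwise comparison of the n logical elements of two vectors
// that may be stored with different increments.
template <typename T>
bool binary_equal(std::size_t n, const T* x, std::size_t incx,
                  const T* y, std::size_t incy) {
    for (std::size_t i = 0; i < n; ++i) {
        // memcmp compares the raw bytes, so 0.0 and -0.0 are treated as different.
        if (std::memcmp(&x[i * incx], &y[i * incy], sizeof(T)) != 0)
            return false;
    }
    return true;
}
```

A bitwise check fits swapv because the operation only moves data, so no rounding tolerance is needed.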
> 1. Ranges with small and average sizes
> 2. Values with different orders
> 3. Negative tests added with negative range of values and stride values.
> 4. Added Early Return tests.
> Signed-off by: Harish Kumar<harish.kumar@amd.com>
AMD-Internal: [CPUPL-4419]
Change-Id: Iaadc0f3104c237d3fb6ccf2c2b398b30edcd1ee4
- Added unit-test cases for accuracy and memory-testing of the
following kernels :
- bli_samaxv_zen_int( ... ) and bli_samaxv_zen_int_avx512( ... )
- bli_damaxv_zen_int( ... ) and bli_damaxv_zen_int_avx512( ... )
- Added test cases to verify the compliance of SAMAXV and DAMAXV
APIs, through Exception Value Testing(EVT). This is done by
inducing exception values in the input vector(at two places).
The induction is controlled by the user, through indices given as
part of the parameterized test-cases.
AMD-Internal: [CPUPL-4660][CPUPL-4661]
Change-Id: I25b7d54487fa9fb6a30ac13563d1497af8b582ab
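Inducing exception values at user-chosen indices, as described for the AMAXV EVT cases above, can be sketched as follows. The helper name `induce_exval` and its signature are hypothetical and are not taken from the gtestsuite sources. A positive increment is assumed.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper: overwrite logical element idx of a strided vector
// with an exception value (NaN, +Inf or -Inf). Assumption: incx > 0.
template <typename T>
void induce_exval(std::vector<T>& x, std::size_t incx, std::size_t idx, T exval) {
    assert(idx * incx < x.size());
    x[idx * incx] = exval;
}
```

A parameterized test would call this twice, once per index supplied in the test parameters, before invoking the API under test.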
- Added API tests for DGEMMT.
- Added Extreme Value Test cases (EVT) for DGEMMT.
- Tests for various combinations of INFs
and NANs for A and B matrix are added.
- Added Invalid input test cases (IIT).
- Added memory testing for DGEMMT API.
AMD-Internal: [CPUPL-4724]
Change-Id: Ib40802ea49417b4a4883831c2d971e59a2e093e5
Description:
1. Replaced aligned load intrinsics _mm512_load_ps
with unaligned load intrinsics _mm512_loadu_ps.
2. There is no guarantee that the memory address
is aligned everywhere. The change is in the beta
multiplication path, where the aligned loads were a
copy-paste error.
Change-Id: I978231b556e17ad7e66c5028ed1cd904c653e0a8
- Added API tests for ZTRSV.
- Added Extreme Value Test cases (EVT) for ZTRSV.
- Tests for various combinations of INFs
and NANs for X vector and B matrix are added.
- Added memory testing for ZTRSV
API.
AMD-Internal: [CPUPL-4716]
Change-Id: I0291acaafa78073979c307a4cc9595d429229c0c
- Added API tests for DTRSV.
- Added Extreme Value Test cases (EVT) for DTRSV.
- Tests for various combinations of INFs
and NANs for X vector and B matrix are added.
- Added Invalid input test cases (IIT).
- Added memory testing for DTRSV
kernels.
- Fixed a bug in alphax function where scaling
of a vector with a scalar was not handled
correctly when incx was negative.
AMD-Internal: [CPUPL-4715]
Change-Id: I84c873e98f845e05b11860e7ef6083d1184489b4
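The negative-increment case fixed in alphax above can be illustrated with the sketch below. It assumes the usual BLAS convention that, for incx < 0, the first logical element is stored at offset (1 - n) * incx. The function name `scale_strided` is hypothetical and this is not the testinghelpers code.

```cpp
#include <cassert>

// Hypothetical sketch: x := alpha * x over n logical elements with stride incx.
template <typename T>
void scale_strided(long n, T alpha, T* x, long incx) {
    if (n <= 0) return;
    // Assumed BLAS convention: with a negative increment the traversal starts
    // at the far end of the buffer and walks back towards x[0].
    long ix = (incx >= 0) ? 0 : (1 - n) * incx;
    for (long i = 0; i < n; ++i, ix += incx)
        x[ix] *= alpha;
}
```

Because scaling is element-wise, the result matches a traversal with |incx|. The point of the start offset is that no memory before x[0] is ever accessed.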
- Updating gemm/cgemm_ukernel.cpp to cast integers so that gtestsuite works for ILP64.
- Updating BLIS cmake presets to be conditional on Windows and Linux.
- Updating GTestSuite cmake system to use environment variable to set BLIS_PATH and reference library.
- Add more cmake presets options in gtestsuite.
- Updating printing functionality for vectors and matrices.
- Adding macro definition checks so that GTestSuite builds successfully for shared libraries on zen3.
- Casting integers so that code builds for ILP64.
AMD-Internal: [CPUPL-4500]
Change-Id: I03afd08d5ad8ae50193d9559cf4ab8fc1d08753c
Modifications to testinghelpers::get_value_string() to allow
floating point values (e.g. for alpha and beta) to be used in
generating test names. Values will be generated in the form
1p3 or m2p4, or 3p0_4p5i for complex data. One decimal place
is currently enabled but this can be increased if needed. This
helps prevent duplicate test name errors when the list of values
for alpha or beta includes e.g. 1.0 and 1.3.
Also add support in testinghelpers::get_value_string() for
variables of type gtint_t.
AMD-Internal: [CPUPL-4500]
Change-Id: Icc8ca3c3cfacd7d46fffefee5a6e05452f704d4e
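The naming scheme described above (1p3, m2p4, 3p0_4p5i) can be sketched as follows. This is not the implementation of testinghelpers::get_value_string(). The function names are hypothetical, and using the sign bit to decide on the "m" prefix (so that -0.0 becomes m0p0) is an assumption of this sketch.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstdio>
#include <string>

// Hypothetical sketch: format a real value for use in a test name,
// e.g. 1.3 -> "1p3" and -2.4 -> "m2p4".
std::string real_to_name(double v, int decimals = 1) {
    // Assumption: the sign bit selects the "m" prefix.
    std::string out = std::signbit(v) ? "m" : "";
    char buf[64];
    std::snprintf(buf, sizeof buf, "%.*f", decimals, std::fabs(v));
    for (char* p = buf; *p != '\0'; ++p)
        if (*p == '.') *p = 'p';  // '.' is not allowed in gtest test names
    return out + buf;
}

// Complex values are joined as <real>_<imag>i.
template <typename T>
std::string complex_to_name(const std::complex<T>& z) {
    return real_to_name(static_cast<double>(z.real())) + "_" +
           real_to_name(static_cast<double>(z.imag())) + "i";
}
```

With one decimal place, 1.0 and 1.3 map to the distinct names 1p0 and 1p3, which is what avoids the duplicate test name errors.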
- Added BUILD_STATIC_LIBS option which is on by default, only on Linux.
- Added TEST_WITH_SHARED option which is off by default, only on Linux.
- If only shared or static lib is being built, that's the one that will be used for testing.
- If both are being built, TEST_WITH_SHARED determines which library will be used for testing.
- Set linux workflows so that they build both static and shared libs, and use linux-static and linux-shared to denote which one should be used for testing.
- Set -fPIC for both static and shared builds to fix issues faced when building blis using AOCC 4.0.0 and gtestsuite using gcc 9.4.0.
AMD-Internal: [CPUPL-2748]
Change-Id: I4227bab97ff31ecddfe218e18499f33b4e4ee63e
- Added Functional Tests, Early Return Scenarios, Invalid Input Tests
and Extreme Value Tests for S/D/C/ZGER.
- EVTs are added for the sake of sanity since GER is primarily utilizing
the AXPYV kernel.
AMD-Internal: [CPUPL-4758]
Change-Id: I12db0ba952eeb97ab167656ab5fd614e56437154
CMakeLists.txt is updated to support aocl_gemm on Windows.
On Windows, the BLIS library (blis + aocl_gemm) builds successfully
only with the AOCC compiler (Clang has an issue with optimizing
VNNI instructions).
$cmake .. -DENABLE_ADDON="aocl_gemm" ....
AMD-Internal: [CPUPL-2748]
Change-Id: I9620878ab6934233fadc9ddc5d5e82ad85be9209
Updated compiler id in cmake related files from
CMAKE_CXX_COMPILER_ID to CMAKE_C_COMPILER_ID
AMD-Internal: [CPUPL-2748]
Change-Id: Ib0e2a2e3ec8fafeb423fe56b9842a93db0115371
CGEMM:
API: Functional testing of CGEMM
Covers different matrix sizes
Hence it covers SUP and Native code path
EVT: Insertion of Exception values like NAN, +/-INF in Matrix
EV is inserted in user provided indices of in/out Matrices
EV is passed as alpha and beta values
Expectation is that the output should be compliant with the standard output
MEM: To check for out of bound read or write through protected pages
ZGEMM:
- Updated EVT tests for special case for alpha, beta when
imaginary component is 0
- Updated SUP & Native method to support C/Z datatype
AMD-Internal: [CPUPL-4712]
Change-Id: If8ba99998e0a494375a764bb7756d45147388965
- dotxf is a BLIS-specific kernel that performs the dotxv
operation on multiple vectors at once (the fuse factor) to
speed up the operation.
- A dotxf reference function is therefore implemented for
gtestsuite, in which the dotxf result is compared against the
result of looping over the dotxv function.
AMD-Internal: [CPUPL-4764]
Change-Id: I342dab066ceb1710649e54bb73afc5a23e2a8177
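A dotxf reference built by looping over dotxv, as described above, might look like the sketch below. It assumes real types, unit strides, column-major storage and no conjugation, and it does not special-case beta == 0. The function names and signatures are hypothetical and are not the testinghelpers API.

```cpp
#include <cassert>
#include <cstddef>

// rho := beta * rho + alpha * x^T y
template <typename T>
void ref_dotxv(std::size_t m, T alpha, const T* x, const T* y, T beta, T* rho) {
    T acc = T(0);
    for (std::size_t i = 0; i < m; ++i)
        acc += x[i] * y[i];
    *rho = beta * (*rho) + alpha * acc;
}

// y := beta * y + alpha * A^T x, where A is m x b (b is the fuse factor),
// stored column-major with leading dimension lda.
template <typename T>
void ref_dotxf(std::size_t m, std::size_t b, T alpha, const T* a,
               std::size_t lda, const T* x, T beta, T* y) {
    for (std::size_t j = 0; j < b; ++j)
        ref_dotxv(m, alpha, a + j * lda, x, beta, &y[j]);  // one dotxv per column
}
```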
- Testcases with exception values such as nan and +/-inf.
- Randomly inserting nan, +/- inf in A,B or C matrix along with
alpha and beta with extreme values
AMD-Internal: [CPUPL-4681]
Change-Id: Ia92bcdb4519e9a0e4c6026e93b5e2e2f0e19b065
- axpyf is a BLIS-specific kernel that performs the axpy
operation on multiple vectors at once (the fuse factor) to
speed up the operation.
- An axpyf reference function is therefore implemented for
gtestsuite, in which the axpyf result is compared against the
result of looping over the axpy function.
AMD-Internal: [CPUPL-4763]
Change-Id: I4713fd0b0d9e9cf688c9aaa82ac0e6ae07a05989
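An axpyf reference built by looping over axpy, as described above, might look like the sketch below. It assumes real types, unit strides, column-major storage and no conjugation. The function names and signatures are hypothetical and are not the testinghelpers API.

```cpp
#include <cassert>
#include <cstddef>

// y := y + alpha * x
template <typename T>
void ref_axpyv(std::size_t m, T alpha, const T* x, T* y) {
    for (std::size_t i = 0; i < m; ++i)
        y[i] += alpha * x[i];
}

// y := y + alpha * A * x, where A is m x b (b is the fuse factor),
// stored column-major with leading dimension lda.
template <typename T>
void ref_axpyf(std::size_t m, std::size_t b, T alpha, const T* a,
               std::size_t lda, const T* x, T* y) {
    for (std::size_t j = 0; j < b; ++j)
        ref_axpyv(m, alpha * x[j], a + j * lda, y);  // one axpy per column of A
}
```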
Details:
- Added new folder named JIT/ under addon/aocl_gemm/. This folder
will contain all the JIT related code.
- Modified lpgemm_cntx_init code to generate main and fringe kernels
for the 6x64 bf16 microkernel and store function pointers to all the
generated kernels in a global function pointer array. This happens
only when the gcc version is < 11.2.
- When the gcc version is < 11.2, the microkernel uses JIT-generated
kernels. Otherwise, it uses the intrinsics-based implementation.
AMD-Internal: [SWLCSG-2622]
Change-Id: I16256c797b2546a8cd2049680001947346260461
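The gcc version condition above can be expressed as a compile-time check along the following lines. This is only a sketch: the names `older_than`, `kUseJitKernels` and `bf16_kernel_path` are hypothetical, and how lpgemm_cntx_init actually makes the decision is not shown in this log. Treating non-gcc compilers as taking the intrinsics path is likewise an assumption.

```cpp
#include <cassert>

// True when version (major, minor) is older than (ref_major, ref_minor).
constexpr bool older_than(int major, int minor, int ref_major, int ref_minor) {
    return major < ref_major || (major == ref_major && minor < ref_minor);
}

#if defined(__GNUC__) && !defined(__clang__)
// gcc older than 11.2: fall back to JIT-generated kernels.
constexpr bool kUseJitKernels = older_than(__GNUC__, __GNUC_MINOR__, 11, 2);
#else
// Assumption: other compilers use the intrinsics-based implementation.
constexpr bool kUseJitKernels = false;
#endif

// Hypothetical selector reporting which implementation would be used.
const char* bf16_kernel_path() {
    return kUseJitKernels ? "jit" : "intrinsics";
}
```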
- Added unit-test cases for verifying the accuracy of
bli_zaxpbyv_zen_int( ... ) kernel.
- The test cases cover the necessary range of values for the sizes
and the scaling factors(alpha and beta), to ensure code-coverage
and check for compliance with the standard.
- Added memory tests for these kernels, to check for
out-of-bounds reads/writes.
- Further updated the test-cases for exception value testing(EVT)
of ZAXPBY API. These test-cases verify the compliance against the
standard and help in determining whether the exception value has to
be propagated, or handled separately.
AMD-Internal: [CPUPL-4698]
Change-Id: If3c470c051f94393be3a1d444ed424f626ae6f5f
- Updated SCALV test template to handle mixed-precision datatypes.
- These tests explicitly induce NaNs and (+/-)Infs in the input vector
to verify the handling or propagation of NaNs and Infs according to
the compliance.
AMD-Internal: [CPUPL-4710]
Change-Id: Iab4b671677542f1137631060dc0592086acf874c
- Utilized the memory testing feature in gtestsuite to add memory tests
for D/Z/ZDSCALV kernels.
- Updated the test fixtures, loggers and instantiators to use the new
testing interface for memory testing.
AMD-Internal: [CPUPL-4700]
Change-Id: I13cad2271198423e7b0d361f6a5cccdc8b401183
- Utilized the memory testing feature in gtestsuite to add memory tests
for DDOTV micro-kernels.
- Updated the test fixtures, loggers and instantiators to use the new
testing interface for memory testing.
- Use --gtest_filter="*mem_test_disabled*" to disable memory tests or
--gtest_filter="*mem_test_enabled" to run only memory tests.
AMD-Internal: [CPUPL-4406]
Change-Id: I887a89f33ca43e504479702263b6c66ddd7937de
- Updated the existing benchmarking file for SCALV API, to include
support to call the BLAS and CBLAS mixed-precision SCALV, namely
cblas_csscalv(), csscalv_(), cblas_zdscalv(), zdscalv_().
- The input is expected to be given with the datatype 'ZD' and 'CS'
in order to benchmark the associated mixed-precision APIs.
AMD-Internal: [CPUPL-4722]
Change-Id: I4ab0fb19fe1949468cf707d0a857e8a1681addeb
Description
1. In the mr0=1 case, the accumulator register and operand
registers of an fma instruction were swapped. Corrected
this copy-paste error.
2. Removed the fill-array call for c_ref in bench_lpgemm.c and
used memcpy from the c buffer instead. The fill-array routine
now uses rand() to initialize data, so c_ref and c could get
different contents when filled separately. This worked
previously because the data was fixed (i=0 ... i%5).
Change-Id: Ia513331ba49d28adc7bcdc0ec78d443abe66780b
- Added test cases to verify the compliance of ?SUBV APIs,
through Exception Value Testing(EVT). This is done by
inducing exception values in the input operands. The induction
is controlled by the user, through indices given as part of the
parameterized test-cases.
- Various combinations of zeros, NaNs and +/-Infs have been used to
verify the compliance against the standard.
Change-Id: If7ce582f2d0ab92acaf02215126f6e4caff3af8d
CMakeLists.txt is updated to support ASAN for finding
memory-related errors in the BLIS library. ASAN is enabled
by configuring cmake with the following option:
$ cmake .. -DENABLE_ASAN=ON
ASAN is supported only on Linux with the clang compiler.
The default redzone size is 16 bytes and the maximum
redzone size is 2048 bytes.
$ ASAN_OPTIONS=redzone=2048 <exe>
AMD-Internal: [CPUPL-2748]
Change-Id: I0b70af5c41cf5c68602150daeb67d7432bbe5cb8
- Updated existing ERS and IIT test framework in SCALV to handle mixed
precision types (CSSCAL/ZDSCAL).
AMD-Internal: [CPUPL-4673]
Change-Id: I72399675e4e5b8a3e16d81d747db73a3c88ce1ef
- Added micro-kernel and API level tests for avx512 and avx2 small, sup
and native SGEMM kernels for various values of storage,
M, N, K, alpha, beta
- Added memory testing for sgemm kernels
AMD-Internal: [CPUPL-4681]
Change-Id: I72f94960e7c497ae75da872412eee69c23637348
1. The 5 LOOP LPGEMM path is inefficient when A or B is a vector
(i.e., m == 1 or n == 1).
2. An efficient implementation of lpgemv_rowvar_f32 is developed
that takes into account the B matrix reorder for m = 1 and post-ops
fusion.
3. When m = 1, the algorithm divides the GEMM workload in the n
dimension at a granularity of NR. Each thread works on A:1xk
B:kx(>=NR) and produces C = 1x(>=NR). K is unrolled by 4, with a
remainder loop.
4. When n = 1, the algorithm divides the GEMM workload in the m
dimension at a granularity of MR. Each thread works on A:(>=MR)xk
B:kx1 and produces C = (>=MR)x1. When n = 1, reordering of B is
avoided so that it can be processed efficiently in the n = 1 kernel.
5. Fixed a few warnings raised when loading 2 f32 bias elements with
_mm_load_sd through a float pointer, by typecasting the pointer to
(const double *).
AMD-Internal: [SWLCSG-2391, SWLCSG-2353]
Change-Id: If1d0b8d59e0278f5f16b499de1d629e63da5b599
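Dividing the n dimension among threads at a granularity of NR, as described for the m = 1 case above, can be sketched as follows. This is not the lpgemv partitioning code. The function `partition_n` and its even block distribution are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical sketch: returns the half-open column range [start, end)
// assigned to thread tid when n columns are split in units of nr.
std::pair<std::size_t, std::size_t>
partition_n(std::size_t n, std::size_t nr, std::size_t nthreads, std::size_t tid) {
    assert(nr > 0 && nthreads > 0 && tid < nthreads);
    const std::size_t blocks = (n + nr - 1) / nr;  // NR-wide panels, last may be partial
    const std::size_t base = blocks / nthreads;    // panels every thread gets
    const std::size_t extra = blocks % nthreads;   // first `extra` threads get one more
    const std::size_t first = tid * base + std::min(tid, extra);
    const std::size_t count = base + (tid < extra ? 1 : 0);
    const std::size_t start = std::min(first * nr, n);
    const std::size_t end = std::min((first + count) * nr, n);
    return {start, end};
}
```

Only the thread that owns the last panel can receive fewer than NR columns. Threads with no panel get an empty range.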
- Added unit-test cases for the following AVX2 kernels:
- bli_snorm2fv_unb_var1_avx2( ... )
- bli_scnorm2fv_unb_var1_avx2( ... )
- bli_dnorm2fv_unb_var1_avx2( ... )
- bli_dznorm2fv_unb_var1_avx2( ... )
- Defined a templatized testing interface and function-pointer
type. This is used as part of the test-fixture class and
testsuite definitions, when writing the unit tests.
- The test cases cover the necessary range of values for the sizes
to ensure code-coverage in the kernels.
- Further added memory tests for these kernels, to check for
out-of-bounds reads/writes.
AMD-Internal: [CPUPL-4637]
Change-Id: I747ab104b947e87b5f8eda597256b7b8b6f7c2f2
- Added API tests for [C\Z]TRSM.
- Added Extreme Value Test cases (EVT) for [C\Z]TRSM.
- Tests for various combinations of INFs
and NANs in A and B matrix are added.
- Added Invalid input test cases (IIT).
- Added micro kernel testing for ZTRSM
- Added unit tests for small and native
path kernels.
- Added memory testing for ZTRSM
kernels.
AMD-Internal: [CPUPL-4641]
Change-Id: I0db6b2c75b59821e1cde33532fb13400fab43412
- Added API tests for STRSM.
- Added Extreme Value Test cases (EVT) for STRSM.
- Tests for various combinations of (+/-) INFs
and NANs in A and B matrix are added.
- Added micro kernel testing
- Added unit tests for small and native
path kernels.
- Added memory testing for STRSM
kernels.
- Edited the protected buffer in memory testing to
make sure that greenzone1 and greenzone2 do not
intersect.
AMD-Internal: [CPUPL-4640]
Change-Id: Ic48590d3b4ad12c4f2f6beaec2e1106a7aaa5213
While building the BLIS library using the ninja generator on Windows,
ninja was observed to randomly add "|| '(set', 'FAIL_LINE=3&', 'goto', ':ABORT)'"
as extra arguments to add_custom_command. Because of this, the
flatten-headers python script failed to create the blis.h and cblas.h
headers. Modified the python script to fix this issue.
AMD-Internal: [CPUPL-2748]
Change-Id: I83b753d08e46f94b282176fcc661ce34e5eee3cf
- Updated test_scalv and ref_scalv templates for SCALV gtestsuite to
support unit-tests for mixed precision SCALV.
- Added unit-tests for the following kernels:
ZSCALV
- bli_zscalv_zen_int( ... )
ZDSCALV
- bli_zdscalv_zen_int10( ... )
- bli_zdscalv_zen_int_avx512( ... )
- Also, added API level unit-tests for the following cases:
- Unit Positive Increments
- Non-Unit Positive Increments
- Updated comments in DSCALV unit-tests with the correct kernel name.
AMD-Internal: [CPUPL-4624]
Change-Id: I96db8d3612687be07cd0e638a3119d41c3641ce8
- Added test cases to verify the compliance of SAXPY and ZAXPY
APIs, through Exception Value Testing(EVT). This is done by
inducing exception values in the input operands. The induction
is controlled by the user, through indices given as part of the
parameterized test-cases.
- Various combinations of zeros, NaNs and +/-Infs have been used to
verify the compliance against the standard. These combinations
help in determining whether the exception value has to be
propagated, or handled separately.
- Updated the comments, class names and test-case loggers for
uniformity.
- Added special cases of alpha and beta values to API level
functionality tests, to check for any possible framework
level optimizations against the standard.
AMD-Internal: [CPUPL-4655]
Change-Id: I3d817d44c6d239cbc61d146583707b3c8338de29
Modify thresholds to reflect number of operations that
accumulate results into each output element. Different
limits are set for early return and special cases.
Constants are still subject to experimentation and change.
AMD-Internal: [CPUPL-4378]
Change-Id: Ic4540a2f1f6cd6380228b6a2884ac62850d6d8c6
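A threshold that scales with the number of operations accumulated into each output element could be computed along these lines. The operation count (2k + 3 for a GEMM-like update) and the `safety` multiplier are assumptions made for illustration. They are not the constants chosen in this change, which remain subject to experimentation.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical sketch: error threshold for an output element that
// accumulates k products, e.g. c := beta * c + alpha * sum(a_i * b_i).
template <typename T>
double accumulation_threshold(std::size_t k, double safety = 1.0) {
    const double eps = std::numeric_limits<T>::epsilon();
    // Assumed count: k multiplies and k adds for the sum, plus three
    // operations for the alpha and beta scaling and the final add.
    const double ops = 2.0 * static_cast<double>(k) + 3.0;
    return safety * ops * eps;
}
```

Early-return and special cases would then use their own, smaller limits instead of this general formula.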
- Testing out-of-bounds reads and writes of the input and output
matrices for SUP and Native micro kernels
- The protected buffers and memory testing feature available in
gtestsuite are used to detect memory errors
AMD-Internal: [CPUPL-4623]
Change-Id: I620fd3cd4eed1002e08b6233effb89b47beb073f
- Added unit-test cases for bli_zaxpyv_zen_int5( ... ),
bli_saxpyv_zen_int10( ... ) and bli_saxpyv_zen_int_avx512( ... )
kernels.
- The test cases cover the necessary range of values for the sizes
and the scaling factor(alpha), to ensure code-coverage and check
for compliance with the standard.
- Further added memory tests for these kernels, to check for
out-of-bounds reads/writes.
AMD-Internal: [CPUPL-4629]
Change-Id: If5e626ca2d0270e34dc2d951ae5c81f839a78ef0
Added AVX512 compiler flags in make_defs.mk and make_defs.cmake for
gcc versions greater than or equal to 7.0. The AVX512VNNI compiler flag
is only supported from gcc version 8 onwards, so another else condition
was added that enables the avx512 flags for gcc versions greater than
or equal to 7. This enables compilation of the AVX512 assembly code
paths with gcc version 7.5.
Change-Id: I2cda00e578010db5e5a515b506c0b99f685307e0
- These tests explicitly include NaNs and (+/-)Infs in the input vector
to verify the handling or propagation of NaNs and Infs according to
the compliance.
AMD-Internal: [CPUPL-4406]
Change-Id: I3063805eb3fdfd58be3168b24cdb97de2c175c3c
- Utilized the memory testing feature in GTestsuite
to update the testing interfaces for micro-kernel
testing of DAXPY, DAXPBY and DCOPY APIs.
- The interface allocates memory using objects of
ProtectedBuffer class, which define the redzones
and greenzones as per the requirement.
- Updated the test fixture classes, test-case loggers and
the instantiators to use the new testing interface for
memory testing.
- Added special cases of alpha and beta values to API
level functionality tests, to check for any possible
framework level optimizations against the standard.
- Code cleanup of ?_generic.cpp and ?_evt_testing.cpp
files of DAXPY, DAXPBY and DCOPY APIs.
AMD-Internal: [CPUPL-4402]
Change-Id: Id945cabbbb42604d76a9e34269bff0f9f6712604
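The idea of surrounding a usable region with guard regions can be illustrated with the portable sketch below. Note that it swaps in a different technique from the one used by gtestsuite: the ProtectedBuffer class relies on redzones and greenzones, whereas this sketch fills guard elements with a sentinel value and checks them afterwards. It can therefore detect out-of-bounds writes only, not reads. The class name `GuardedBuffer` and its interface are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical sentinel-based stand-in for a protected buffer.
class GuardedBuffer {
public:
    GuardedBuffer(std::size_t n, std::size_t guard)
        : n_(n), guard_(guard), storage_(n + 2 * guard, sentinel()) {
        // Zero the usable region; the guard elements keep the sentinel.
        for (std::size_t i = 0; i < n_; ++i) storage_[guard_ + i] = 0.0;
    }
    double* data() { return storage_.data() + guard_; }
    std::size_t size() const { return n_; }
    // True while no guard element before or after the usable region changed.
    bool intact() const {
        const double s = sentinel();
        for (std::size_t i = 0; i < guard_; ++i) {
            if (std::memcmp(&storage_[i], &s, sizeof s) != 0) return false;
            if (std::memcmp(&storage_[guard_ + n_ + i], &s, sizeof s) != 0) return false;
        }
        return true;
    }
private:
    static double sentinel() { return -1.2345e300; }
    std::size_t n_;
    std::size_t guard_;
    std::vector<double> storage_;
};
```

A test would pass `data()` to the kernel under test and assert `intact()` after the call.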