31 Commits

Author SHA1 Message Date
Edward Smyth
9500cbee63 Code cleanup: spelling corrections
Corrections for some spelling mistakes in comments.

AMD-Internal: [CPUPL-3519]
Change-Id: I9a82518cde6476bc77fc3861a4b9f8729c6380ba
2023-11-09 00:16:30 -05:00
Arnav Sharma
dd1cf23090 Gtestsuite Update for Pack and Compute Extension APIs
- Pack and compute are now compared against GEMM operation of reference
  library when MKL is not used as a reference.
- For the case where both A and B are unpacked, the reference GEMM is
  invoked with a unit-alpha scalar.
- If MKL is used as reference, then these APIs are compared against pack
  and compute operations of MKL.
- Updated description in ref_gemm_compute.cpp to reflect this behavior.

AMD-Internal: [CPUPL-4084]
Change-Id: Id0521c9cad8743a7ae471a7f3c547ceb67191f86
2023-11-03 09:45:42 -04:00
Arnav Sharma
44dfc7a515 Fix for gemm_compute BLAS Check
- BLAS compute checks updated to properly check for rs_c and cs_c.
- Updated BLAS compute checks to skip validity check if m==1 or n==1.
  For the same reason, added a check just before to validate rs_c and
  cs_c are greater than or equal to 1.
- Added tiny size tests to gtestsuite as a sanity check.
- Also updated the Invalid Input Tests to test for the updated checks.

AMD-Internal: [CPUPL-4140]
Change-Id: I984339ec7909778b58409ffcdbeed4ee33f28cfb
2023-11-03 09:41:16 -04:00
Vignesh Balasubramanian
84faccdd7d Enabling the vectorized path for SNRM2_
- Enabled the vectorized AVX-2 code-path for SNRM2_. The
  framework queries the architecture ID and calls the
  vectorized kernel based on the architecture support.

- If the architecture lacks the required support, the default
  path based on the sumsqv method is used.

AMD-Internal: [CPUPL-3277]
Change-Id: Ic60c0782dec0b7eb09fac21818eb625e57b1d14f
2023-11-03 17:45:56 +05:30
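The dispatch described in commit 84faccdd7d can be sketched as follows. This is a minimal illustration, not the BLIS implementation: the `ArchId` enumeration, the kernel names, and the mapping of architectures to the vectorized path are all assumptions, and the "vectorized" kernel is a plain C++ stand-in without intrinsics.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical architecture IDs; the real framework defines its own.
enum class ArchId { Generic, Zen, Zen2, Zen3, Zen4 };

using Snrm2Kernel = float (*)(const float* x, std::size_t n);

// Stand-in for the default path: a scaled sum of squares, in the spirit of
// a sumsqv-style method (scale tracks the largest |x_i| seen so far).
float snrm2_default(const float* x, std::size_t n) {
    assert(x != nullptr || n == 0);
    float scale = 0.0f, ssq = 1.0f;
    for (std::size_t i = 0; i < n; ++i) {
        const float a = std::fabs(x[i]);
        if (a == 0.0f) continue;
        if (scale < a) {
            ssq = 1.0f + ssq * (scale / a) * (scale / a);
            scale = a;
        } else {
            ssq += (a / scale) * (a / scale);
        }
    }
    return scale * std::sqrt(ssq);
}

// Stand-in for the vectorized kernel (plain C++ here, no AVX2 intrinsics).
float snrm2_avx2_stub(const float* x, std::size_t n) {
    assert(x != nullptr || n == 0);
    double acc = 0.0;
    for (std::size_t i = 0; i < n; ++i) acc += double(x[i]) * double(x[i]);
    return static_cast<float>(std::sqrt(acc));
}

// Pick the kernel from the queried architecture ID (assumed mapping).
Snrm2Kernel select_snrm2_kernel(ArchId id) {
    switch (id) {
        case ArchId::Zen:
        case ArchId::Zen2:
        case ArchId::Zen3:
        case ArchId::Zen4:
            return snrm2_avx2_stub;
        default:
            return snrm2_default;
    }
}
```

A caller would query the architecture once and then invoke the returned function pointer, so the selection cost is not paid inside the kernel itself.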
Arnav Sharma
c1612f6838 Gtestsuite Framework and Unit Tests for Pack and Compute Extension APIs
- Added framework for unit testing of BLAS and CBLAS interfaces for the
  Pack and Compute Extension APIs.
- These test the integrated functionality of the trio of
  ?gemm_pack_get_size(), ?gemm_pack() and ?gemm_compute() APIs.
- Note: Only MKL can be used as reference for now.

AMD-Internal: [CPUPL-3560]
Change-Id: I801654447a716da06c9ccf9db01d553817871571
2023-10-16 09:35:42 -04:00
Vignesh Balasubramanian
81161066e5 Multithreading the DNRM2 and DZNRM2 API
- Updated the bli_dnormfv_unb_var1( ... ) and
  bli_znormfv_unb_var1( ... ) function to support
  multithreaded calls to the respective computational
  kernels, if and when the OpenMP support is enabled.

- Added logic to distribute the work among the threads such
  that only one thread has to deal with the fringe case (if required).
  The remaining threads will execute only the AVX-2 code section
  of the computational kernel.

- Added reduction logic after the parallel region, to handle overflow
  and/or underflow conditions as per the mandate. The reduction
  for both APIs involves calling the vectorized kernel of the
  dnormfv operation.

- Changed the kernel to keep the scaling factors and thresholds
  pre-broadcast in registers, instead of broadcasting them
  every time they are needed.

- Non-unit stride cases are packed to be redirected to the
  vectorized implementation. In case the packing fails, the
  input is handled by the fringe case loop in the kernel.

- Added the SSE implementation in the bli_dnorm2fv_unb_var1_avx2( ... )
  and bli_dznorm2fv_unb_var1_avx2( ... ) kernels, to handle fringe
  cases of size 2, and of size 1 or non-unit strides, respectively.

AMD-Internal: [CPUPL-3916][CPUPL-3633]
Change-Id: Ib9131568d4c048b7e5f2b82526145622a5e8f93d
2023-10-16 07:26:27 -04:00
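The work distribution and reduction described in commit 81161066e5 can be sketched as below. This is an illustration under stated assumptions: the `block` parameter (elements handled per vectorized iteration), the function names, and the scalar reduction are hypothetical; the commit itself states that the reduction calls the vectorized dnormfv kernel.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Range {
    std::size_t start;
    std::size_t len;
};

// Split n elements among nt workers so that every worker except the last
// receives a multiple of `block` elements; only the last one sees a fringe.
std::vector<Range> partition_blocks(std::size_t n, std::size_t nt, std::size_t block) {
    assert(nt > 0 && block > 0);
    const std::size_t nblocks = n / block;   // number of full blocks
    const std::size_t base = nblocks / nt;   // full blocks per worker
    const std::size_t extra = nblocks % nt;  // first `extra` workers get one more
    std::vector<Range> out;
    std::size_t start = 0;
    for (std::size_t t = 0; t < nt; ++t) {
        std::size_t len = (base + (t < extra ? 1 : 0)) * block;
        if (t == nt - 1) len = n - start;    // last worker absorbs the fringe
        out.push_back({start, len});
        start += len;
    }
    return out;
}

// Combine per-thread partial norms as sqrt(sum p_i^2), scaling by the largest
// partial so that squaring cannot overflow for finite inputs.
double combine_partial_norms(const std::vector<double>& partial) {
    double m = 0.0;
    for (double p : partial) m = std::fmax(m, std::fabs(p));
    if (m == 0.0) return 0.0;
    double sum = 0.0;
    for (double p : partial) sum += (p / m) * (p / m);
    return m * std::sqrt(sum);
}
```

With n = 103, four workers, and a block of 4, the ranges have lengths 28, 24, 24, and 27, so only the last worker handles the three leftover elements.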
Harsh Dave
7a4f84fbac Optimized dgemm for tiny input sizes.
- This commit improves the performance of dgemm for matrices with
very small dimensions.

- The blis_dgemm_tiny function reuses the dgemm SUP kernels while bypassing
the conventional SUP framework code path. That code path requires creating
and initializing BLIS objects, reading the needed meta-information from
those objects, and querying contexts, all of which add a performance penalty
when computing on matrices with very small dimensions.

- To avoid this penalty, blis_dgemm_tiny implements lightweight support code
that lets the dgemm SUP kernels operate directly on the input buffers. It
avoids the framework overhead of creating and initializing BLIS objects,
initializing contexts, and accessing other large framework data structures.

- blis_dgemm_tiny checks a threshold condition before picking the kernel.
For the zen, zen2, and zen3 architectures, the tiny kernel is invoked for any
shape as long as m < 8 and k <= 1500, or m < 1000 and n <= 24 and k <= 1500.
For zen4, the tiny kernel is invoked as long as m, n, and k are all less
than 1500.

- blis_dgemm_tiny supports only single-threaded computation for now.

AMD-Internal: [CPUPL-3574]
Change-Id: Ife66d35b51add4fccbeebd29911e0c957e59a05f
2023-10-16 05:52:49 -04:00
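The threshold check described in commit 7a4f84fbac can be written as a small predicate. This is a sketch only: the `Arch` enumeration and the function name are hypothetical, and the grouping of the zen/zen2/zen3 condition as (m < 8 and k <= 1500) or (m < 1000 and n <= 24 and k <= 1500) is an assumed reading of the commit message.

```cpp
#include <cassert>

// Hypothetical architecture tags, for illustration only.
enum class Arch { Zen, Zen2, Zen3, Zen4 };

// Returns true when the tiny dgemm path would be taken for an m x n x k
// problem, under the assumed reading of the thresholds in the commit message.
bool use_tiny_dgemm(Arch arch, long m, long n, long k) {
    assert(m >= 0 && n >= 0 && k >= 0);
    if (arch == Arch::Zen4) {
        return m < 1500 && n < 1500 && k < 1500;
    }
    // zen, zen2, zen3
    const bool very_few_rows = (m < 8) && (k <= 1500);
    const bool narrow_output = (m < 1000) && (n <= 24) && (k <= 1500);
    return very_few_rows || narrow_output;
}
```

For example, under this reading a 4 x 5000 x 1500 problem takes the tiny path on zen3, while a 500 x 25 x 100 problem does not.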
Vignesh Balasubramanian
a6a67fea2d ZAXPBYV optimizations for handling unit and non-unit strides
- Updated the bli_zaxpbyv_zen_int( ... ) kernel's computational
  logic. The kernel performs two different sets of compute based
  on the value of alpha, for both unit and non-unit strides. There
  are no constraints on beta scaling of the 'y' vector.

- Updated the logic to support 'x' conjugate in the computation.
  The kernel supports conjugate/no conjugate operation through the
  usage of _mm256_fmsubadd_pd( ... ) and _mm256_addsub_pd( ... )
  intrinsics.

- Updated the early return condition in the kernel to comply
  with the standard.

- Replaced the scalar computation with vector computation (using
  128-bit registers) when handling a single element (fringe case)
  with unit stride, or vectors with non-unit strides. A single dcomplex
  element occupies 128 bits in memory, thereby providing scope for
  this optimization.

- Added accuracy and extreme value testing with sufficient sizes
  and initializations, to test the required main and fringe cases
  of the computation.

AMD-Internal: [CPUPL-3623]
Change-Id: I7ae918856e7aba49424162290f3e3d592c244826
2023-10-12 06:31:08 -04:00
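For reference, the operation that commit a6a67fea2d optimizes can be written as a scalar loop: y := beta * y + alpha * conjx(x). The sketch below is not the bli_zaxpbyv_zen_int kernel; the function name and signature are hypothetical, and only positive strides are handled.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>

using dcomplex = std::complex<double>;

// Scalar reference for y := beta * y + alpha * conjx(x), where conjx(x) is
// conj(x) when conj_x is true and x otherwise. Positive strides only.
void zaxpby_ref(bool conj_x, std::size_t n,
                dcomplex alpha, const dcomplex* x, std::ptrdiff_t incx,
                dcomplex beta, dcomplex* y, std::ptrdiff_t incy) {
    assert(incx > 0 && incy > 0);
    const std::size_t sx = static_cast<std::size_t>(incx);
    const std::size_t sy = static_cast<std::size_t>(incy);
    for (std::size_t i = 0; i < n; ++i) {
        const dcomplex xi = conj_x ? std::conj(x[i * sx]) : x[i * sx];
        y[i * sy] = beta * y[i * sy] + alpha * xi;
    }
}
```

A loop of this kind is useful as a ground truth when checking both the unit-stride and non-unit-stride paths of an optimized kernel.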
jagar
5d578684ea GtestSuite: Update in source code to make it compatible on MSVC(windows)
AMD-Internal: [CPUPL-2732]
Change-Id: Ifd9372bf9b0f00c2bf24442ea8519bfcf4e5db5b
2023-10-09 04:43:29 -04:00
jagar
712a84d50f Gtestsuite: Update in cmake to search reflib in given path
AMD-Internal: [CPUPL-2732]
Change-Id: Ide2b98a95f81f394c7c01cc3a3b5ae6fa0403a82
2023-10-05 05:39:27 -04:00
jagar
29711dd5a3 Gtestsuite: Updated testings_basics.* to print matrix/vector name
AMD-Internal: [CPUPL-2732]
Change-Id: I89b4ffc97ea852e66f42b82058af67c16144fbf6
2023-09-26 08:27:19 -04:00
Vignesh Balasubramanian
32104c400c GTestSuite : Designing test cases for ZGEMM
- Designed test cases for unit testing of ZGEMM compute
  kernel for handling inputs when k == 1. The design
  uses value-parameterized testing for checking accuracy,
  and verifying the mandate in case of exception values
  on the inputs/output.

- The design uses type-parameterized testing for verifying
  BLAS standard for invalid input cases, and also for early
  return scenarios.

- Added the function template set_ev_mat( ... ) as part of
  testinghelpers. This function is used as a helper for
  inducing exception values onto indices specified as
  arguments to the test_gemm( ... ) interface.

- Moved the function definition of getValueString( ... )
  from the NRM2 testing interface to testinghelpers (renamed
  to get_value_string( ... ) for naming consistency), so that
  it can be used as a helper function across all APIs for
  exception value testing.

AMD-Internal: [CPUPL-3823]
Change-Id: I0fea21f9c8759bbbdc88ba0a016202753e28f2a7
2023-09-08 17:36:57 +05:30
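The idea behind inducing exception values at chosen indices, as in commit 32104c400c, can be sketched as follows. This is not the actual set_ev_mat( ... ) signature, which is not shown in the commit message; `inject_exception_value` and `count_non_finite` are hypothetical helpers that assume column-major storage with leading dimension `ld`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper: writes an exception value (NaN or +/-Inf) at entry
// (i, j) of a column-major m x n matrix stored with leading dimension ld.
template <typename T>
void inject_exception_value(std::vector<T>& a, std::size_t m, std::size_t n,
                            std::size_t ld, std::size_t i, std::size_t j, T ev) {
    assert(ld >= m && a.size() >= ld * n);
    assert(i < m && j < n);
    a[i + j * ld] = ev;
}

// Counts entries that are NaN or infinite, e.g. to confirm the injection.
template <typename T>
std::size_t count_non_finite(const std::vector<T>& a) {
    std::size_t count = 0;
    for (const T& v : a) {
        if (!std::isfinite(v)) ++count;
    }
    return count;
}
```

A test would typically inject a value into an input, run the operation under test and the reference, and then compare where NaNs and Infs appear in the two outputs.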
Eleni Vlachopoulou
a6641dec0b Updating GTestSuite CMake system to enable testing BLIS libraries on Windows.
- Renaming ELEMENT_TYPE to BLIS_ELEMENT_TYPE, since the former is defined in a Windows header.
- Updating refCBLAS object to have a different implementation depending on the platform.
- Removing dlfcn.h from all reference headers since it is Linux-specific, and adding it conditionally at a higher level.
- Changes on all CMakeLists.txt files to enable building on Windows.

AMD-Internal: [CPUPL-2732]
Change-Id: I6e35656a3779b35dc815a2409cf84c22dd27f3e7
2023-08-29 16:11:22 +05:30
Eleni Vlachopoulou
fa77d0415a Updating nrm2 GTestSuite testing
- Adding default template parameter for the type of the returned value from nrm2.
- Bugfix on NaN/Inf comparator for scalars.
- Tuning sizes of vector x to exercise the different paths for vectorized and scalar code.
- Adding wrong parameters and extreme value testing.
- Adding tests for overflow and underflow using max and min representable numbers for vectorized and scalar code.

AMD-Internal: [CPUPL-2732]
Change-Id: Ice8ee65095ecaa7b30ebd5f90ed2a890178533db
2023-07-28 05:03:00 -04:00
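The overflow and underflow tests mentioned in commit fa77d0415a rely on inputs near the largest and smallest representable numbers. The sketch below shows why such inputs are revealing: a naive sum of squares fails on them, while an accumulation based on `std::hypot` does not. The function names and the choice of probe values are assumptions for illustration, not the suite's actual test data.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Naive norm: squares can overflow to inf or underflow to zero.
double nrm2_naive(const std::vector<double>& x) {
    double s = 0.0;
    for (double v : x) s += v * v;
    return std::sqrt(s);
}

// Reference accumulation with std::hypot, which avoids undue
// overflow and underflow in intermediate results.
double nrm2_hypot(const std::vector<double>& x) {
    double r = 0.0;
    for (double v : x) r = std::hypot(r, v);
    return r;
}

// n copies of max/n: the true norm, max/sqrt(n), is representable,
// but each square overflows.
std::vector<double> overflow_probe(std::size_t n) {
    assert(n > 0);
    const double v = std::numeric_limits<double>::max() / static_cast<double>(n);
    return std::vector<double>(n, v);
}

// n copies of the smallest normal number: each square underflows to zero.
std::vector<double> underflow_probe(std::size_t n) {
    assert(n > 0);
    return std::vector<double>(n, std::numeric_limits<double>::min());
}
```

Comparing a kernel against the `std::hypot` reference on these probes distinguishes implementations that scale their inputs from those that do not.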
jagar
fb6f1380b2 Gtestsuite: Added util functions
- Functions to print matrix and vector elements.
- Functions to convert a matrix to a symmetric, Hermitian, or
  triangular matrix, and to set the diagonal elements of a matrix.

AMD-Internal: [CPUPL-2732]
Change-Id: I1ffa5289329cbb8a9581bf545bdd157801cf5baa
2023-06-27 16:33:57 +05:30
jagar
003d1e9ae6 GTestSuite: Using ELEMENT_TYPE to specify generation of random numbers in tests.
Since random numbers are specified from ELEMENT_TYPE and we never generate tests for both integer and floating point numbers at the same time, we update code as described below:
- random vector/matrix generators are updated to use ELEMENT_TYPE as a default parameter.
- ::testing::Values(ELEMENT_TYPE) is removed from all test generators.

AMD-Internal: [CPUPL-2732]
Change-Id: Ibc6b05044502f541c9e8a7687931b1ca2903fb0c
2023-06-21 11:30:15 -04:00
Edward Smyth
7e50ba669b Code cleanup: No newline at end of file
Some text files were missing a newline at the end of the file.
One has been added.

Also correct file format of windows/tests/inputs.yaml, which
was missed in commit 0f0277e104

AMD-Internal: [CPUPL-2870]
Change-Id: Icb83a4a27033dc0ff325cb84a1cf399e953ec549
2023-04-21 10:02:48 -04:00
Edward Smyth
0f0277e104 Code cleanup: dos2unix file conversion
Source and other files in some directories were a mixture of
Unix and DOS file formats. Convert all relevant files to Unix
format for consistency. Some Windows-specific files remain in
DOS format.

AMD-Internal: [CPUPL-2870]
Change-Id: Ic9a0fddb2dba6dc8bcf0ad9b3cc93774a46caeeb
2023-04-21 08:41:16 -04:00
Eleni Vlachopoulou
ea484f38e6 BLIS GTestSuite fixes for ILP64.
- Adding doc regarding option setting for INT64 in README.
- Bugfix on template instantiation on helper function. Updated to use gtint_t instead of int.

AMD-Internal: [CPUPL-2732]
Change-Id: Ia52407a1ef3fdd06e905c2e3d4aa5befb80e82d6
2023-04-19 03:41:55 -04:00
jagar
a77402968c GTestsuite: Updates in CmakeLists.txt to check libraries
Updated the CmakeLists.txt to check whether the specified
libraries are present, and to abort the CMake build if they are not.

AMD-Internal: [CPUPL-2732]
Change-Id: I90115217c228430095aa53a82dc26d16935b320f
2023-04-14 08:56:41 -04:00
jagar
f164c7fe70 Added GTestSuite helper functions
- Functions to convert to cblas enums from char.
- Functions to print matrix and vector elements.
- Functions to set matrix and vector elements with
  the given value.

AMD-Internal: [CPUPL-2732]
Change-Id: I1046b9578c8456e89eddba4a4e8577016b9361ca
2023-04-12 09:03:08 -04:00
Eleni Vlachopoulou
e8392fedb8 GTestSuite fix on trsm tests.
- Fixing thresholds to be more appropriate.
- Updating the way random entries of A and B are generated so that A is diagonally dominant and the algorithm doesn't diverge.

AMD-Internal: [CPUPL-2732]
Change-Id: I6d5691d744ecc623f66c45e94461bd88625d7179
2023-04-11 20:01:21 +05:30
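One common way to obtain a diagonally dominant A, as commit e8392fedb8 requires for the trsm tests, is to fill the matrix with random entries and then overwrite each diagonal entry with a value larger than the sum of the absolute off-diagonal entries in its row. The commit does not say how the suite does this, so the sketch below is an assumed approach with hypothetical function names and column-major storage.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Column-major n x n matrix with entries in [-1, 1]; each diagonal entry is
// then replaced by 1 + (sum of |off-diagonal entries| in the same row).
std::vector<double> make_diag_dominant(std::size_t n, unsigned seed) {
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> dist(-1.0, 1.0);
    std::vector<double> a(n * n);
    for (double& v : a) v = dist(gen);
    for (std::size_t i = 0; i < n; ++i) {
        double off = 0.0;
        for (std::size_t j = 0; j < n; ++j) {
            if (j != i) off += std::fabs(a[i + j * n]);
        }
        a[i + i * n] = off + 1.0;
    }
    return a;
}

// Checks strict row diagonal dominance: |a_ii| > sum_{j != i} |a_ij|.
bool is_strictly_diag_dominant(const std::vector<double>& a, std::size_t n) {
    assert(a.size() == n * n);
    for (std::size_t i = 0; i < n; ++i) {
        double off = 0.0;
        for (std::size_t j = 0; j < n; ++j) {
            if (j != i) off += std::fabs(a[i + j * n]);
        }
        if (!(std::fabs(a[i + i * n]) > off)) return false;
    }
    return true;
}
```

Since trsm reads only one triangle of A, dominance over the full row is stronger than strictly needed, but it keeps the diagonal well away from zero regardless of which triangle is used.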
jagar
1d5c1e5803 Code coverage support in gtestsuite framework
- Tools used for code coverage are Gcov and Lcov.
- We need to use the macros specified by gcov during
  compilation of blis and gtestsuite.
- Lcov will generate coverage reports in HTML format.

AMD-Internal: [CPUPL-2732]
Change-Id: I17b30b4a322b8771f2d6a4ba28986cf0ccf3fba6
2023-04-10 07:48:15 -04:00
Eleni Vlachopoulou
fa024b82ad Adding helper functionality for wrong input testing in GTestSuite.
- Added a header with correct default values to be used in tests.
- Updated README to include information on how to test for wrong parameters and some explanation on how lda increments work.

AMD-Internal: [CPUPL-2732]
Change-Id: I4f540d46013ffe91b4acb30da2b437251c09d3bc
2023-04-06 13:32:29 -04:00
Eleni Vlachopoulou
bf3f5cafa8 BLIS GTestSuite Updates:
- Fix in README.md.
- Updating abs overload for scomplex and dcomplex to avoid overflow by using std::abs.
- Updating comparators to take into account NaNs and Infs when measuring error.

AMD-Internal: [CPUPL-2732]
Change-Id: I8c12bacd9d63b2e914d0a79f337f7525dc16b733
2023-04-05 06:11:34 -05:00
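The two numerical points in commit bf3f5cafa8 can be illustrated with a short sketch. The first pair of functions shows why `std::abs` is preferable to a hand-written square root of squares for complex values. The comparator that follows is a hypothetical policy for NaNs and Infs (matching NaNs and identical infinities count as zero error), not necessarily the one the suite implements.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <limits>

// Hand-written modulus: re*re + im*im can overflow for large components.
double abs_naive(const std::complex<double>& z) {
    return std::sqrt(z.real() * z.real() + z.imag() * z.imag());
}

// std::abs for std::complex avoids that intermediate overflow.
double abs_safe(const std::complex<double>& z) {
    return std::abs(z);
}

// Hypothetical error measure that accounts for NaNs and Infs.
double rel_error(double computed, double reference) {
    const double inf = std::numeric_limits<double>::infinity();
    if (std::isnan(computed) && std::isnan(reference)) return 0.0;
    if (std::isnan(computed) || std::isnan(reference)) return inf;
    if (std::isinf(computed) || std::isinf(reference)) {
        return computed == reference ? 0.0 : inf;  // same-signed infinities match
    }
    const double diff = std::fabs(computed - reference);
    return reference == 0.0 ? diff : diff / std::fabs(reference);
}

bool within_tolerance(double computed, double reference, double tol) {
    assert(tol >= 0.0);
    return rel_error(computed, reference) <= tol;
}
```

Without the special cases, a comparison such as NaN against NaN would always report a failure, because any arithmetic on NaN yields NaN and every ordered comparison with it is false.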
jagar
f9adfa8ee4 Updated CmakeLists.txt to remove cmake generated files
CMake-generated files and executables in the build directory
are removed by the "make distclean" command.

Change-Id: I4fd5193e92958122ff10ecc634b42096f3b3716e
2023-04-05 06:11:16 -04:00
Eleni Vlachopoulou
58f85bb8f1 Adding copyright notice in gtestsuite files.
Change-Id: I5097831eb7a46c56a4a2a32da4d3ee69c8b36cb5
2023-03-29 09:01:48 -04:00
Eleni Vlachopoulou
04e091fdca BLIS GTestSuite: Link OpenMP if we test serial BLIS, but MKL is used as a reference.
Change-Id: Iacafa5ecf74622fa5e1180a81305cf7a23d79055
2023-03-28 04:43:58 -04:00
Eleni Vlachopoulou
155a64e734 Introducing upgraded BLIS GTestSuite.
Key features:
- able to test both static and dynamic libraries
- able to test BLAS, CBLAS and BLIS-typed interface
- can use any CBLAS library for reference results
- can build and/or run tests depending on the BLAS level or a specific API

AMD-Internal: [CPUPL-2732]
Change-Id: Ibe0d7938e06081526bbc54d3182ac7d17affdaf6
2023-03-21 03:17:51 +05:30
Eleni Vlachopoulou
88e549e7bd Using CMake as the build system for both Linux and Windows:
- GoogleTest headers removed. GoogleTest gets fetched at
configuration time.
- BLIS headers removed. A BLIS installation path is required at
configuration time.
- Windows has been temporarily disabled.

AMD-Internal: [CPUPL-2732]
Change-Id: I9e55c8e43b2733f96cd8b6e5449d79623decad5c
2022-12-13 19:09:23 +05:30
jagar
cff29bde76 Added gtestsuite folder into blis repo
Moved blis gtestsuite from lib-confscript to blis repo
(branch: amd-main)

Change-Id: If7ad391eef66bac6d26cf5223e6043d52b746072
2022-12-07 23:57:13 -05:00