1. Added support in the CMake scripts for linking libomp in the BLIS multithreaded build.
2. Added ${CMAKE_CURRENT_SOURCE_DIR}/bli_axpyf_zen_int_6.c to the CMake file in blis\kernels\zen\1f so that the newly added file is built.
3. Added new macros in blis/frame/include/bli_macro_defs.h for ENABLE_NO_UNDERSCORE_API support in the gemm_batch and axpby APIs.
4. Changed the file open mode from binary to text in blis/testsuite/src/test_libblis.c to avoid line-ending issues across operating systems.
5. Added the definition of the macro BLIS_DISABLE_TRSM_PREINVERSION to the main CMakeLists.txt file.
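For item 3, the general idea of a no-underscore API switch can be sketched as below. This is only an illustration: the names DEMO_ENABLE_NO_UNDERSCORE_API, DEMO_F77, DEMO_STR and daxpby_demo are hypothetical and are not the macros added to bli_macro_defs.h.

```c
#include <string.h>

/* Two-level stringification so the argument is macro-expanded first. */
#define DEMO_STR_(x) #x
#define DEMO_STR(x)  DEMO_STR_(x)

/* Hypothetical switch: when defined, exported symbols carry no trailing
   underscore; otherwise the Fortran-style trailing underscore is kept. */
#ifdef DEMO_ENABLE_NO_UNDERSCORE_API
#define DEMO_F77(ch, name) ch ## name
#else
#define DEMO_F77(ch, name) ch ## name ## _
#endif

/* Hypothetical routine whose symbol name depends on the switch. */
int DEMO_F77(d, axpby_demo)(int n)
{
    return n;
}

/* Returns the symbol name the macro produced, for inspection. */
const char *demo_symbol_name(void)
{
    return DEMO_STR(DEMO_F77(d, axpby_demo));
}
```

Compiling with -DDEMO_ENABLE_NO_UNDERSCORE_API would make the same source export daxpby_demo instead of daxpby_demo_.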
AMD Internal : [CPUPL-1630]
Change-Id: Iba1b7b6d014a4317de7cbaf42f812cad20111e4f
Details:
- Added framework code for GEMMT SUP.
- Implemented SUP for GEMMT using techniques similar to the native path.
- Moved update routines to frame/util folder.
- Ported update routines for complex datatypes.
Change-Id: I17adfd0586d07f5a23dca6a07b2d48f4c9fcf71c
Signed-off-by: Meghana Vankadari <Meghana.Vankadari@amd.com>,
Dipal M Zambare <DipalMadhukar.Zambare@amd.com>,
Mangala V <managala.v@amd.com>
Details:
- Added a new API, cblas_?gemmt() and ?gemmt_(), which computes a
matrix-matrix product with general matrices but updates only the upper
or lower triangular part of the result matrix.
- These routines are similar to the ?gemm routines, but they only access
and update a triangular part of the square result matrix.
- Added DGEMMT functionality by reusing GEMM kernels.
- Created a new folder for GEMMT under l3, and added GEMMT specific
framework code.
- Modified the cntl_create routine to choose a different macro kernel
for GEMMT.
- Added routines to copy the lower/upper triangular part of a block to a
buffer.
- Defined BLIS, BLAS and CBLAS interface APIs for GEMMT.
- Added test_gemmt.c to the test folder and updated the Makefile.
- Added a macro 'CBLAS' in test_gemm.c to call CBLAS APIs.
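The GEMMT semantics described above can be illustrated with a naive reference loop. This is a minimal sketch and not the kernel-based implementation in this change: ref_dgemmt is a hypothetical name, and the sketch assumes row-major storage, no transposition, and densely packed matrices (A is n x k, B is k x n, C is n x n).

```c
#include <stddef.h>

/* Hypothetical reference routine (not the BLIS implementation).
   Computes C := alpha*A*B + beta*C, but only for the elements in the
   triangle selected by uplo ('L' = lower, anything else = upper).
   Elements of C outside that triangle are neither read nor written. */
void ref_dgemmt(char uplo, int n, int k, double alpha,
                const double *a, const double *b,
                double beta, double *c)
{
    for (int i = 0; i < n; ++i) {
        /* Column range of row i that lies inside the chosen triangle. */
        int j_begin = (uplo == 'L') ? 0 : i;
        int j_end   = (uplo == 'L') ? i + 1 : n;

        for (int j = j_begin; j < j_end; ++j) {
            double acc = 0.0;
            for (int p = 0; p < k; ++p)
                acc += a[(size_t)i * k + p] * b[(size_t)p * n + j];

            c[(size_t)i * n + j] = alpha * acc + beta * c[(size_t)i * n + j];
        }
    }
}
```

With uplo set to 'L', the strictly upper part of C keeps whatever values it held before the call, which is the difference from a plain GEMM.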
Change-Id: Ie00c1a15b9c654b65c687a9ca781cbc6f9641791