Mirror of https://github.com/amd/blis.git (synced 2026-05-05 15:01:13 +00:00)
BLAS Extension API - ?gemm_compute()
- Added support for 2 new APIs:
  1. sgemm_compute()
  2. dgemm_compute()
  These depend on the ?gemm_pack_get_size() and ?gemm_pack() APIs.
- ?gemm_compute() takes the packed matrix buffer (represented by the
  packed matrix identifier) and performs the GEMM operation:
  C := A * B + beta * C.
- Whenever the kernel storage preference and the matrix storage scheme
  do not match, and the matrix being loaded is not already packed,
  on-the-go packing is enabled to pack that matrix.
- Note: If both matrices are packed using the ?gemm_pack() API, it is
  the responsibility of the user to pack only one matrix with the alpha
  scalar and the other with a unit scalar.
- Note: Support is presently limited to a single thread. Both the pack
  and compute APIs are forced to take n_threads=1.

AMD-Internal: [CPUPL-3560]
Change-Id: I825d98a0a5038d31668d2a4b84b3ccc204e6c158
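
The note about packing only one matrix with alpha can be illustrated with a small self-contained sketch. Everything below is hypothetical: toy_pack(), toy_gemm_compute() and toy_demo() are illustrative stand-ins written for this example, not the BLIS ?gemm_pack() / ?gemm_compute() functions, and they do not reproduce the real signatures or packed layouts. The sketch assumes only what the message states: packing applies a scalar to the matrix, and the compute step evaluates C := A * B + beta * C without a separate alpha.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a pack step: copy a row-major rows x cols
   matrix into a contiguous buffer, scaling every element by alpha. */
static void toy_pack(size_t rows, size_t cols, double alpha,
                     const double *src, size_t ld, double *dest)
{
    assert(ld >= cols);
    for (size_t i = 0; i < rows; ++i)
        for (size_t j = 0; j < cols; ++j)
            dest[i * cols + j] = alpha * src[i * ld + j];
}

/* Hypothetical stand-in for the compute step: C := A * B + beta * C,
   where A (m x k) and B (k x n) are contiguous row-major buffers.
   There is no alpha here; any scaling was applied while packing. */
static void toy_gemm_compute(size_t m, size_t n, size_t k,
                             const double *a, const double *b,
                             double beta, double *c, size_t ldc)
{
    assert(ldc >= n);
    for (size_t i = 0; i < m; ++i)
        for (size_t j = 0; j < n; ++j) {
            double acc = 0.0;
            for (size_t p = 0; p < k; ++p)
                acc += a[i * k + p] * b[p * n + j];
            c[i * ldc + j] = acc + beta * c[i * ldc + j];
        }
}

/* Pack A with alpha_a and B with alpha_b, run the compute step with
   beta = 0.5 on a 2x2 example, and return the top-left element of C. */
static double toy_demo(double alpha_a, double alpha_b)
{
    const double a[4] = {1, 2, 3, 4};
    const double b[4] = {5, 6, 7, 8};
    double c[4] = {1, 1, 1, 1};
    double packed_a[4], packed_b[4];

    toy_pack(2, 2, alpha_a, a, 2, packed_a);
    toy_pack(2, 2, alpha_b, b, 2, packed_b);
    toy_gemm_compute(2, 2, 2, packed_a, packed_b, 0.5, c, 2);
    return c[0];
}
```

In this toy model, toy_demo(2.0, 1.0) yields alpha * (A * B) + beta * C for alpha = 2, while toy_demo(2.0, 2.0) scales the product by alpha twice, which is the mistake the note warns against.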
Committed by: Arnav Sharma
Parent: 81161066e5
Commit: c8f14edcf5
@@ -60,6 +60,9 @@
// Include the pack full thread decorator and related definitions and prototypes
// for the pack code path.
#include "bli_pack_full_decor.h"
// Include the level-3 thread decorator and related definitions and prototypes
// for the compute code path.
#include "bli_l3_compute_decor.h"

// Initialization-related prototypes.
void bli_thread_init( void );