mirror of
https://github.com/amd/blis.git
synced 2026-05-05 15:01:13 +00:00
Multi data type downscaling support for u8s8s16 - u8s8s16<u8|s8>
Downscaling is used when GEMM output is accumulated at a higher precision and must be converted to a lower precision afterwards. Currently the u8s8s16 flavor of the API only supports downscaling to s8 (int8_t), via aocl_gemm_u8s8s16os8, after results are accumulated at int16_t.

LPGEMM is modified to support downscaling to other data types, such as u8 and s16, in addition to s8. The framework (5 loop) passes the downscale data type to the micro-kernels. Within the micro-kernel, the appropriate beta scaling and output buffer store logic is executed based on the downscale type. This support is enabled only for the u8s8s16 flavor of the APIs.

The LPGEMM bench is also modified to accept the downscale data type for performance and accuracy testing.

AMD-Internal: [SWLCSG-2313]
Change-Id: I723d0802baf8649e5e41236b239880a6043bfd30
Committed by: MithunMohan KadavilMadanaMohanan
Parent: a6a67fea2d
Commit: ea0324ab95
@@ -62,7 +62,7 @@ void lpgemm_rowvar_ ## LP_SFX \
 lpgemm_thrinfo_t* thread, \
 lpgemm_cntx_t* lcntx, \
 lpgemm_post_op* post_op_list, \
-bool c_downscale \
+AOCL_STORAGE_TYPE c_downscale \
 ) \

 LPGEMM_5LOOP(uint8_t,int8_t,int32_t,u8s8s32o32);
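The commit message describes the micro-kernel choosing its output store logic from the downscale type instead of from a boolean flag. The following is a minimal sketch of that idea in plain C, not the actual kernel code. The enum `demo_storage_t`, its enumerators, and the functions `clamp_s16` and `demo_store_downscaled` are hypothetical names introduced for illustration, and the saturating narrowing is an assumption; the real kernels also apply beta scaling and post-ops that are omitted here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the downscale type handed to the micro-kernel.
   This is NOT the real AOCL_STORAGE_TYPE definition. */
typedef enum { DEMO_S8, DEMO_U8, DEMO_S16 } demo_storage_t;

/* Clamp v to the inclusive range [lo, hi]. */
static int16_t clamp_s16(int16_t v, int16_t lo, int16_t hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Store n int16_t accumulators into out, narrowing according to out_type.
   Saturation on narrowing is an assumption made for this sketch. */
static void demo_store_downscaled(const int16_t *acc, void *out, size_t n,
                                  demo_storage_t out_type)
{
    for (size_t i = 0; i < n; ++i) {
        switch (out_type) {
        case DEMO_S8:
            ((int8_t *)out)[i] = (int8_t)clamp_s16(acc[i], INT8_MIN, INT8_MAX);
            break;
        case DEMO_U8:
            ((uint8_t *)out)[i] = (uint8_t)clamp_s16(acc[i], 0, UINT8_MAX);
            break;
        case DEMO_S16:
            /* Same width as the accumulator: copy through unchanged. */
            ((int16_t *)out)[i] = acc[i];
            break;
        }
    }
}
```

With a boolean flag the callee can only distinguish "downscale" from "do not downscale"; passing an enumerated type lets one entry point select between several output widths and signedness, which is the change the hunk above makes to the `c_downscale` parameter.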