blis/addon at e5d4fc2a70d66a49a053b01c01a8d4638a4f783a - blis

amd/blis

mirror of https://github.com/amd/blis.git synced 2026-05-11 09:39:59 +00:00


Harihara Sudhan S e5d4fc2a70 Added low precision GEMM (u8s8s16os16)

Feature Addition: Added low precision GEMM to the addon. The
kernel takes unsigned int8 and signed int8 matrices as inputs
and performs the GEMM operation. Intermediate accumulation and
output are in signed int16.

	- The compute kernel performs the computation only
	  if the B matrix has been reordered to suit the
	  AVX2 instruction vpmaddubsw.
	- A kernel for packing the B matrix is provided.
	- The LPGEMM bench code was modified to test the
	  performance and accuracy of the new variant.
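
To illustrate why the B matrix has to be reordered, here is a minimal scalar sketch in C of the arithmetic described above. It is not the addon's kernel or API: the names `sat16`, `maddubs_pair`, `pack_b_pairs`, and `ref_gemm_u8s8s16` are hypothetical, the packed layout (two consecutive k-rows of each column stored side by side) is an assumed layout chosen to match what vpmaddubsw consumes, and saturating the running int16 sum is also an assumption. What is documented instruction behavior is that vpmaddubsw multiplies unsigned bytes by signed bytes and adds each adjacent pair of products with signed saturation to int16.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp a 32-bit value to the int16_t range. */
static int16_t sat16(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Scalar model of one 16-bit lane of vpmaddubsw:
 * two u8 * s8 products, summed with signed saturation. */
static int16_t maddubs_pair(uint8_t a0, uint8_t a1, int8_t b0, int8_t b1)
{
    int32_t sum = (int32_t)a0 * b0 + (int32_t)a1 * b1;
    return sat16(sum);
}

/* Hypothetical packing routine (assumed layout, not the addon's):
 * for each pair of rows (p, p+1) of row-major B[k x n], store
 * B[p][j] and B[p+1][j] next to each other. An odd k is zero padded.
 * bp must hold ((k + 1) / 2) * 2 * n elements. */
static void pack_b_pairs(size_t k, size_t n, const int8_t *b, int8_t *bp)
{
    for (size_t p = 0; p < k; p += 2) {
        for (size_t j = 0; j < n; ++j) {
            size_t dst = (p / 2) * 2 * n + 2 * j;
            bp[dst]     = b[p * n + j];
            bp[dst + 1] = (p + 1 < k) ? b[(p + 1) * n + j] : 0;
        }
    }
}

/* Reference C[m x n] += A[m x k] * B[k x n] with u8 A, packed s8 B,
 * and int16 accumulation and output. Saturating the running sum is
 * an assumption of this sketch. */
static void ref_gemm_u8s8s16(size_t m, size_t n, size_t k,
                             const uint8_t *a, const int8_t *bp,
                             int16_t *c)
{
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            int16_t acc = c[i * n + j];
            for (size_t p = 0; p < k; p += 2) {
                size_t src = (p / 2) * 2 * n + 2 * j;
                uint8_t a0 = a[i * k + p];
                uint8_t a1 = (p + 1 < k) ? a[i * k + p + 1] : 0;
                int16_t pair = maddubs_pair(a0, a1, bp[src], bp[src + 1]);
                acc = sat16((int32_t)acc + pair);
            }
            c[i * n + j] = acc;
        }
    }
}
```

With this layout the two B bytes needed for one int16 lane are contiguous in memory, so a vectorized kernel could load them directly next to the matching pair of A bytes. Row-major B instead places them n bytes apart, which is the kind of mismatch a packing step removes.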

AMD-Internal: [CPUPL-2171]

Change-Id: Id9a6d90b79f4bf82fb2e2f3093974dbf37275f9b

2022-08-02 02:20:00 -04:00

aocl_gemm    Added low precision GEMM (u8s8s16os16)    2022-08-02 02:20:00 -04:00
gemmd        Added support for addons.                 2022-03-31 12:03:27 +05:30