Commit Graph

4 Commits

Author SHA1 Message Date
Edward Smyth
ed5010d65b Code cleanup: AMD copyright notice
Standardize format of AMD copyright notice.

AMD-Internal: [CPUPL-3519]
Change-Id: I98530e58138765e5cd5bc0c97500506801eb0bf0
2023-11-23 08:54:31 -05:00
mkadavil
8dff49837d Lpgemm source restructuring to support amdzen config.
-Currently lpgemm can only be built using either zen3 or zen4 config.
The lpgemm kernel code is re-structured to support amdzen, and thus
multi machine deployment.
-The micro-kernel calls (gemm and pack) are currently hardcoded in the
lpgemm framework. This is removed and a new lpgemm_cntx based dispatch
mechanism is designed to support runtime configurability for
micro-kernels.

AMD-Internal: [CPUPL-2965]
Change-Id: I4bbcb4e5db767def1663caf5481f0b4c988149ef
2023-02-21 08:35:38 -05:00
Harihara Sudhan S
60de0a1856 Multithreading and support for unpacked B matrix in u8s8s16os16
Fucntionality - When the B matrix is not reordered before the
u8s8s16os16 compute kernel call packing of B matrix is done as
part of the five loop algorithm. The state of B matrix (packed
or unpacked) is given as an user input.

	- Packing of B matrix is done as part of the five loop
	  compute.
	- Temprorary buffer for pack B is allocated in the five
	  loop algorithm
	- Multithreading for computation kernel
	- Configuration constants for u8s8s16os16 are part of the
	  lpgemm config

AMD-Internal: [CPUPL-2171]

Change-Id: I22b4f0ec7fc29a2add4be0cff7d75f92dd3e60b8
2022-08-05 19:28:37 +05:30
Harihara Sudhan S
e5d4fc2a70 Added low precision GEMM (u8s8s16os16)
Feature Addition : Added low precision GEMM to addon. The
kernel takes unsigned int8 and signed int8 as inputs and
performs GEMM operation. The intermediate accumulation and
output are in signed int16.

	- The compute kernel will perform computation only
	  if B matrix reordered to suit the usage of AVX2
	  instruction vpmaddubsw.
	- Kernel for packing the B matrix is provided.
	- LPGEMM bench code was modified to test the
	  performance and accuracy of the new variant.

AMD-Internal: [CPUPL-2171]

Change-Id: Id9a6d90b79f4bf82fb2e2f3093974dbf37275f9b
2022-08-02 02:20:00 -04:00