Details:
- For a variable x, using the address of x in an instruction throws an
exception if the distance between &x and the accessing instruction is
larger than 2 GiB. To solve this issue, all variables are stored
within the JIT code section and are accessed using relative addressing.
- Fixed a bug in the B matrix pack function for the s8s8s32os32 API.
- Fixed a bug in JIT code to apply bias on col-major matrices.
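
The 2 GiB limit in the first item is consistent with how x86-64 relative operands are encoded: the displacement is a signed 32-bit value, so the target must lie within roughly +/- 2 GiB of the instruction. A minimal sketch of that range check follows. The helper name fits_in_disp32 is hypothetical and is not taken from the JIT code; it also assumes the usual two's-complement conversion from unsigned to signed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not part of the JIT code): returns true when
 * `target` is reachable from the instruction ending at `next_ip`
 * using a signed 32-bit displacement. */
static bool fits_in_disp32(uint64_t target, uint64_t next_ip)
{
    /* Unsigned subtraction wraps around; reinterpreting the result as
     * signed yields the distance on two's-complement platforms. */
    int64_t delta = (int64_t)(target - next_ip);
    return delta >= INT32_MIN && delta <= INT32_MAX;
}
```

Data emitted inside the JIT code section sits a few KiB at most from the instructions that read it, so a check like this always passes there, whereas a variable elsewhere in the process has no such guarantee.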
AMD-Internal: [SWLCSG-2820]
Change-Id: I82f117a0422c794cb9b1a4d65a89d60de4adfd96
The SWISH post-op computes swish(x) = x / (1 + exp(-1 * alpha * x)).
SiLU is SWISH with alpha = 1. Added support for SWISH in the JIT-based
BF16 kernels.
AMD-Internal: [SWLCSG-2387]
Change-Id: I9eea0c801f5f067a5cfbd2941bc991708b86e45e
Details:
- Added a new folder named JIT/ under addon/aocl_gemm/. This folder
will contain all the JIT-related code.
- Modified lpgemm_cntx_init code to generate main and fringe kernels
for the 6x64 bf16 microkernel and store function pointers to all the
generated kernels in a global function pointer array. This happens
only when the gcc version is < 11.2.
- When the gcc version is < 11.2, the microkernel uses the JIT-generated
kernels; otherwise, it uses the intrinsics-based implementation.
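
The selection described above could look roughly like the sketch below. Everything in it is an assumption made for illustration: the macro USE_JIT_KERNELS, the table size, the simplified kernel signature, and the use of a compile-time version check are not taken from the actual lpgemm sources.

```c
#include <stddef.h>

/* Hypothetical, simplified kernel signature; the real microkernel
 * takes matrix pointers, strides, and post-op arguments. */
typedef int (*bf16_kernel_fn)(void);

/* Assumed compile-time check for gcc < 11.2. clang also defines
 * __GNUC__, so it is excluded explicitly. */
#if defined(__GNUC__) && !defined(__clang__) && \
    (__GNUC__ < 11 || (__GNUC__ == 11 && __GNUC_MINOR__ < 2))
#define USE_JIT_KERNELS 1
#else
#define USE_JIT_KERNELS 0
#endif

enum { NUM_KERNELS = 4 }; /* assumed count of main + fringe variants */

/* Global table of pointers to generated kernels. */
static bf16_kernel_fn jit_kernels[NUM_KERNELS];

static int intrinsics_kernel(void) { return 0; }
static int generated_kernel(void) { return 1; } /* stand-in for JIT output */

/* Fills the table only when the JIT path is enabled. */
static void cntx_init_sketch(void)
{
#if USE_JIT_KERNELS
    for (size_t i = 0; i < NUM_KERNELS; ++i) {
        jit_kernels[i] = generated_kernel;
    }
#endif
}

/* Returns the generated kernel when available, else the intrinsics one. */
static bf16_kernel_fn select_kernel(size_t idx)
{
    if (USE_JIT_KERNELS && idx < NUM_KERNELS && jit_kernels[idx] != NULL) {
        return jit_kernels[idx];
    }
    return intrinsics_kernel;
}
```

Keeping the fallback inside select_kernel means callers never see a NULL pointer, even if a table entry was never generated.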
AMD-Internal: [SWLCSG-2622]
Change-Id: I16256c797b2546a8cd2049680001947346260461