blis/bench/bench_aocl_gemm/bench_utils_input.txt
mkadavil 3572baa9d3 aocl_softmax_f32 API for softmax computation as part of lpgemm.
-Softmax is often used as the last activation function in a neural
network: softmax(xi) = exp(xi)/(exp(x0) + exp(x1) + ... + exp(xn)).
This step happens after the final low-precision gemm computation,
so it helps to have softmax functionality that can be invoked as
part of the lpgemm workflow. To support this, a new API,
aocl_softmax_f32, is introduced as part of aocl_gemm. This API
computes the element-wise softmax of a matrix/vector of floats. It
invokes ISA-specific vectorized micro-kernels (vectorized only when
incx=1), and a cntx-based mechanism (similar to lpgemm_cntx) is used
to dispatch to the appropriate kernel.

AMD-Internal: [CPUPL-3247]
Change-Id: If15880360947435985fa87b6436e475571e4684a
2023-04-21 05:26:08 -04:00
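
For reference, the computation this utility performs can be sketched in a
few lines of plain C. The routine below (softmax_f32_ref, a hypothetical
name) only spells out the formula quoted in the commit message for a
contiguous (incx = 1) float vector; it is not the aocl_softmax_f32
implementation, and the max-subtraction overflow guard is a common
stabilization that the commit does not claim the kernels use.

    #include <math.h>
    #include <stddef.h>
    #include <stdio.h>

    /* Hypothetical reference routine: softmax(xi) = exp(xi) / sum_j exp(xj)
     * over a contiguous float vector of length n.  Subtracting the maximum
     * before exponentiating keeps expf() from overflowing for large inputs. */
    static void softmax_f32_ref( size_t n, const float* x, float* y )
    {
        float xmax = x[0];
        for ( size_t i = 1; i < n; ++i )
            if ( x[i] > xmax ) xmax = x[i];

        float sum = 0.0f;
        for ( size_t i = 0; i < n; ++i )
        {
            y[i] = expf( x[i] - xmax );
            sum += y[i];
        }
        for ( size_t i = 0; i < n; ++i )
            y[i] /= sum;
    }

    int main( void )
    {
        float x[4] = { 0.5f, -1.0f, 2.0f, 0.0f };
        float y[4];
        softmax_f32_ref( 4, x, y );
        for ( int i = 0; i < 4; ++i )
            printf( "%f\n", y[i] );
        return 0;
    }

Compiled with -lm, the four printed values are non-negative and sum to 1,
which is the property the lpgemm post-op relies on.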

f32_softmax 1 1
f32_softmax 2 1
f32_softmax 4 1
f32_softmax 21 1
f32_softmax 64 1
f32_gelu_tanh 1 1
f32_gelu_tanh 2 1
f32_gelu_tanh 8 1
f32_gelu_tanh 16 1
f32_gelu_tanh 21 1
f32_gelu_tanh 64 1
f32_gelu_tanh 1029 1
f32_gelu_erf 1 1
f32_gelu_erf 2 1
f32_gelu_erf 8 1
f32_gelu_erf 16 1
f32_gelu_erf 21 1
f32_gelu_erf 64 1
f32_gelu_erf 1029 1
f32_gelu_tanh 1 9
f32_gelu_tanh 2 9
f32_gelu_tanh 8 9
f32_gelu_tanh 16 1024
f32_gelu_tanh 21 1024
f32_gelu_tanh 64 1024
f32_gelu_tanh 1029 512
f32_gelu_erf 1 9
f32_gelu_erf 2 9
f32_gelu_erf 8 9
f32_gelu_erf 16 1024
f32_gelu_erf 21 1024
f32_gelu_erf 64 1024
f32_gelu_erf 1029 512
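
Each line of the input file above names a utility op followed by two
integers. A plausible reading, assumed here rather than taken from the
bench sources, is that the columns are the vector length n and the stride
incx (the commit notes the kernels vectorize only when incx=1). A
hypothetical parsing loop for such a file could look like:

    #include <stdio.h>
    #include <string.h>

    /* Hypothetical driver sketch: read "<op> <n> <incx>" triples and
     * dispatch on the op name.  Column meanings are assumptions; the real
     * bench_aocl_gemm driver may interpret these fields differently. */
    int main( void )
    {
        FILE* fp = fopen( "bench_utils_input.txt", "r" );
        if ( fp == NULL ) return 1;

        char op[64];
        long n, incx;
        while ( fscanf( fp, "%63s %ld %ld", op, &n, &incx ) == 3 )
        {
            if ( strcmp( op, "f32_softmax" ) == 0 )
            {
                /* ... allocate n floats with stride incx and time the
                 *     softmax utility here ... */
            }
            /* f32_gelu_tanh / f32_gelu_erf would be handled similarly. */
            printf( "op=%s n=%ld incx=%ld\n", op, n, incx );
        }

        fclose( fp );
        return 0;
    }

This sketch only shows the shape of the per-line dispatch; timing,
allocation with non-unit strides, and result validation are left out.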