Optimized AXPBYV Kernel using AVX2 Intrinsics

Details:
- Implemented axpbyv using AVX2 intrinsics
- Added a benchmark for axpbyv
- Added the new kernel definitions to the zen contexts
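
For readers unfamiliar with the operation, axpbyv computes y := beta * y + alpha * x. The block below is a minimal sketch of how a single-precision, unit-stride version could be vectorized with AVX2-width registers. It is not the kernel added by this commit: the names `saxpbyv_ref`, `saxpbyv_avx2`, and `saxpbyv_sketch` are hypothetical, strides and conjugation are ignored, and the `target` attribute and `__builtin_cpu_supports` are assumed to be available (GCC or Clang on x86).

```c
#include <immintrin.h>
#include <stddef.h>

/* Scalar reference: y := beta * y + alpha * x, unit stride only. */
static void saxpbyv_ref(size_t n, float alpha, const float *x,
                        float beta, float *y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] = beta * y[i] + alpha * x[i];
}

/* Vector path: processes 8 floats per iteration in 256-bit registers.
   The target attribute lets this compile without -mavx2 on the
   command line (GCC/Clang extension, an assumption of this sketch). */
__attribute__((target("avx2")))
static void saxpbyv_avx2(size_t n, float alpha, const float *x,
                         float beta, float *y)
{
    const __m256 av = _mm256_set1_ps(alpha);  /* broadcast alpha */
    const __m256 bv = _mm256_set1_ps(beta);   /* broadcast beta  */
    size_t i = 0;

    for (; i + 8 <= n; i += 8) {
        __m256 xv = _mm256_loadu_ps(x + i);   /* unaligned loads */
        __m256 yv = _mm256_loadu_ps(y + i);
        yv = _mm256_add_ps(_mm256_mul_ps(bv, yv),
                           _mm256_mul_ps(av, xv));
        _mm256_storeu_ps(y + i, yv);
    }

    /* Scalar cleanup for the remaining n % 8 elements. */
    for (; i < n; ++i)
        y[i] = beta * y[i] + alpha * x[i];
}

/* Hypothetical dispatcher: use the vector path only when the CPU
   reports AVX2 support at run time, otherwise fall back to scalar. */
static void saxpbyv_sketch(size_t n, float alpha, const float *x,
                           float beta, float *y)
{
    if (__builtin_cpu_supports("avx2"))
        saxpbyv_avx2(n, alpha, x, beta, y);
    else
        saxpbyv_ref(n, alpha, x, beta, y);
}
```

The sketch uses a separate multiply and add rather than a fused multiply-add so that the vector and scalar paths round identically; a production kernel would likely also unroll the loop and handle non-unit strides.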

AMD-Internal: [CPUPL-1963]

Change-Id: I9bc21a6170f5c944eb6e9e9f0e994b9992f8b539
Arnav Sharma
2021-12-21 16:49:11 +05:30
committed by Dipal M Zambare
parent d687bd36ea
commit 86690f9fd3

@@ -57,6 +57,16 @@ AXPBYV_KER_PROT( dcomplex, z, axpbyv_zen_int )
AXPBYV_KER_PROT( float, s, axpbyv_zen_int10 )
AXPBYV_KER_PROT( double, d, axpbyv_zen_int10 )
// axpbyv (intrinsics)
AXPBYV_KER_PROT( float, s, axpbyv_zen_int )
AXPBYV_KER_PROT( double, d, axpbyv_zen_int )
AXPBYV_KER_PROT( scomplex, c, axpbyv_zen_int )
AXPBYV_KER_PROT( dcomplex, z, axpbyv_zen_int )
// axpbyv (intrinsics, unrolled x10)
AXPBYV_KER_PROT( float, s, axpbyv_zen_int10 )
AXPBYV_KER_PROT( double, d, axpbyv_zen_int10 )
// axpyv (intrinsics)
AXPYV_KER_PROT( float, s, axpyv_zen_int )
AXPYV_KER_PROT( double, d, axpyv_zen_int )