mirror of
https://github.com/amd/blis.git
synced 2026-04-20 15:48:50 +00:00
Added AVX512 and AVX2 FP32 RD Kernels
- Added FP32 RD (dot-product) kernels for both AVX512 and AVX2 ISAs.
- The primary FP32 RD kernel uses 6x64 (MRxNR) blocking for AVX512
and 6x16 (MRxNR) blocking for AVX2.
- Updated the f32 framework to accommodate RD kernels when B is
transposed, subject to thresholds.
- Updated the data-generation Python script.
TODO:
- Post-Ops not yet supported.
Change-Id: Ibf282741f58a1446321273d5b8044db993f23714
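
A minimal sketch of the RD (dot-product) idea described above, assuming row-major storage: when B is supplied transposed, each row of B^T is contiguous along k, so every element of C is the dot product of one row of A with one row of B^T. The function names (`dot`, `gemm_rd`) and the loop structure are hypothetical and are not taken from the BLIS sources; only the 6x16 (MRxNR) tile size comes from the commit message.

```python
# Illustrative only: names and structure are assumptions, not BLIS code.
MR, NR = 6, 16  # AVX2 primary RD blocking quoted in the commit message


def dot(x, y):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(x, y))


def gemm_rd(a, b_t, mr=MR, nr=NR):
    """Compute C = A * B from A (m x k) and B^T (n x k), both lists of rows.

    The output is walked in mr x nr tiles; edge tiles are clipped to the
    matrix bounds. Each C[i][j] is one row-by-row dot product.
    """
    m, n = len(a), len(b_t)
    c = [[0.0] * n for _ in range(m)]
    for i0 in range(0, m, mr):              # tile rows of C
        for j0 in range(0, n, nr):          # tile columns of C
            for i in range(i0, min(i0 + mr, m)):
                for j in range(j0, min(j0 + nr, n)):
                    c[i][j] = dot(a[i], b_t[j])
    return c
```

In a real kernel the two inner loops would be unrolled and vectorized along k; the sketch only shows the tiling and the access pattern.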
committed by
Nallani Bhaskar
parent
e0b86c69af
commit
c68c258fad
@@ -56,5 +56,5 @@ for stor in ["r"]:
post_op += "=" + "na"
else:
post_op += "=" + output_type
ofile.write(chars + " " + dims + " " + op + ":" + post_op + "\n")
ofile.write(chars + " " + dims + " " + op + ":" + post_op + "\n")
ofile.close()