blis/kernels/zen
Abhiram S 7787bc79b1 Level1 samaxv: AVX512 implementation
Details:
 1. Unrolled the loop by a factor of 5. This gave a gain of
    around 1 GFLOPS.
 2. Replaced CMP with subs and remove nan. The CMP approach
    uses many compares, which have higher latency and add
    instructions. Replacing it with subs and remove nan
    reduced this to 3 lighter instructions.
 3. Added the remove nan function.
 4. Added the AVX512 definition in the skx context.
 5. Conditionally disabled code in the AMAXV kernel depending
    on whether the AVX512 flag exists.
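
The ideas in points 1 to 3 can be sketched in scalar C. This is only an illustration under stated assumptions, not the AVX512 kernel from this commit: the names amax_index_unrolled5, remove_nan and update_max are hypothetical, the policy of mapping NaN to -1.0f (so that a NaN is never selected) is an assumption about what "remove nan" does, and the subtract-and-test-sign comparison merely stands in for the vector subs sequence.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical helper: replace NaN with -1.0f. Absolute values are
 * never negative, so a replaced NaN can never win the comparison.
 * (Assumed policy; the real kernel may treat NaN differently.) */
static float remove_nan(float v)
{
    return isnan(v) ? -1.0f : v;
}

/* Compare by subtracting and testing the sign of the difference.
 * The strict test keeps the first occurrence on ties. */
static void update_max(float a, size_t i, float *max_val, size_t *max_idx)
{
    if (a - *max_val > 0.0f) {
        *max_val = a;
        *max_idx = i;
    }
}

/* Index of the first element with the largest absolute value.
 * Returns 0 for an empty vector. */
size_t amax_index_unrolled5(const float *x, size_t n)
{
    if (n == 0) {
        return 0;
    }

    float  max_val = remove_nan(fabsf(x[0]));
    size_t max_idx = 0;
    size_t i = 1;

    /* Main loop, unrolled by a factor of 5. */
    for (; i + 5 <= n; i += 5) {
        update_max(remove_nan(fabsf(x[i])),     i,     &max_val, &max_idx);
        update_max(remove_nan(fabsf(x[i + 1])), i + 1, &max_val, &max_idx);
        update_max(remove_nan(fabsf(x[i + 2])), i + 2, &max_val, &max_idx);
        update_max(remove_nan(fabsf(x[i + 3])), i + 3, &max_val, &max_idx);
        update_max(remove_nan(fabsf(x[i + 4])), i + 4, &max_val, &max_idx);
    }

    /* Remainder loop for the last 0 to 4 elements. */
    for (; i < n; ++i) {
        update_max(remove_nan(fabsf(x[i])), i, &max_val, &max_idx);
    }

    return max_idx;
}
```

In a vectorized kernel each unrolled step would operate on a full register of floats rather than one element, but the structure (unrolled main loop, remainder loop, NaN cleanup before the comparison) is the same.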

Change-Id: I191725a55bc33edf8d537156292cf997d6a5fe35
2021-09-27 16:10:08 +05:30