blis/kernels
Abhiram S 7787bc79b1 Level1 samaxv: AVX512 implementation
Details:
 1. Unrolled the loop by a factor of 5, which gave a gain of
    around 1 GFLOPS.
 2. Replaced CMP with subs and remove nan. The CMP approach needs
    many compare instructions, which have higher latency. Using
    subs and remove nan instead reduces this to 3 lighter
    instructions.
 3. Added the remove nan function.
 4. Added the AVX512 definition to the skx context.
 5. Disabled code in the AMAXV kernel depending on whether the
    AVX512 flag exists.
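
The ideas in items 1 to 3 can be illustrated with a minimal scalar sketch in C. This is not the kernel's code. The names amaxv_sketch, remove_nan and update_max are hypothetical, and the NaN policy shown (replacing NaN with -1.0f so it can never beat an absolute value) is an assumption made for illustration; the actual AVX512 kernel works on vector registers and may treat NaN differently.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical helper: map NaN to -1.0f. Absolute values are never
   negative, so a replaced NaN can never be selected as the maximum.
   (Assumed policy, for illustration only.) */
static float remove_nan(float v)
{
    return isnan(v) ? -1.0f : v;
}

/* Hypothetical helper: update the running maximum from the sign of a
   subtraction. This only works once NaN has been removed, because any
   difference involving NaN is itself NaN. */
static void update_max(float cand, size_t cand_idx,
                       float *max_val, size_t *max_idx)
{
    float diff = cand - *max_val;
    if (diff > 0.0f) {          /* strictly larger: the first maximum wins ties */
        *max_val = cand;
        *max_idx = cand_idx;
    }
}

/* Returns the index of the element with the largest absolute value.
   Returns 0 when n is 0 or when every element is NaN. */
size_t amaxv_sketch(const float *x, size_t n)
{
    float  max_val = -1.0f;     /* below every possible absolute value */
    size_t max_idx = 0;
    size_t i = 0;

    /* Main loop, unrolled by a factor of 5. */
    for (; i + 5 <= n; i += 5) {
        update_max(remove_nan(fabsf(x[i])),     i,     &max_val, &max_idx);
        update_max(remove_nan(fabsf(x[i + 1])), i + 1, &max_val, &max_idx);
        update_max(remove_nan(fabsf(x[i + 2])), i + 2, &max_val, &max_idx);
        update_max(remove_nan(fabsf(x[i + 3])), i + 3, &max_val, &max_idx);
        update_max(remove_nan(fabsf(x[i + 4])), i + 4, &max_val, &max_idx);
    }

    /* Remainder loop for the last n % 5 elements. */
    for (; i < n; ++i) {
        update_max(remove_nan(fabsf(x[i])), i, &max_val, &max_idx);
    }

    return max_idx;
}
```

In the sketch, the unrolled body handles five elements per iteration and the trailing loop picks up the rest. In scalar code the sign test is still a comparison, so the instruction-count saving described in item 2 only appears in the vectorized kernel.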

Change-Id: I191725a55bc33edf8d537156292cf997d6a5fe35
2021-09-27 16:10:08 +05:30