blis/kernels at 4f96bb712e8f9f14e097a353f53c860e59cc60bc - blis - Public git mirror

amd/blis

mirror of https://github.com/amd/blis.git synced 2026-05-12 18:15:37 +00:00

Vignesh Balasubramanian 2ad25a7180 ZGEMM kernel performance improvement for k=1 sizes:

The current zgemm implementation exploits SIMD parallelism along the k
dimension, which performs well when k is large. For input sizes with
k=1, however, it is better to exploit SIMD parallelism along the m and
n dimensions. This commit does so by reordering the loops and loading
column vectors from A.
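
For illustration only, here is a minimal sketch of why k=1 favours vectorizing along m. It is not the kernel from this commit. With k=1 the product A*B collapses to a rank-1 update of C, so the innermost loop can walk contiguously down one column of A and one column of C. The function name `zgemm_k1_sketch`, the column-major layout, and the accumulate-only form (beta fixed at 1) are assumptions made for this example and are not BLIS APIs.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Hypothetical helper (not part of BLIS): C += alpha * a * b^T for k = 1.
 * a holds the single column of A (m entries), b holds the single row of B
 * (n entries), and C is column-major with leading dimension ldc. */
static void zgemm_k1_sketch(size_t m, size_t n, double complex alpha,
                            const double complex *a,
                            const double complex *b,
                            double complex *c, size_t ldc)
{
    assert(ldc >= m);

    for (size_t j = 0; j < n; ++j) {
        /* One scalar per column of C: alpha times the j-th element of B. */
        const double complex ab = alpha * b[j];
        double complex *cj = c + j * ldc;

        /* Contiguous loads from a and cj: this is the loop a SIMD kernel
         * would vectorize along the m dimension. */
        for (size_t i = 0; i < m; ++i) {
            cj[i] += ab * a[i];
        }
    }
}
```

With k=1 there is nothing to reduce over along k, so a k-oriented SIMD loop would leave most vector lanes idle, whereas the m-oriented inner loop above keeps them filled.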

AMD-Internal: [CPUPL-2236]
Change-Id: Ibfa29f271395497b6e2d0127c319ecb4b883d19f

2022-06-30 07:19:52 -04:00

New kernel set for Arm SVE using assembly (#396)

2020-05-21 11:56:45 +05:30

Squash-merge 'pr' into 'squash'. (#457)

2020-11-14 09:39:48 -06:00

avoid loading twice in armv8a gemm kernel (#403)

2020-05-21 12:37:53 +05:30

Replaced use of bool_t type with C99 bool.

2020-08-03 11:27:13 +05:30

Remove UT-Austin from copyright headers' clause 3.

2018-12-04 14:31:06 -06:00

Added a dummy file to kernels/generic.

2017-11-21 12:34:20 -06:00

BLIS: Compiler warning fixes

2021-11-12 08:58:52 +05:30

Remove UT-Austin from copyright headers' clause 3.

2018-12-04 14:31:06 -06:00

Type saga continues; fixed sgemm ukernel signature.

2020-09-12 17:48:15 -05:00

Remove UT-Austin from copyright headers' clause 3.

2018-12-04 14:31:06 -06:00

Optionally disable trsm diagonal pre-inversion.

2020-12-04 16:08:15 -06:00

BLIS library porting on to Windows:

2020-06-16 18:29:00 +05:30

Remove UT-Austin from copyright headers' clause 3.

2018-12-04 14:31:06 -06:00

Add POWER10 support to BLIS (#450)

2020-09-29 16:52:18 -05:00

Fixed bug in power10 microkernel I/O. (#488)

2021-03-30 19:07:42 -05:00

Remove UT-Austin from copyright headers' clause 3.

2018-12-04 14:31:06 -06:00

Added support for AVX512 for Windows and AMAXV

2022-06-08 11:09:48 +05:30

ZGEMM kernel performance improvement for k=1 sizes:

2022-06-30 07:19:52 -04:00

BLIS:merge:

2021-04-27 11:09:48 +05:30

Added support for zen3 configuration

2020-07-22 18:24:26 +05:30

DAMAXV AVX512 micro kernel bug fix.

2022-06-13 10:52:53 +05:30

CMakeLists.txt

Added support for AVX512 for Windows and AMAXV

2022-06-08 11:09:48 +05:30