blis/frame/compat at 5bdf5e2aaa5a36282cd5d7829c7e9ee4e5a8c569 - blis

amd/blis

mirror of https://github.com/amd/blis.git synced 2026-05-13 18:52:14 +00:00


Vignesh Balasubramanian 758ec3b5ca ZGEMM optimizations for cases with k = 1

- Implemented the bli_zgemm_4x4_avx2_k1_nn( ... ) kernel to replace
  the bli_zgemm_4x6_avx2_k1_nn( ... ) kernel in the BLAS layer of
  ZGEMM. The kernel handles GEMM computations where k = 1 and
  neither A nor B is transposed (both transpose parameters are N).

- The kernel dimension has been changed from 4x6 to 4x4 for the
  following reasons:

  - The 1xNR block of B in the n-loop can be reused over multiple
    MRx1 blocks of A in the m-loop during computation. The same
    applies to the fringe cases.

  - Every 1xNR block of B is scaled by alpha and stored in
    registers before traversing the m-dimension. The same change
    was made for the fringe cases in the n-dimension.

  - These registers must not be modified during the computation,
    hence the kernel dimension was changed from 4x6 to 4x4.

- The early-exit check (with regard to the BLAS mandate) has been
  removed, since it is already present in the BLAS layer.

- The check for parallel ZGEMM has been moved to after the
  redirection to this kernel, since the kernel is single-threaded.

- The bli_kernels_zen.h file was updated with the new kernel signature.
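
The blocking scheme described above can be illustrated with a minimal scalar sketch in C. This is not the actual bli_zgemm_4x4_avx2_k1_nn kernel: it uses no AVX2 intrinsics, and the function name zgemm_k1_nn_sketch, its parameter list, and the MR/NR constants are hypothetical, chosen only for illustration. It assumes column-major storage, so A is an m x 1 column and B is a 1 x n row whose elements are ldb apart. The point it shows is that each 1xNR block of B is scaled by alpha once, then reused unchanged across every MRx1 block of A.

```c
#include <complex.h>
#include <stddef.h>

/* Illustrative block sizes mirroring the 4x4 shape (assumption). */
enum { MR = 4, NR = 4 };

/* Hypothetical reference routine, not the BLIS kernel:
 * C := beta*C + alpha*A*B for k = 1, column-major, no transposes.
 * A is m x 1 (contiguous), B is 1 x n (stride ldb), C is m x n. */
static void zgemm_k1_nn_sketch(size_t m, size_t n,
                               double complex alpha,
                               const double complex *a,
                               const double complex *b, size_t ldb,
                               double complex beta,
                               double complex *c, size_t ldc)
{
    for (size_t j0 = 0; j0 < n; j0 += NR) {
        /* Fringe case in the n-dimension: fewer than NR columns left. */
        size_t nb = (n - j0 < NR) ? (n - j0) : NR;

        /* Scale the 1xNR block of B by alpha once, before the m-loop.
         * In a vectorized kernel these values would stay in registers. */
        double complex sb[NR];
        for (size_t j = 0; j < nb; ++j)
            sb[j] = alpha * b[(j0 + j) * ldb];

        for (size_t i0 = 0; i0 < m; i0 += MR) {
            /* Fringe case in the m-dimension. */
            size_t mb = (m - i0 < MR) ? (m - i0) : MR;

            /* Rank-1 update of an mb x nb tile; sb is only read here. */
            for (size_t j = 0; j < nb; ++j) {
                for (size_t i = 0; i < mb; ++i) {
                    double complex *cij = &c[(i0 + i) + (j0 + j) * ldc];
                    /* When beta is zero, C is overwritten without
                     * being read, as BLAS semantics require. */
                    double complex scaled = (beta == 0.0) ? 0.0
                                                          : beta * (*cij);
                    *cij = scaled + a[i0 + i] * sb[j];
                }
            }
        }
    }
}
```

Calling zgemm_k1_nn_sketch(m, n, alpha, a, b, ldb, beta, c, ldc) with ldb >= 1 and ldc >= m updates C in place. Because the scaled B values have to stay live for the whole m-loop, a vectorized version has to set aside storage for them, which is the constraint the commit message cites for the 4x4 shape.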

AMD-Internal: [CPUPL-3622]
Change-Id: Iaf03b00d5075dd74cc412290d77a401986ba0bea

2023-08-07 15:10:08 +05:30

attic

AOCL Windows: 3.1 BLIS changes

2021-03-08 19:04:17 +05:30

blis

BLIS: Nested parallelism issues

2022-10-21 07:38:39 -04:00

cblas

BLIS: Incorrect ifdef in cblas.h and cblas_f77.h

2023-06-07 06:52:57 -04:00

check

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

f2c

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_amax_amd.c

Code cleanup: spelling corrections

2023-04-19 12:44:56 -04:00

bla_amax.c

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_amax.h

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_amin.c

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_amin.h

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_asum.c

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_asum.h

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_axpby.c

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_axpby.h

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_axpy_amd.c

Partial completion of work in L1 APIs

2023-04-27 15:17:26 +05:30

bla_axpy.c

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_axpy.h

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_copy_amd.c

Code cleanup: spelling corrections

2023-04-19 12:44:56 -04:00

bla_copy.c

Fixed compilation errors for generic configuration

2023-04-18 00:27:05 -04:00

bla_copy.h

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_dot_amd.c

Incorrect accumulation of results in DDOTV

2023-05-04 10:44:15 +05:30

bla_dot.c

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_dot.h

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_gemm3m.c

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_gemm3m.h

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_gemm_amd.c

ZGEMM optimizations for cases with k = 1

2023-08-07 15:10:08 +05:30

bla_gemm_batch.c

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_gemm_batch.h

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_gemm.c

GEMM: Early return when alpha = zero

2023-03-23 09:24:41 -04:00

bla_gemm.h

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_gemmt.c

Added NT in DTL logs for GEMMT, TRSM and NRM2

2023-07-27 05:15:08 -04:00

bla_gemmt.h

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_gemv_amd.c

Code cleanup: spelling corrections

2023-04-19 12:44:56 -04:00

bla_gemv.c

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_gemv.h

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_ger.c

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_ger.h

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_hemm.c

BLAS compliance for Level-3 routines

2023-03-29 04:36:00 -05:00

bla_hemm.h

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_hemv.c

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_hemv.h

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_her2.c

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_her2.h

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_her2k.c

BLAS compliance for Level-3 routines

2023-03-29 04:36:00 -05:00

bla_her2k.h

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_her.c

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_her.h

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_herk.c

BLAS compliance for Level-3 routines

2023-03-29 04:36:00 -05:00

bla_herk.h

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_imatcopy.c

Code cleanup: No newline at end of file

2023-04-21 10:02:48 -04:00

bla_imatcopy.h

Code cleanup: No newline at end of file

2023-04-21 10:02:48 -04:00

bla_nrm2.c

Added NT in DTL logs for GEMMT, TRSM and NRM2

2023-07-27 05:15:08 -04:00

bla_nrm2.h

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_omatadd.c

Code cleanup: No newline at end of file

2023-04-21 10:02:48 -04:00

bla_omatadd.h

Added Blas interface for ?imatcopy, ?omatcopy, ?omatadd, ?omatcopy2

2020-11-18 12:55:36 +05:30

bla_omatcopy2.c

Code cleanup: No newline at end of file

2023-04-21 10:02:48 -04:00

bla_omatcopy2.h

Code cleanup: No newline at end of file

2023-04-21 10:02:48 -04:00

bla_omatcopy.c

Code cleanup: No newline at end of file

2023-04-21 10:02:48 -04:00

bla_omatcopy.h

Code cleanup: No newline at end of file

2023-04-21 10:02:48 -04:00

bla_scal_amd.c

Added AVX512 ZDSCALV kernel

2023-08-06 01:51:47 -04:00

bla_scal.c

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_scal.h

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_swap_amd.c

Code cleanup: spelling corrections

2023-04-19 12:44:56 -04:00

bla_swap.c

Fixed compilation errors for generic configuration

2023-04-18 00:27:05 -04:00

bla_swap.h

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_symm.c

BLAS compliance for Level-3 routines

2023-03-29 04:36:00 -05:00

bla_symm.h

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_symv.c

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_symv.h

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_syr2.c

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_syr2.h

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_syr2k.c

BLAS compliance for Level-3 routines

2023-03-29 04:36:00 -05:00

bla_syr2k.h

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_syr.c

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_syr.h

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_syrk.c

BLAS compliance for Level-3 routines

2023-03-29 04:36:00 -05:00

bla_syrk.h

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_trmm.c

BLAS compliance for Level-3 routines

2023-03-29 04:36:00 -05:00

bla_trmm.h

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_trmv.c

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_trmv.h

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_trsm_amd.c

Added NT in DTL logs for GEMMT, TRSM and NRM2

2023-07-27 05:15:08 -04:00

bla_trsm.c

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_trsm.h

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_trsv.c

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bla_trsv.h

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

bli_blas.h

Fixed Compilation Fails when configured with --disable-blas

2023-03-23 06:11:52 -04:00

CMakeLists.txt

Added support for AVX512 for Windows and AMAVX

2022-06-08 11:09:48 +05:30