blis/frame/include at 8b486e8d149b5928db9eebb1d534954b4ef3aee4 - blis - Public git mirror

amd/blis

mirror of https://github.com/amd/blis.git synced 2026-05-24 18:34:40 +00:00

Shubham Sharma 16c56e0101 Added 24x8 triangular kernels for DGEMMT SUP

- To reuse the 24x8 AVX512 DGEMM SUP kernels, 24x8 triangular
  AVX512 DGEMMT SUP kernels are added.
- Since the LCM of MR (24) and NR (8) is 24, the diagonal pattern
  repeats every 24x24 block of C. Covering this 24x24 block takes
  3 kernels for one variant of DGEMMT, so a total of 6 kernels
  are needed to cover both the upper and lower variants.
- To maximize code reuse, each 24x8 kernel is split into two
  parts: an 8x8 diagonal GEMM and a 16x8 full GEMM. The 8x8
  diagonal part is computed by an 8x8 diagonal kernel, and the
  16x8 full part is computed by the 24x8 DGEMM SUP kernel.
- Changes are made in the framework to enable the use of these kernels.

AMD-Internal: [CPUPL-5338]
Change-Id: I8e7007031e906f786b0c4fe12377ee439075207a

2024-07-22 12:02:30 -04:00

Removed support for 3m, 4m induced methods.

2021-10-28 16:05:43 -05:00

bli_arch_config_pre.h

Allow use of 1m with mixing of row/col-pref ukrs.

2021-10-13 14:15:38 -05:00

bli_arch_config.h

Merge commit '81e10346' into amd-main

2024-06-25 05:48:46 -04:00

bli_blas_macro_defs.h

Remove UT-Austin from copyright headers' clause 3.

2018-12-04 14:31:06 -06:00

bli_builtin_macro_defs.h

Rewrote reference kernels to use #pragma omp simd.

2019-01-24 17:23:18 -06:00

bli_complex_macro_defs.h

Remove UT-Austin from copyright headers' clause 3.

2018-12-04 14:31:06 -06:00

bli_config_macro_defs.h

Code cleanup: AMD copyright notice

2023-11-23 08:54:31 -05:00

bli_error_macro_defs.h

bli_error: more cleanup on the error strings array

2021-09-20 10:39:05 +02:00

bli_extern_defs.h

Export functions without def file (#303)

2019-03-11 19:05:32 -05:00

bli_f2c.h

Allow lesser Makefiles to reference installed BLIS.

2018-08-25 20:12:36 -05:00

bli_genarray_macro_defs.h

Code cleanup: AMD copyright notice

2023-11-23 08:54:31 -05:00

bli_gentdef_macro_defs.h

Remove UT-Austin from copyright headers' clause 3.

2018-12-04 14:31:06 -06:00

bli_gentfunc_macro_defs.h

Added 24x8 triangular kernels for DGEMMT SUP

2024-07-22 12:02:30 -04:00

bli_gentprot_macro_defs.h

Code cleanup: AMD copyright notice

2023-11-23 08:54:31 -05:00

bli_kernel_macro_defs.h

Removed dependency on AOCL_BLIS_ZEN for TRSM blocksizes.

2021-08-16 00:12:33 -04:00

bli_lang_defs.h

Code cleanup: AMD copyright notice

2023-11-23 08:54:31 -05:00

bli_macro_defs.h

Code cleanup: AMD copyright notice

2023-11-23 08:54:31 -05:00

bli_misc_macro_defs.h

Replaced use of bool_t type with C99 bool.

2020-08-03 11:27:13 +05:30

bli_oapi_ba.h

Use extra #undef when including ba/ex API headers.

2021-05-13 15:23:22 -05:00

bli_oapi_ex.h

Use extra #undef when including ba/ex API headers.

2021-05-13 15:23:22 -05:00

bli_oapi_macro_defs.h

Remove UT-Austin from copyright headers' clause 3.

2018-12-04 14:31:06 -06:00

bli_obj_macro_defs.h

Code cleanup: AMD copyright notice

2023-11-23 08:54:31 -05:00

bli_param_macro_defs.h

Merge commit 'cfa3db3f' into amd-main

2024-07-08 06:09:11 -04:00

bli_pragma_macro_defs.h

Generalized ref kernels' pragma omp simd usage.

2019-02-12 16:01:28 -06:00

bli_sbox.h

Remove UT-Austin from copyright headers' clause 3.

2018-12-04 14:31:06 -06:00

bli_scalar_macro_defs.h

Removed support for 3m, 4m induced methods.

2021-10-28 16:05:43 -05:00

bli_system.h

Code cleanup: AMD copyright notice

2023-11-23 08:54:31 -05:00

bli_tapi_ba.h

Use extra #undef when including ba/ex API headers.

2021-05-13 15:23:22 -05:00

bli_tapi_ex.h

Use extra #undef when including ba/ex API headers.

2021-05-13 15:23:22 -05:00

bli_tapi_macro_defs.h

Remove UT-Austin from copyright headers' clause 3.

2018-12-04 14:31:06 -06:00

bli_trsm_small_ref.h

Added DTRSM Small Path AVX512 based LLNN/LUTN Variant Kernels

2023-04-07 08:50:28 +00:00

bli_type_defs.h

Merge commit 'cfa3db3f' into amd-main

2024-07-08 06:09:11 -04:00

bli_x86_asm_macros.h

DGEMM optimizations for Turin Classic

2024-07-09 07:53:27 -04:00

bli_xapi_undef.h

Minor preprocessor/header cleanup.

2021-05-13 13:55:11 -05:00

blis.h

CMake: Added support for ADDON(aocl_gemm) on Windows

2024-03-14 07:57:02 -04:00