blis/frame/include at e6cc2a3e227a3f38f4a3c9edf22d003131878fc2 - blis - Public git mirror

amd/blis

mirror of https://github.com/amd/blis.git synced 2026-05-24 10:24:34 +00:00


Mangala V e6cc2a3e22 ZGEMMT SUP Optimizations for AVX512

Existing Design:
 - GEMM AVX2 kernel performs computation and updates temporary C buffer
 - Portion of temporary C buffer is copied to output C buffer
   based on UPLO parameter
 - For diagonal blocks, using GEMM kernels is not efficient

New Design: Implemented in current patch when UPLO='L'
 - GEMMT kernel is used for computation; no temporary buffer is required.
 - Only the required elements are computed, using masked loads and stores
   for all fringe cases
 - Exception: AVX2 code path is used when storage format is RRC, CRR, CRC
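
The lower-triangular update described above can be illustrated with a minimal scalar sketch. It is not the AVX512 kernel from this patch and not a BLIS API: the function name `zgemmt_lower_ref`, its signature, and the column-major layout are assumptions made for illustration. It shows only which elements of C are computed when UPLO='L', written directly into C with no temporary buffer.

```c
#include <complex.h>
#include <stddef.h>

/* Hypothetical reference routine (not part of BLIS):
   C := alpha*A*B + beta*C, updating only the lower triangle (i >= j)
   of the n x n matrix C. A is n x k, B is k x n.
   All matrices are assumed to be column-major. */
static void zgemmt_lower_ref(size_t n, size_t k,
                             double complex alpha,
                             const double complex *a, size_t lda,
                             const double complex *b, size_t ldb,
                             double complex beta,
                             double complex *c, size_t ldc)
{
    for (size_t j = 0; j < n; ++j) {
        /* Start at the diagonal: elements above it are never read or written. */
        for (size_t i = j; i < n; ++i) {
            double complex acc = 0.0;
            for (size_t p = 0; p < k; ++p)
                acc += a[i + p * lda] * b[p + j * ldb];
            /* Result goes straight into C; no temporary C buffer is used. */
            c[i + j * ldc] = alpha * acc + beta * c[i + j * ldc];
        }
    }
}
```

A vectorized kernel would process several rows of a column at once; for the rows near the diagonal and at the matrix edge, a mask would play the role of the `i = j` loop start here.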

- AOCL-Dynamic is added based on dimension
- A check for the AVX platform is added in the SUP interface. It falls
  back to the native implementation if the hardware does not support AVX
- SUP ref_var2m is expanded for the dcomplex datatype to avoid the
  condition check that exists for the double datatype

AMD_Internal: [CPUPL-5006]

Change-Id: I3e21404b732b8f2df9cbdba394303752fdf36286

2024-05-07 23:00:29 +05:30

..

Declare/define static functions via BLIS_INLINE.

2020-08-03 11:23:40 +05:30

bli_arch_config_pre.h

Removed export macros from all internal prototypes.

2020-08-03 11:47:18 +05:30

bli_arch_config.h

BLIS: Implement zen5 sub-configuration

2024-04-12 07:26:31 -04:00

bli_blas_macro_defs.h

Remove UT-Austin from copyright headers' clause 3.

2018-12-04 14:31:06 -06:00

bli_builtin_macro_defs.h

Rewrote reference kernels to use #pragma omp simd.

2019-01-24 17:23:18 -06:00

bli_complex_macro_defs.h

Remove UT-Austin from copyright headers' clause 3.

2018-12-04 14:31:06 -06:00

bli_config_macro_defs.h

Code cleanup: AMD copyright notice

2023-11-23 08:54:31 -05:00

bli_error_macro_defs.h

Remove UT-Austin from copyright headers' clause 3.

2018-12-04 14:31:06 -06:00

bli_extern_defs.h

Export functions without def file (#303)

2019-03-11 19:05:32 -05:00

bli_f2c.h

Allow lesser Makefiles to reference installed BLIS.

2018-08-25 20:12:36 -05:00

bli_genarray_macro_defs.h

Code cleanup: AMD copyright notice

2023-11-23 08:54:31 -05:00

bli_gentdef_macro_defs.h

Remove UT-Austin from copyright headers' clause 3.

2018-12-04 14:31:06 -06:00

bli_gentfunc_macro_defs.h

ZGEMMT SUP Optimizations for AVX512

2024-05-07 23:00:29 +05:30

bli_gentprot_macro_defs.h

Code cleanup: AMD copyright notice

2023-11-23 08:54:31 -05:00

bli_kernel_macro_defs.h

Removed dependency on AOCL_BLIS_ZEN for TRSM blocksizes.

2021-08-16 00:12:33 -04:00

bli_lang_defs.h

Code cleanup: AMD copyright notice

2023-11-23 08:54:31 -05:00

bli_macro_defs.h

Code cleanup: AMD copyright notice

2023-11-23 08:54:31 -05:00

bli_misc_macro_defs.h

Replaced use of bool_t type with C99 bool.

2020-08-03 11:27:13 +05:30

bli_oapi_ba.h

Use extra #undef when including ba/ex API headers.

2021-05-13 15:23:22 -05:00

bli_oapi_ex.h

Use extra #undef when including ba/ex API headers.

2021-05-13 15:23:22 -05:00

bli_oapi_macro_defs.h

Remove UT-Austin from copyright headers' clause 3.

2018-12-04 14:31:06 -06:00

bli_obj_macro_defs.h

Code cleanup: AMD copyright notice

2023-11-23 08:54:31 -05:00

bli_param_macro_defs.h

Code cleanup: AMD copyright notice

2023-11-23 08:54:31 -05:00

bli_pragma_macro_defs.h

Generalized ref kernels' pragma omp simd usage.

2019-02-12 16:01:28 -06:00

bli_sbox.h

Remove UT-Austin from copyright headers' clause 3.

2018-12-04 14:31:06 -06:00

bli_scalar_macro_defs.h

Added support for pre-broadcast when packing B.

2019-09-17 17:42:10 -05:00

bli_system.h

Code cleanup: AMD copyright notice

2023-11-23 08:54:31 -05:00

bli_tapi_ba.h

Use extra #undef when including ba/ex API headers.

2021-05-13 15:23:22 -05:00

bli_tapi_ex.h

Use extra #undef when including ba/ex API headers.

2021-05-13 15:23:22 -05:00

bli_tapi_macro_defs.h

Remove UT-Austin from copyright headers' clause 3.

2018-12-04 14:31:06 -06:00

bli_trsm_small_ref.h

Added DTRSM Small Path AVX512 based LLNN/LUTN Variant Kernels

2023-04-07 08:50:28 +00:00

bli_type_defs.h

BLIS: zen5 cpuid and arch changes

2024-01-17 11:41:15 -05:00

bli_x86_asm_macros.h

Code cleanup: AMD copyright notice

2023-11-23 08:54:31 -05:00

bli_xapi_undef.h

Minor preprocessor/header cleanup.

2021-05-13 13:55:11 -05:00

blis.h

CMake: Added support for ADDON(aocl_gemm) on Windows

2024-03-14 07:57:02 -04:00