Files
blis/config
Kiran Varaganti 201db7883c Integrated 32x6 DGEMM kernel for zen4 and its related changes are added.
Details:
- Now AOCL BLIS uses AX512 - 32x6 DGEMM kernel for native code path.
  Thanks to Moore, Branden <Branden.Moore@amd.com> for suggesting and
  implementing these optimizations.
- In the initial version of 32x6 DGEMM kernel, to broadcast elements of B packed
  we perform load into xmm (2 elements), broadcast into zmm from xmmm and then to get the
  next element, we do vpermilpd(xmm). This logic is replaced with direct broadcast from
  memory, since the elements of Bpack are stored contiguously, the first broadcast fetches
  the cacheline and then subsequent broadcasts happen faster. We use two registers for broadcast
  and interleave broadcast operation with FMAs to hide any memory latencies.
- Native dTRSM uses 16x14 dgemm - therefore we need to override the default blkszs (MR,NR,..)
  when executing trsm. we call bli_zen4_override_trsm_blkszs(cntx_local) on a local cntx_t object
  for double data-type as well in the function bli_trsm_front(), bli_trsm_xx_ker_var2, xx = {ll,lu,rl,ru}.
  Renamed "BLIS_GEMM_AVX2_UKR" to "BLIS_GEMM_FOR_TRSM_UKR" and in the bli_cntx_init_zen4() we replaced
  dgemm kernel for TRSM with 16x14 dgemm kernel.
- New packm kernels - 16xk, 24xk and 32xk are added.
- New 32xk packm reference kernel is added in bli_packm_cxk_ref.c and it is
  enabled for zen4 config (bli_dpackm_32xk_zen4_ref() )
- Copyright year updated for modified files.
- cleaned up code for "zen" config - removed unused packm kernels declaration in kernels/zen/bli_kernels.h
- [SWLCSG-1374], [CPUPL-2918]

Change-Id: I576282382504b72072a6db068eabd164c8943627
2023-01-19 23:11:36 +05:30
..
2022-09-20 06:05:01 -04:00
2021-04-27 11:09:48 +05:30
2021-04-27 11:09:48 +05:30
2023-01-09 04:34:52 -05:00
2022-10-14 12:43:35 +05:30

For more information on sub-configurations and configuration families in BLIS, please read the Configuration Guide, which can be viewed in markdown-rendered form from the BLIS wiki page.

If you don't have time, or are impatient, take a look at the config_registry file in the top-level directory of the BLIS distribution. It contains a grammar-like mapping of configuration names, or families, to sub-configurations, which may be other families. Keep in mind that the / notation:

<config>: <config>/<name>

means that the kernel set associated with <name> should be made available to the configuration <config> if <config> is targeted at configure-time. (Some configurations borrow kernels from other configurations, and this is how we specify that requirement.)