Mirror of https://github.com/amd/blis.git, synced 2026-05-12 18:15:37 +00:00
[CPUPL-858] Added DGEMM-specific packing kernels for the 6x8 dgemm kernel in the zen2 configuration.

In addition to the generic packing kernels, which are used by the level-3 routines and handle all combinations of input parameters, this change introduces DGEMM-specific packing kernels for the case where neither op(A) nor op(B) is a transpose. Restricting the kernels to this case lets us vectorize them and eliminate unnecessary conditional branches. The packing kernels are also optimized at the boundaries, which helps when the input matrix dimensions "m" and "n" are not multiples of the register block sizes "MR" and "NR".

A typical DGEMM operation is C = beta*C + alpha*op(A)*op(B). Note that the multiplication by alpha is handled inside the kernel, so in these DGEMM packing routines alpha is always taken to be 1.0.

The new routines are "bli_dpackm_8xk_nn_zen" and "bli_dpackm_6xk_nn_zen". The generic packing routines are "bli_dpackm_6xk_gen_zen" and "bli_dpackm_8xk_gen_zen". These routines are enabled from "bli_cntx_init_zen2()" through bli_cntx_set_packm_kers(). In this checkout, the generic packing kernels are enabled by default. A run-time mechanism for selecting the packing kernels based on the DGEMM input parameters will be introduced later.

Change-Id: I079b4dce0757d558224cb8c55d024bfea6a4de91
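To make the no-transpose packing case concrete, here is a minimal scalar sketch in C. It is not the code from this commit: the function name pack_6xk_nn_sketch, its argument list, and the assumption of a column-major A with unit row stride are all hypothetical and chosen only for illustration. It copies an m x k slice of A (m <= 6) into a packed micropanel that stores 6 contiguous values per column, applies no alpha scaling (alpha is treated as 1.0), and zero-fills the rows past m so that a boundary panel can be consumed like a full one.

```c
#include <stddef.h>

enum { MR = 6 };  /* register block size in the m dimension */

/*
 * Hypothetical helper, not the BLIS kernel signature.
 * a   : column-major source, element (i, j) at a[i + j * lda]
 * p   : packed destination holding k columns of MR doubles each
 * m   : number of valid rows in this panel, m <= MR
 */
static void pack_6xk_nn_sketch(size_t m, size_t k,
                               const double *a, size_t lda,
                               double *p)
{
    for (size_t j = 0; j < k; ++j) {
        const double *a_col = a + j * lda;
        double *p_col = p + j * MR;
        size_t i = 0;

        /* Unit-stride copy; no alpha scaling and no transpose checks. */
        for (; i < m; ++i)
            p_col[i] = a_col[i];

        /* Boundary case: pad the missing rows with zeros. */
        for (; i < MR; ++i)
            p_col[i] = 0.0;
    }
}
```

Because the inner copy is unit-stride and branch-free, a compiler or hand-written intrinsics can vectorize it, which is the kind of benefit the commit message describes for the dedicated kernels.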
For more information on sub-configurations and configuration families in BLIS, please read the Configuration Guide, which can be viewed in markdown-rendered form from the BLIS wiki page.
If you don't have time, or are impatient, take a look at the config_registry
file in the top-level directory of the BLIS distribution. It contains a
grammar-like mapping of configuration names, or families, to sub-configurations,
which may be other families. Keep in mind that the / notation:
    <config>: <config>/<name>
means that the kernel set associated with <name> should be made available to
the configuration <config> if <config> is targeted at configure-time.
(Some configurations borrow kernels from other configurations, and this is how
we specify that requirement.)
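
As a rough illustration of how the / notation decomposes, the following C sketch splits one entry of the form shown above into its configuration name, sub-configuration, and borrowed kernel set. The helper split_registry_entry is hypothetical and is not part of the BLIS build system. It handles only a single sub-configuration per line, returns whatever follows the first / as one string, and makes no assumption about what an entry without a / implies (the kernel-set string is simply left empty).

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: splits "<config>: <sub>/<name>" into its parts.
 * Each output buffer must hold at least cap bytes.
 * Returns 0 on success, -1 if the line is malformed or a part does not fit.
 */
static int split_registry_entry(const char *line,
                                char *config, char *sub, char *kernels,
                                size_t cap)
{
    const char *colon = strchr(line, ':');
    if (colon == NULL || colon == line)
        return -1;

    /* Left-hand side: the configuration (or family) name. */
    size_t n = (size_t)(colon - line);
    if (n >= cap)
        return -1;
    memcpy(config, line, n);
    config[n] = '\0';

    /* Right-hand side: skip blanks, then take the first token. */
    const char *rhs = colon + 1;
    while (*rhs == ' ' || *rhs == '\t')
        ++rhs;
    size_t rhs_len = strcspn(rhs, " \t\r\n");

    /* Text before the first '/' is the sub-configuration. */
    const char *slash = memchr(rhs, '/', rhs_len);
    size_t sub_len = slash ? (size_t)(slash - rhs) : rhs_len;
    if (sub_len == 0 || sub_len >= cap)
        return -1;
    memcpy(sub, rhs, sub_len);
    sub[sub_len] = '\0';

    /* Text after the first '/' names the borrowed kernel set(s). */
    kernels[0] = '\0';
    if (slash) {
        size_t k_len = rhs_len - sub_len - 1;
        if (k_len >= cap)
            return -1;
        memcpy(kernels, slash + 1, k_len);
        kernels[k_len] = '\0';
    }
    return 0;
}
```

For a made-up entry such as "foo: foo/bar", the helper reports configuration foo, sub-configuration foo, and kernel set bar, meaning that bar's kernels should be made available when foo is targeted at configure-time.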