mirror of
https://github.com/amd/blis.git
synced 2026-05-24 18:34:40 +00:00
config => config/build/arch folder
Issue:
1. A performance drop is observed in the fat binary (amdzen config),
which is built to support all platforms through the dynamic dispatch feature.
2. The drop is observed only in intrinsic code, not in assembly code.
3. It affects many of the level-1 kernels on Milan and Genoa.
Previous Design:
Znver flags are picked based on the config name or the function name.
In case of ref_kernels:
The compiler picks up the znver flag based on the function name. All
ref_kernels are named using BLIS_CNAME, which is a
config name (zen, zen2, zen3, zen4, zen5).
In case of Zen kernels:
The compiler picks up the znver flag based on the config name of the folder
where the source file exists. All avx2 kernels are placed in the zen folder
and all avx512 kernels are placed in the zen4/zen5 folders.
As a result, kernels placed in zen (AVX2 kernels) are compiled with the znver1
flag rather than the znver2/znver3 flags on zen2/zen3 architectures,
respectively.
New Design: For amdzen builds
1. For ref_kernels and kernels/(zen/zen2/zen3), the znver2 flag is used instead of
znver1 in both the make and cmake build systems.
2. To use the znver2 flags, make_defs.mk of zen2 is included in the zen config.
3. No changes are made for auto or any individual config.
4. Significant performance improvement is observed.
AMD-Internal : [CPUPL-5407] [CPUPL-5406] [CPUPL-4873] [CPUPL-4872] [CPUPL-4871] [CPUPL-4801] [CPUPL-4800] [CPUPL-4799]
Change-Id: Ie817c13b8b69a2dc4328aad7ae09a3af06f83df5
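The flag choice described in the commit message above can be modelled in a few lines. This is only an illustrative sketch, not the build-system code: the function name `znver_flag`, the table names, and the zen2/zen3 rows of the "previous" table are assumptions (the message only states explicitly that kernels in zen were compiled with znver1). It reads item 1 of the new design literally, so all three kernel folders use znver2 in amdzen builds.

```python
# Hypothetical model of the znver flag selection; names are illustrative only.

# Previous design: the flag follows the folder (config) name.
# Only the "zen" -> "znver1" row is stated in the commit message;
# the other two rows are assumed from the naming pattern.
PREVIOUS_FLAG_BY_DIR = {"zen": "znver1", "zen2": "znver2", "zen3": "znver3"}

# New design: these kernel folders use znver2, but only in amdzen builds.
AMDZEN_ZNVER2_DIRS = {"zen", "zen2", "zen3"}


def znver_flag(kernel_dir, build_config):
    """Return the znver flag name for a kernel folder under a given build config."""
    if build_config == "amdzen" and kernel_dir in AMDZEN_ZNVER2_DIRS:
        return "znver2"
    # auto and individual configs are unchanged (item 3 of the new design).
    return PREVIOUS_FLAG_BY_DIR[kernel_dir]


if __name__ == "__main__":
    for cfg in ("amdzen", "zen"):
        print(cfg, znver_flag("zen", cfg))
```

With this model, `znver_flag("zen", "amdzen")` gives `znver2`, while an individual `zen` build still gets `znver1`.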
For more information on sub-configurations and configuration families in BLIS, please read the Configuration Guide, which can be viewed in markdown-rendered form from the BLIS wiki page.
If you don't have time, or are impatient, take a look at the config_registry
file in the top-level directory of the BLIS distribution. It contains a
grammar-like mapping of configuration names, or families, to sub-configurations,
which may be other families. Keep in mind that the / notation:
<config>: <config>/<name>
means that the kernel set associated with <name> should be made available to
the configuration <config> if <config> is targeted at configure-time.
(Some configurations borrow kernels from other configurations, and this is how
we specify that requirement.)
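To make the / notation concrete, here is a minimal sketch of how one line of that form could be split into its parts. It is not the parser used by configure: the function name `parse_registry_line` and the sample names `myfam`, `cfga` and `cfgb` are hypothetical, and the sketch only records what is written on the line without modelling any defaults.

```python
# Hypothetical helper: split one "<config>: <item> <item>/<name>" line.

def parse_registry_line(line):
    """Return (config_name, [(sub_config, [kernel_set, ...]), ...])."""
    name, _, rhs = line.partition(":")
    entries = []
    for item in rhs.split():
        # Text before the first "/" is the sub-configuration; anything
        # after it names kernel sets made available to that configuration.
        sub, *kernel_sets = item.split("/")
        entries.append((sub, kernel_sets))
    return name.strip(), entries


if __name__ == "__main__":
    # "myfam", "cfga" and "cfgb" are made-up names for illustration.
    print(parse_registry_line("myfam: cfga cfgb/cfga"))
```

Here `cfgb/cfga` is read as "make the kernel set of `cfga` available to `cfgb`", which matches the description above.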