mirror of
https://github.com/amd/blis.git
synced 2026-05-27 07:54:19 +00:00
1. Added a new kernel, bli_dnorm2fv_unb_var1, to compute the norm
using a dot operation.
2. Added vectorization to compute the squares of the elements of
vector X in blocks of 32 doubles.
3. Defined a new macro, BLIS_ENABLE_DNRM2_FAST, in the config header
to compute nrm2 using the new kernel.
4. The dot-based kernel may introduce accuracy issues. To fall back to
the traditional implementation when computing the L2-norm of
vector X, disable the macro BLIS_ENABLE_DNRM2_FAST.
AMD-Internal: [CPUPL-1757]
Change-Id: I1adcaf1b3b4e33837758593c998c25705ff0fe11
For more information on sub-configurations and configuration families in BLIS, please read the Configuration Guide, which can be viewed in markdown-rendered form from the BLIS wiki page.
If you don't have time, or are impatient, take a look at the config_registry
file in the top-level directory of the BLIS distribution. It contains a
grammar-like mapping of configuration names, or families, to sub-configurations,
which may be other families. Keep in mind that the / notation:
    <config>: <config>/<name>
means that the kernel set associated with <name> should be made available to
the configuration <config> if <config> is targeted at configure-time.
(Some configurations borrow kernels from other configurations, and this is how
we specify that requirement.)
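To make the / notation concrete, here is a minimal sketch of splitting one right-hand-side token into the sub-configuration name and the kernel sets it borrows. The function split_subconfig, the MAX_NAME limit, and the idea that several /<name> parts may follow one another are assumptions for illustration only; this is not how the BLIS configure script parses config_registry.

```c
#include <stddef.h>
#include <string.h>

#define MAX_NAME 32  /* hypothetical limit on a configuration name */

/* Split a token of the form "<config>/<name>[/<name>...]".
   The part before the first '/' is copied into config; every later
   part is copied into kernel_sets. Returns the number of kernel sets
   found, or -1 if a name is empty, too long, or there are too many. */
int split_subconfig(const char *token,
                    char config[MAX_NAME],
                    char kernel_sets[][MAX_NAME],
                    int max_sets)
{
    const char *p = token;
    const char *slash = strchr(p, '/');
    size_t len = slash ? (size_t)(slash - p) : strlen(p);

    if (len == 0 || len >= MAX_NAME)
        return -1;
    memcpy(config, p, len);
    config[len] = '\0';

    int count = 0;
    while (slash != NULL) {
        p = slash + 1;
        slash = strchr(p, '/');
        len = slash ? (size_t)(slash - p) : strlen(p);

        if (len == 0 || len >= MAX_NAME || count >= max_sets)
            return -1;
        memcpy(kernel_sets[count], p, len);
        kernel_sets[count][len] = '\0';
        ++count;
    }
    return count;
}
```

With the placeholder token "foo/bar", the function stores "foo" as the configuration and reports one kernel set, "bar"; a token with no slash, such as "foo", yields zero kernel sets.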