blis/config
RuQing Xu dfa5413966 Arm64 dgemmsup with extended MR&NR (#655)
Details:
- Since the number of registers in NEON is large but their lengths are 
  short, I'm here extending both MR and NR.
- The approach is to represent the C microtile in registers optionally 
  in columns, so for sizes like 6x7m, the 'crr' kernel is the default 
  with 'rrr' supported through an in-register transpose.
- A few asm kernels are crafted for 'rv' to complete this extended size 
  support.
- For 'rd' I'm still relying heavily on C99 intrinsic kernels with 
  branching so the performance might not be optimal. (Sorry for that.)
- So far, these changes only affect the 'firestorm' subconfig.
- This commit also contains row-preferential s12x8 and d8x6 gemm
  ukernels. These microkernels are templatized versions of the existing
  s8x12 and d6x8 ukernels defined in bli_gemm_armv8a_asm_d6x8.c.
2022-08-29 19:07:50 -05:00

For more information on sub-configurations and configuration families in BLIS, please read the Configuration Guide, which can be viewed in markdown-rendered form from the BLIS wiki page.

If you are short on time, take a look at the config_registry file in the top-level directory of the BLIS distribution. It contains a grammar-like mapping from configuration names (or family names) to sub-configurations, any of which may themselves be families. Keep in mind that the / notation:

<config>: <config>/<name>

means that the kernel set associated with <name> should be made available to the configuration <config> if <config> is targeted at configure-time. (Some configurations borrow kernels from other configurations, and this is how we specify that requirement.)
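
To make the mapping concrete, here is a minimal Python sketch of how such a registry could be read and expanded. It is not the logic used by BLIS's configure script. The registry text, the names fam_all, fam_x, cfg_a, cfg_b and cfg_c, the helper functions, and the treatment of # as a comment marker are all assumptions made for illustration.

```python
# Hypothetical registry text. The names are illustrative only and are not
# taken from the real config_registry file.
SAMPLE = """
# family -> members (a member may itself be a family)
fam_all: fam_x cfg_c
fam_x:   cfg_a cfg_b
cfg_a:   cfg_a
cfg_b:   cfg_b/cfg_a
cfg_c:   cfg_c
"""


def parse_registry(text):
    """Parse 'name: item item ...' lines into {name: [(sub, [kernels])]}.

    Each item is 'sub' or 'sub/kernel[/kernel...]'. Treating '#' as the
    start of a comment is an assumption of this sketch.
    """
    registry = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        name, _, rhs = line.partition(":")
        entries = []
        for item in rhs.split():
            sub, *kernels = item.split("/")
            entries.append((sub, kernels))
        registry[name.strip()] = entries
    return registry


def resolve(registry, target, seen=None):
    """Expand target into (leaf sub-configurations, borrowed kernel sets)."""
    seen = set() if seen is None else seen
    leaves, borrowed = [], []
    if target in seen:  # already expanded; also guards against cycles
        return leaves, borrowed
    seen.add(target)
    for sub, kernels in registry[target]:
        # Names after '/' are kernel sets made available to this config.
        for k in kernels:
            if k not in borrowed:
                borrowed.append(k)
        if sub == target or sub not in registry:
            # A name that maps to itself is a leaf sub-configuration.
            if sub not in leaves:
                leaves.append(sub)
        else:
            # Otherwise the name is a family; expand it recursively.
            sub_leaves, sub_borrowed = resolve(registry, sub, seen)
            leaves += [s for s in sub_leaves if s not in leaves]
            borrowed += [k for k in sub_borrowed if k not in borrowed]
    return leaves, borrowed


registry = parse_registry(SAMPLE)
print(resolve(registry, "fam_all"))
print(resolve(registry, "cfg_b"))
```

In this sketch, targeting fam_all expands to the three leaf sub-configurations, and the cfg_b/cfg_a entry records that cfg_b borrows the kernel set named cfg_a.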