blis/config_registry
Field G. Van Zee 3c52725693 Renamed/moved l3 zen ukernels to haswell kernel set.
Details:
- Renamed the microkernels in kernels/zen/3 to kernels/haswell/3 and
  then updated the file contents to use the 'haswell' infix.
- Updated bli_cntx_init_zen.c and bli_cntx_init_haswell.c to match the
  function renames above.
- Moved/updated the corresponding prototypes in bli_kernels_zen.h to
  bli_kernels_haswell.h.
- Updated config_registry to reflect the changes above.
- NOTE: This rename reflects the fact that haswell microkernels are
  specifically written to overcome the floating-point latency for FMA
  instructions on Intel Haswell-like architectures, which can issue two
  FMA instructions per cycle. These ukernels happen to work fine on AMD
  Zen-based architectures. However, Zen only issues one FMA per cycle,
  which, while halving its floating-point throughput, gives it extra
  flexibility in the design of its microkernels--namely, mr and nr can
  be smaller and still overcome the floating-point latency for those
  single-issue cores. A smaller value of mr and nr allows for a larger
  value of kc, which may be useful in some situations. In the future,
  we may write such Zen-specific microkernels to take advantage of this
  additional flexibility.
2018-10-17 14:56:22 -05:00
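
The NOTE in the commit message can be made concrete with a rough model. The sketch below is illustrative only: the FMA latency of 5 cycles, the 16 KiB cache budget, the 6x8 tile, and the hypothetical 4x8 tile are assumptions chosen for the arithmetic, not values taken from the BLIS sources. It counts how many independent vector accumulators are needed to hide FMA latency, and estimates how large kc can be when an mr x kc micro-panel of A and a kc x nr micro-panel of B must share a fixed cache budget.

```python
# Back-of-the-envelope model of the NOTE in the commit message.
# All numeric inputs are illustrative assumptions, not values taken from BLIS.

def min_accumulators(fma_latency_cycles, fmas_per_cycle):
    """Independent vector FMAs that must be in flight to hide FMA latency."""
    return fma_latency_cycles * fmas_per_cycle

def tile_accumulators(mr, nr, vector_len):
    """Vector registers needed to hold an mr x nr tile of C (rounded up)."""
    return (mr * nr + vector_len - 1) // vector_len

def max_kc(mr, nr, cache_budget_bytes, elem_bytes):
    """Largest kc such that an mr x kc panel and a kc x nr panel fit the budget."""
    return cache_budget_bytes // ((mr + nr) * elem_bytes)

VECTOR_LEN = 4        # doubles per 256-bit vector register
FMA_LATENCY = 5       # assumed FMA latency in cycles
BUDGET = 16 * 1024    # assumed cache budget for the two micro-panels, in bytes
ELEM = 8              # bytes per double

dual_issue_need = min_accumulators(FMA_LATENCY, 2)    # two FMAs per cycle
single_issue_need = min_accumulators(FMA_LATENCY, 1)  # one FMA per cycle

tiles = {
    "6x8 (assumed dual-issue tile)": (6, 8),
    "4x8 (hypothetical single-issue tile)": (4, 8),
}

print("accumulators needed, dual issue:  ", dual_issue_need)
print("accumulators needed, single issue:", single_issue_need)
for label, (mr, nr) in tiles.items():
    regs = tile_accumulators(mr, nr, VECTOR_LEN)
    kc = max_kc(mr, nr, BUDGET, ELEM)
    print(f"{label}: {regs} accumulators, kc <= {kc}")
```

Under these assumptions the smaller tile still supplies enough accumulators for a single-issue core while leaving room for a larger kc, which is the flexibility the NOTE describes. A real microkernel design would also have to account for register pressure, cache associativity, and prefetching, none of which this model captures.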

#
# config_registry
#
# Please refer to the BLIS wiki on configurations for information on the
# syntax and semantics of this file [1].
#
# [1] https://github.com/flame/blis/wiki/ConfigurationHowTo
#

# Processor families.
x86_64: intel64 amd64
intel64: skx knl haswell sandybridge penryn generic
amd64: zen excavator steamroller piledriver bulldozer generic
arm64: cortexa57 generic
arm32: cortexa15 cortexa9 generic

# Intel architectures.
skx: skx/skx/haswell/zen
knl: knl/knl/haswell/zen
haswell: haswell/haswell/zen
sandybridge: sandybridge
penryn: penryn

# AMD architectures.
zen: zen/zen/haswell
excavator: excavator/piledriver
steamroller: steamroller/piledriver
piledriver: piledriver
bulldozer: bulldozer

# ARM architectures.
cortexa57: cortexa57/armv8a
cortexa53: cortexa53/armv8a
cortexa15: cortexa15/armv7a
cortexa9: cortexa9/armv7a

# IBM architectures.
bgq: bgq

# Generic architectures.
generic: generic
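
To illustrate how the entries above can be read, here is a minimal Python sketch of a parser. It is not the logic used by the BLIS configure script. It assumes one reading of the syntax: a family line lists its member configurations separated by spaces, and an entry of the form `name/k1/k2` names a sub-configuration followed by the kernel sets it uses, with a bare `name` using the kernel set of the same name. The function names `parse_config_registry` and `expand` are hypothetical, and the sample text is the x86_64 portion of the file.

```python
SAMPLE = """\
# Processor families.
x86_64: intel64 amd64
intel64: skx knl haswell sandybridge penryn generic
amd64: zen excavator steamroller piledriver bulldozer generic
# Intel architectures.
skx: skx/skx/haswell/zen
knl: knl/knl/haswell/zen
haswell: haswell/haswell/zen
sandybridge: sandybridge
penryn: penryn
# AMD architectures.
zen: zen/zen/haswell
excavator: excavator/piledriver
steamroller: steamroller/piledriver
piledriver: piledriver
bulldozer: bulldozer
# Generic architectures.
generic: generic
"""

def parse_config_registry(text):
    """Map each name to a list of (sub-configuration, [kernel sets]) pairs."""
    registry = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        name, _, rhs = line.partition(":")
        entries = []
        for item in rhs.split():
            subconfig, *kernels = item.split("/")
            # Assumed rule: with no slash, the kernel set shares the name.
            entries.append((subconfig, kernels or [subconfig]))
        registry[name.strip()] = entries
    return registry

def expand(registry, name):
    """Resolve a family name to its leaf sub-configurations, in file order."""
    entries = registry[name]
    # A leaf is an entry whose only member is itself.
    if len(entries) == 1 and entries[0][0] == name:
        return [name]
    leaves = []
    for subconfig, _ in entries:
        for leaf in expand(registry, subconfig):
            if leaf not in leaves:  # 'generic' is listed by several families
                leaves.append(leaf)
    return leaves

registry = parse_config_registry(SAMPLE)
print(expand(registry, "x86_64"))
print(registry["haswell"])
print(registry["zen"])
```

Under this reading, the `haswell` and `zen` sub-configurations each pull in both the haswell and zen kernel sets, which matches the commit's move of the level-3 microkernels into the haswell kernel set while zen-specific kernels stay in the zen set.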