mirror of
https://github.com/amd/blis.git
synced 2026-04-20 15:48:50 +00:00
Implemented and registered power9 dgemm ukernel.

Details:
- Implemented 12x6 dgemm microkernel for power9. This microkernel assumes that elements of B have been duplicated/broadcast during the packing step. The microkernel uses a column orientation for its microtile vector registers and thus implements column storage and general stride IO cases. (A row storage IO case via in-register transposition may be added at a future date.) It should be noted that we recommend using this microkernel with gcc and *not* xlc, as issues with the latter cropped up during development, including but not limited to slightly incompatible vector register mnemonics in the GNU extended inline assembly clobber list.
47 lines
1.2 KiB
Plaintext
#
# config_registry
#
# Please refer to the BLIS wiki on configurations for information on the
# syntax and semantics of this file [1].
#
# [1] https://github.com/flame/blis/wiki/ConfigurationHowTo
#

# Processor families.
x86_64: intel64 amd64
intel64: skx knl haswell sandybridge penryn generic
amd64: zen2 zen excavator steamroller piledriver bulldozer generic
# NOTE: ARM families will remain disabled until runtime hardware detection
# logic is added to BLIS.
#arm64: cortexa57 generic
#arm32: cortexa15 cortexa9 generic

# Intel architectures.
skx: skx/skx/haswell/zen
knl: knl/knl/haswell/zen
haswell: haswell/haswell/zen
sandybridge: sandybridge
penryn: penryn

# AMD architectures.
zen2: zen2/zen2/zen/haswell
zen: zen/zen/haswell
excavator: excavator/piledriver
steamroller: steamroller/piledriver
piledriver: piledriver
bulldozer: bulldozer

# ARM architectures.
thunderx2: thunderx2/armv8a
cortexa57: cortexa57/armv8a
cortexa53: cortexa53/armv8a
cortexa15: cortexa15/armv7a
cortexa9: cortexa9/armv7a

# IBM architectures.
power9: power9
bgq: bgq

# Generic architectures.
generic: generic
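
To illustrate how the entries above might be read, here is a minimal Python sketch of a parser for this file's syntax. It is not part of BLIS, whose configure script does its own parsing. The function names `parse_registry` and `expand_family` are hypothetical, and the sketch assumes that each `name: items` line lists sub-configurations, that an item of the form `config/k1/k2` names a sub-configuration followed by its kernel sets, and that an item with no slash uses the kernel set of the same name.

```python
SAMPLE = """\
# Processor families.
x86_64: intel64 amd64
intel64: skx haswell generic
amd64: zen generic
# Architectures.
skx: skx/skx/haswell/zen
haswell: haswell/haswell/zen
zen: zen/zen/haswell
generic: generic
"""


def parse_registry(text):
    """Map each configuration name to a list of (sub_config, kernel_sets)."""
    registry = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        name, _, rhs = line.partition(":")
        entries = []
        for item in rhs.split():
            sub, *kernels = item.split("/")
            # Assumption: with no explicit kernel list, the kernel set
            # shares the sub-configuration's name.
            entries.append((sub, kernels or [sub]))
        registry[name.strip()] = entries
    return registry


def expand_family(registry, name):
    """Resolve a family to its leaf sub-configurations, in first-seen order."""
    entries = registry[name]
    # A line whose only sub-configuration is itself is treated as a leaf.
    if len(entries) == 1 and entries[0][0] == name:
        return [name]
    leaves = []
    for sub, _ in entries:
        for leaf in expand_family(registry, sub):
            if leaf not in leaves:
                leaves.append(leaf)
    return leaves


if __name__ == "__main__":
    reg = parse_registry(SAMPLE)
    print(expand_family(reg, "x86_64"))
    print(reg["skx"])
```

Under these assumptions, a family such as `x86_64` expands through `intel64` and `amd64` down to individual architectures, while a line such as `skx: skx/skx/haswell/zen` yields a single sub-configuration with three kernel sets.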