mirror of
https://github.com/amd/blis.git
synced 2026-04-20 15:48:50 +00:00
Details:
- Changed the implementation in the 'gemmlike' sandbox to more easily
allow others to provide custom implementations of packm. These changes
include:
- Calling a local version of packm_cxk() that can be modified. This
version of packm_cxk() uses inlined loops in packm_cxk() rather
than querying the context for packm kernels (or even using scal2m).
- Providing two variants of packm, one of which calls the
aforementioned packm_cxk(), the other of which inlines the contents
of packm_cxk() into the variant itself, making it self-contained.
To switch from one to the other, simply change which function gets
called within bls_packm_a() and bls_packm_b().
- Simplified and cleaned up some variant names in both variants of
packm, relative to their parent code.