mirror of
https://github.com/amd/blis.git
synced 2026-05-11 09:39:59 +00:00
Details: - Consolidated most of the code relating to induced complex methods (e.g. 4mh, 4m1, 3mh, 3m1, etc.) into frame/ind. Induced methods are now enabled on a per-operation basis. The current "available" (enabled and implemented) implementation can then be queried on an operation basis. Micro-kernel func_t objects as well as blksz_t objects can also be queried in a similar maner. - Redefined several micro-kernel and operation-related functions in bli_info_*() API, in accordance with above changes. - Added mr and nr fields to blksz_t object, which point to the mr and nr blksz_t objects for each cache blocksize (and are NULL for register blocksizes). Renamed the sub-blocksize field "sub" to "mult" since it is really expressing a blocksize multiple. - Updated bli_*_determine_kc_[fb]() for gemm/hemm/symm, trmm, and trsm to correctly query mr and nr (for purposes of nudging kc). - Introduced an enumerated opid_t in bli_type_defs.h that uniquely identifies an operation. For now, only level-3 id values are defined, along with a generic, catch-all BLIS_NOID value. - Reworked testsuite so that all induced methods that are enabled are tested (one at a time) rather than only testing the first available method. - Reformated summary at the beginning of testsuite output so that blocksize and micro-kernel info is shown for each induced method that was requested (as well as native execution). - Reduced the number of columns needed to display non-matlab testsuite output (from approx. 90 to 80).