mirror of
https://github.com/amd/blis.git
synced 2026-05-11 17:50:00 +00:00
Details: - Added a num_t datatype bitfield to the obj_t in the form of a new info2 field in the obj_t. This change was made primarily so that in the case of mixed-datatype gemm, the alpha scalar would not need to be cast to the storage datatype of B (or A) before then being cast to the computation datatype just before the macrokernel is called. This double-casting regime could result in loss of precision if the storage datatype of B (or A) is less than the computation precision. In practice, it was likely not going to be a big deal since most usage of alpha is for -1.0, 0.0, and 1.0 (or integer multiples thereof), which can all be represented exactly in single or double precision. - The type of objbits_t was changed to uint32_t, so the new format potentially takes up the same space as the previous obj_t definition, assuming no padding inserted by the compiler. Shrinking info to 32 bits and spilling over into a second field was chosen over using the high 32 bits of a single 64-bit objbits_t info field because many of the bitwise operations are performed with enums such as num_t, dom_t, and prec_t, which may take on the type of 32-bit ints. It's easier to just keep all of those bitwise operations in 32 bits than perform a million typecasts throughout bli_type_defs.h and bli_obj_macro_defs.h to ensure that the integers are treated as 64-bit for the purposes of the ANDs, ORs, and bitshifts. - Many comment updates. - Thanks to Devin Matthews and Devangi Parikh for their feedback and involvement during this commit cycle.