Details:
- Updated docs/Testsuite.md to include mention of the new mixed-domain
and mixed-precision settings, including descriptions.
- Updated docs/MixedDatatypes.md to include a brief section on running
the testsuite to exercise mixed-datatype functionality, which mostly
amounts to a link to the Testsuite.md document.
- Minor verbiage change to testsuite output to correct a misleading
label associated with the value returned by the query function
bli_info_get_simd_num_registers(). (The function does not return the
number of SIMD registers present in the hardware, but rather a maximum
assumed value for the purposes of allocating temporary microtile
workspace on the function stack.)
Details:
- Print an error message from configure if the user attempts to
explicitly configure BLIS for simultaneous use of 64-bit integers in
the BLAS API with 32-bit integers in the BLIS API.
- Added cpp macro conditional to bli_type_defs.h to mandate that BLIS
integers be 64 bits if the BLAS integers are 64 bits. This and the
above item take care of issue #274. Thanks to Devin Matthews and
Jeff Hammond for suggesting these safeguards.
- Slight reorganization and relabeling (for clarity) of BLAS/CBLAS
sections and BLIS integer size line of the testsuite configuration
output.
- Very minor edits to docs/MixedDatatypes.md.
Details:
- Implemented support for gemm where A, B, and C may have different
storage datatypes, as well as a computational precision (and implied
computation domain) that may be different from the storage precision
of either A or B. This results in 128 different combinations, all
which are implemented within this commit. (For now, the mixed-datatype
functionality is only supported via the object API.) If desired, the
mixed-datatype support may be disabled at configure-time.
- Added a memory-intensive optimization to certain mixed-datatype cases
that requires a single m-by-n matrix be allocated (temporarily) per
call to gemm. This optimization aims to avoid the overhead involved in
repeatedly updating C with general stride, or updating C after a
typecast from the computation precision. This memory optimization may
be disabled at configure-time (provided that the mixed-datatype
support is enabled in the first place).
- Added support for testing mixed-datatype combinations to testsuite.
The user may test gemm with mixed domains, precisions, both, or
neither.
- Added a standalone test driver directory for building and running
mixed-datatype performance experiments.
- Defined a new variation of castm, castnzm, which operates like castm
except that imaginary values are not touched when casting a real
operand to a complex operand. (By contrast, in these situations castm
sets the imaginary components of the destination matrix to zero.)
- Defined bli_obj_imag_is_zero() and substituted calls in lieu of all
usages of bli_obj_imag_equals() that tested against BLIS_ZERO, and
also simplified the implementation of bli_obj_imag_equals().
- Fixed bad behavior from bli_obj_is_real() and bli_obj_is_complex()
when given BLIS_CONSTANT objects.
- Disabled dt_on_output field in auxinfo_t structure as well as all
accessor functions. Also commented out all usage of accessor
functions within macrokernels. (Typecasting in the microkernel is
still feasible, though probably unrealistic for now given the
additional complexity required.)
- Use void function pointer type (instead of void*) for storing function
pointers in bli_l0_fpa.c.
- Added documentation for using gemm with mixed datatypes in
docs/MixedDatatypes.md and example code in examples/oapi/11gemm_md.c.
- Defined level-1d operation xpbyd and level-1m operation xpbym.
- Added xpbym test module to testsuite.
- Updated frame/include/bli_x86_asm_macros.h with additional macros
(courtsey of Devin Matthews).