mirror of
https://github.com/amd/blis.git
synced 2026-05-11 17:50:00 +00:00
Details: - Added two more arguments to the gemm and gemmtrsm microkernels: the addresses of the next micro-panels of A and B. By passing these pointers into the micro-kernel, we allow the micro-kernel author to prefetch micro-panels of A and B as necessary (though this is completely optional; these addresses may also be safely ignored). - Updated all seven macro-kernels so that they compute and pass in a_next and b_next. Note that ONLY the gemm macro-kernel computes a_next and b_next with the precise semantics we want. I will go back and fix the other macro-kernels in the near future. - Added 'restrict' to various micro-kernels from which it was missing.