mirror of
https://github.com/amd/blis.git
synced 2026-04-20 07:38:53 +00:00
- Added conditional swapping of the input matrices and their strides for GEMV, based on whether transpose is set for the relevant matrix: the B matrix when m=1 and the A matrix when n=1.
- The swap reroutes the inputs to the alternative variant (code path), which avoids the packing cost for that matrix through logical transposition.
- Currently, this optimization is enabled only when no post-ops are involved. With post-ops, the incoming data from the user needs to be updated in some scenarios; this will be handled later.

AMD-Internal: [CPUPL-7323]
Co-authored-by: Vignesh Balasubramanian <vignbala@amd.com>
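
The logical transposition described above can be illustrated with a minimal sketch. It is not the code from this commit: `mat_view`, `view_transpose`, `gemv_n1`, and `gemv_m1_via_swap` are hypothetical names used only for illustration, and the sketch assumes plain `float` data with no post-ops. It relies on the identity c = a * B  <=>  c^T = B^T * a^T, so the m=1 case can be served by the n=1 code path by swapping the operands and the row/column strides of B, without copying or packing any data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical strided view (not a BLIS type):
   element (i, j) lives at buf[i * rs + j * cs]. */
typedef struct {
    const float *buf;
    ptrdiff_t rows, cols;
    ptrdiff_t rs, cs;   /* row stride, column stride */
} mat_view;

/* Logical transpose: swap dimensions and strides, move no data. */
static mat_view view_transpose(mat_view v)
{
    mat_view t = { v.buf, v.cols, v.rows, v.cs, v.rs };
    return t;
}

/* "n = 1" variant: y (rows) = A (rows x cols) * x (cols). */
static void gemv_n1(mat_view a, const float *x, ptrdiff_t incx,
                    float *y, ptrdiff_t incy)
{
    assert(a.buf && x && y);
    for (ptrdiff_t i = 0; i < a.rows; ++i) {
        float acc = 0.0f;
        for (ptrdiff_t p = 0; p < a.cols; ++p)
            acc += a.buf[i * a.rs + p * a.cs] * x[p * incx];
        y[i * incy] = acc;
    }
}

/* "m = 1" case: c (1 x n) = a (1 x k) * B (k x n).
   Rerouted through gemv_n1 as c^T = B^T * a^T by swapping the
   operands and the strides of B. */
static void gemv_m1_via_swap(const float *a, ptrdiff_t inca,
                             mat_view b, float *c, ptrdiff_t incc)
{
    gemv_n1(view_transpose(b), a, inca, c, incc);
}
```

When B is supplied transposed (unit stride along k), the swapped view has unit column stride, so the inner loop of `gemv_n1` reads contiguous memory. That is the situation in which the swap removes the need to pack B; the same argument applies to A when n=1.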