Add packing support for M edge cases in ZGEMM 12xk pack kernel (#89)

Previously, the ZGEMM implementation fell back to `zscalv` when
    the M dimension of matrix A was not a multiple of 24,
    resulting in a ~40% performance drop.

    This commit adds specialized edge-case handling to the pack kernel
    to optimize performance for these cases.

    The new packing support significantly improves performance for such sizes.

    - Removed reliance on `zscalv` for edge cases, addressing the
      performance bottleneck.
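
For illustration only, the idea of handling the M edge inside the pack routine can be sketched in plain scalar C. This is not the kernel added by this commit: the names `zcomplex`, `pack_zpanel`, and `pack_zmatrix` are hypothetical, the panel height of 12 is taken only from the "12xk" kernel name, and both the zero padding of the partial panel and the folding of the `kappa` scaling into the copy are assumptions about how such a kernel might behave.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical double-complex type used only for this sketch. */
typedef struct { double real, imag; } zcomplex;

enum { MR = 12 };  /* assumed panel height, from the "12xk" kernel name */

/* Pack m_panel (<= MR) rows and k columns of a column-major matrix A
 * (leading dimension lda) into p. The packed panel stores k columns of
 * MR entries each. Every copied element is scaled by kappa during the
 * copy, so no separate scaling pass is needed. Rows m_panel..MR-1 are
 * zero filled (an assumption of this sketch). */
static void pack_zpanel(size_t m_panel, size_t k, zcomplex kappa,
                        const zcomplex *a, size_t lda, zcomplex *p)
{
    assert(m_panel <= MR);
    for (size_t j = 0; j < k; ++j) {
        const zcomplex *col = a + j * lda;
        zcomplex *dst = p + j * MR;
        for (size_t i = 0; i < m_panel; ++i) {
            /* Complex multiply: dst = kappa * col[i]. */
            dst[i].real = kappa.real * col[i].real - kappa.imag * col[i].imag;
            dst[i].imag = kappa.real * col[i].imag + kappa.imag * col[i].real;
        }
        for (size_t i = m_panel; i < MR; ++i) {
            dst[i].real = 0.0;
            dst[i].imag = 0.0;
        }
    }
}

/* Pack an m x k matrix panel by panel. The last panel may be partial
 * and goes through the same routine instead of a separate fallback.
 * p must hold ceil(m / MR) * MR * k elements. Returns the panel count. */
static size_t pack_zmatrix(size_t m, size_t k, zcomplex kappa,
                           const zcomplex *a, size_t lda, zcomplex *p)
{
    size_t panels = 0;
    for (size_t i0 = 0; i0 < m; i0 += MR, ++panels) {
        size_t m_panel = (m - i0 < MR) ? (m - i0) : MR;
        pack_zpanel(m_panel, k, kappa, a + i0, lda, p + panels * MR * k);
    }
    return panels;
}
```

With `m = 14`, for example, the first panel holds 12 full rows and the second holds the 2 remaining rows followed by 10 zero rows, so the edge rows take the same packing path as full panels.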

    AMD-Internal: [CPUPL-6677]

Co-authored-by: harsh dave <harsdave@amd.com>
Dave, Harsh
2025-08-14 14:29:03 +05:30
committed by GitHub
parent 76c4872718
commit 1b1b19486b
