blis/addon at c81408c80519327f52f48b29c63628d9ba9c5240 - blis

amd/blis

mirror of https://github.com/amd/blis.git synced 2026-04-20 07:38:53 +00:00

Vankadari, Meghana c81408c805 Modified reorder and pack code in sym quant API (#59)

Details:
- In s8 APIs with symmetric quantization, existing kernels are
  reused to avoid duplicating the reorder code.
- The existing kernels assume that the entire KCxNC block is
  packed at once, so to handle grouping in symmetric quantization
  the JR and group loops have to be added outside the call to the
  existing packB function.
- Although this was already being done, the cases where n_rem < 64
  were not handled properly.
- The reorder and pack code now first divides the n_fringe part
  into a multiples-of-16 part and an n_lt_16 part, then calls the
  pack kernel twice to handle the two parts separately.
- All strides used to access the reordered/packed buffer are
  updated accordingly.
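
The fringe handling described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the commit: `split_n_fringe`, `pack_panel` and `pack_b_sym_quant` are hypothetical names, NR = 64 is inferred from the "n_rem < 64" remark, and the plain row-copy layout stands in for whatever interleaved layout the real pack kernels produce.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { NR = 64, SUB_NR = 16 }; /* assumed panel widths, not taken from BLIS */

typedef struct {
    int n_mult16; /* largest multiple of 16 that fits in the fringe */
    int n_lt_16;  /* leftover columns, always in [0, 15] */
} fringe_split_t;

/* Split a fringe of fewer than NR columns into the two parts. */
static fringe_split_t split_n_fringe(int n_fringe)
{
    fringe_split_t s;
    assert(n_fringe >= 0 && n_fringe < NR);
    s.n_mult16 = (n_fringe / SUB_NR) * SUB_NR;
    s.n_lt_16 = n_fringe - s.n_mult16;
    return s;
}

/* Hypothetical stand-in for an existing pack kernel: copies a k x n block
   of row-major int8 B into dst, zero-padding each row to n_padded columns.
   Returns the number of bytes written. */
static size_t pack_panel(int8_t *dst, const int8_t *b, int ldb,
                         int k, int n, int n_padded)
{
    for (int kk = 0; kk < k; ++kk) {
        int8_t *row = dst + (size_t)kk * n_padded;
        memcpy(row, b + (size_t)kk * ldb, (size_t)n);
        if (n_padded > n)
            memset(row + n, 0, (size_t)(n_padded - n));
    }
    return (size_t)k * n_padded;
}

/* Group loop and JR loop sit outside the kernel call; the fringe is packed
   with two further calls, one per part. The running offset plays the role
   of the updated strides into the packed buffer. */
static size_t pack_b_sym_quant(int8_t *dst, const int8_t *b, int ldb,
                               int k, int n, int group_size)
{
    size_t off = 0;
    int n_full = (n / NR) * NR;

    for (int k0 = 0; k0 < k; k0 += group_size) {
        int kg = (k - k0 < group_size) ? (k - k0) : group_size;
        const int8_t *bg = b + (size_t)k0 * ldb;

        for (int jr = 0; jr < n_full; jr += NR)
            off += pack_panel(dst + off, bg + jr, ldb, kg, NR, NR);

        fringe_split_t s = split_n_fringe(n - n_full);
        if (s.n_mult16 > 0)
            off += pack_panel(dst + off, bg + n_full, ldb,
                              kg, s.n_mult16, s.n_mult16);
        if (s.n_lt_16 > 0)
            off += pack_panel(dst + off, bg + n_full + s.n_mult16, ldb,
                              kg, s.n_lt_16, SUB_NR);
    }
    return off;
}
```

In this sketch a fringe of 45 columns would be packed as a 32-column panel followed by a 13-column panel padded to 16, which is the two-call pattern the commit message describes.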

2025-06-24 11:36:35 +05:30

aocl_gemm

Modified reorder and pack code in sym quant API (#59)

2025-06-24 11:36:35 +05:30

gemmd

Code cleanup: Copyright notices

2024-08-05 15:35:08 -04:00

CMakeLists.txt

Code cleanup: Copyright notices

2024-08-05 15:35:08 -04:00