IQ5_KS_R4: row-interleaved IQ5_KS (#426)

* iq5_ks_r4: basics

* iq5_ks_r4: Zen4 works

* iq5_ks_r4: AVX2 works

* iq5_ks_r4: NEON

* Fix iq5_ks on NEON

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Kawrakow
2025-05-17 08:57:26 +03:00
committed by GitHub
parent e31ba05fcd
commit db111c91ee
10 changed files with 441 additions and 51 deletions

@@ -220,6 +220,7 @@ extern "C" {
 LLAMA_FTYPE_MOSTLY_IQ4_K_R4 = 340, // except 1d tensors
 LLAMA_FTYPE_MOSTLY_IQ5_K_R4 = 341, // except 1d tensors
 LLAMA_FTYPE_MOSTLY_IQ4_KS_R4 = 345, // except 1d tensors
+LLAMA_FTYPE_MOSTLY_IQ5_KS_R4 = 350, // except 1d tensors
 LLAMA_FTYPE_MOSTLY_Q8_KV_R8 = 398, // except 1d tensors
 LLAMA_FTYPE_MOSTLY_Q8_K_R8 = 399, // except 1d tensors