* iq3_s_r4: WIP

* iq3_s_r4: Zen4

* iq3_s_r4: slightly better Zen4

* iq3_s_r4: AVX2

* iq3_s_r4: NEON

* iq3_s_r4: rearrange quants

* iq3_s_r4: rearranged quants - AVX2

* iq3_s_r4: rearranged quants - NEON

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
This commit is contained in:
Kawrakow
2024-12-23 14:34:23 +01:00
committed by GitHub
parent aa2595415a
commit da3bfd1009
10 changed files with 394 additions and 47 deletions

View File

@@ -192,6 +192,7 @@ extern "C" {
LLAMA_FTYPE_MOSTLY_IQ2_XS_R4 = 220, // except 1d tensors
LLAMA_FTYPE_MOSTLY_IQ3_XXS_R4 = 223, // except 1d tensors
LLAMA_FTYPE_MOSTLY_IQ4_NL_R4 = 225, // except 1d tensors
LLAMA_FTYPE_MOSTLY_IQ3_S_R4 = 226, // except 1d tensors
LLAMA_FTYPE_MOSTLY_IQ2_M_R4 = 229, // except 1d tensors
LLAMA_FTYPE_MOSTLY_IQ4_XS_R4 = 230, // except 1d tensors
LLAMA_FTYPE_MOSTLY_Q6_0_R4 = 335, // except 1d tensors