Use bf16 instead of fp16 block scales for q8_1 (#292)

* WIP - not working

* q8_0 without bells and whistles works

* It works for q8_0

* Use bf16 instead of f16,int16

* q4_0_r8

* q5_0_r4

* q6_0_r4

* Also q4_1 and q5_1

* q8_0_r8 on avx2
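
Note on the format change: the commit message does not spell out the motivation, so the following is an assumption. fp16 has 5 exponent bits and a largest finite value of 65504, whereas bf16 keeps the 8 exponent bits of fp32 and gives up mantissa bits instead, so a block scale (or a scale multiplied by a sum of quants) that overflows fp16 still fits in bf16. A minimal sketch of the conversion, with hypothetical helper names (`fp32_to_bf16`, `bf16_to_fp32`) that are not taken from this commit:

```c
#include <stdint.h>
#include <string.h>

// Hypothetical type and helper names for illustration only; the commit's
// own conversion code is not shown here.
typedef uint16_t bf16_bits;

// fp32 -> bf16: keep the upper 16 bits of the fp32 pattern,
// rounding the 16 dropped mantissa bits to nearest-even.
static bf16_bits fp32_to_bf16(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    if ((u & 0x7fffffffu) > 0x7f800000u) {
        // NaN: truncate and force a mantissa bit so it stays a NaN.
        return (bf16_bits)((u >> 16) | 0x0040u);
    }
    u += 0x7fffu + ((u >> 16) & 1u);  // round to nearest, ties to even
    return (bf16_bits)(u >> 16);
}

// bf16 -> fp32: the 16 stored bits become the upper half of the fp32 pattern.
static float bf16_to_fp32(bf16_bits h) {
    uint32_t u = (uint32_t)h << 16;
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}
```

With these helpers, a value such as 100000.0f, which is above the fp16 maximum, survives a round trip through bf16 as a finite number with a small relative error, at the cost of only 8 bits of significand precision.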

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Kawrakow
2025-03-27 05:49:16 +01:00
committed by GitHub
parent b307c1c375
commit d71e84bdc1
6 changed files with 348 additions and 255 deletions

@@ -396,8 +396,9 @@ extern "C" {
 //
 GGML_TYPE_I2_S = 36,
 //
-GGML_TYPE_Q8_0_X4 = 98,
-GGML_TYPE_Q8_1_X4 = 99,
+GGML_TYPE_Q8_0_X4 = 97,
+GGML_TYPE_Q8_1_X4 = 98,
+GGML_TYPE_Q8_2_X4 = 99,
 GGML_TYPE_Q6_0 = 133,
 GGML_TYPE_IQ1_BN = 134,
 GGML_TYPE_IQ2_BN = 135,