Files
ik_llama.cpp/ggml/include
Kawrakow d0b52076da Use bf16 instead of fp16 block scales for q8_1 (#292)
* WIP - not working

* q8_0 without bells and wistles works

* It works for q8_0

* Use bf16 instead of f16,int16

* q4_0_r8

* q5_0_r4

* q6_0_r4

* Also q4_1 and q5_1

* q8_0_r8 on avx2

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2025-03-27 05:49:16 +01:00
..
2024-07-27 07:55:01 +02:00
2024-10-25 13:08:43 +02:00
2024-07-27 07:55:01 +02:00
2024-07-27 07:55:01 +02:00
2024-08-12 15:14:32 +02:00
2024-07-27 07:55:01 +02:00
2024-07-27 07:55:01 +02:00
2024-07-27 07:55:01 +02:00
2024-07-27 07:55:01 +02:00