ik_llama.cpp

mirror of https://github.com/ikawrakow/ik_llama.cpp.git synced 2026-02-02 20:48:03 +00:00

Files

Kawrakow d0b52076da Use bf16 instead of fp16 block scales for q8_1 (#292 )

* WIP - not working

* q8_0 without bells and wistles works

* It works for q8_0

* Use bf16 instead of f16,int16

* q4_0_r8

* q5_0_r4

* q6_0_r4

* Also q4_1 and q5_1

* q8_0_r8 on avx2

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>

2025-03-27 05:49:16 +01:00

ggml-alloc.h

Merge mainline llama.cpp (#3 )

2024-07-27 07:55:01 +02:00

ggml-backend.h

Bitnet changes (#106 )

2024-10-25 13:08:43 +02:00

ggml-blas.h

Merge mainline llama.cpp (#3 )

2024-07-27 07:55:01 +02:00

ggml-cann.h

Merge mainline llama.cpp (#3 )