ik_llama.cpp

mirror of https://github.com/ikawrakow/ik_llama.cpp.git synced 2026-02-25 07:34:10 +00:00

Files

Iwan Kawrakow 01e2b0c2ce FA: Add option to build all FA kernels

Similar to the CUDA situation.
It is OFF by default.
If OFF, only the F16, Q8_0, and Q6_0 FA kernels are included, plus
BF16 if the CPU provides native BF16 support.
To enable all kernels, run cmake -DGGML_IQK_FA_ALL_QUANTS=1 ...
This cuts compilation time for iqk_mul_mat.cpp by almost half
(45 seconds vs 81 seconds on my Ryzen-7950X).

2025-02-09 18:50:50 +02:00
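
The effect of the option described in the commit message above can be pictured as a compile-time gate on which FA kernels exist in the binary. The following is a minimal sketch, not code from iqk_mul_mat.cpp: the enum `KvType`, the function `fa_kernel_built`, and the macro `EXAMPLE_NATIVE_BF16` are hypothetical names, `Q4_0` is only a stand-in for "some type outside the default set", and it is an assumption that the CMake option reaches the sources as a preprocessor definition of the same name.

```cpp
// Hypothetical illustration of a build-time kernel gate.
// None of these names are taken from the ik_llama.cpp sources.
enum class KvType { F16, Q8_0, Q6_0, BF16, Q4_0 };

// Returns true if an FA kernel for the given cache type would be compiled in.
constexpr bool fa_kernel_built(KvType t) {
    switch (t) {
        // Default set named in the commit message: always built.
        case KvType::F16:
        case KvType::Q8_0:
        case KvType::Q6_0:
            return true;
        // BF16 is built only when the CPU has native BF16 support.
        // EXAMPLE_NATIVE_BF16 is a made-up macro standing in for that check;
        // the commit message does not say how BF16 is handled otherwise.
        case KvType::BF16:
#ifdef EXAMPLE_NATIVE_BF16
            return true;
#else
            return false;
#endif
        // Stand-in for any additional type: built only with
        // -DGGML_IQK_FA_ALL_QUANTS=1 (assumed to become a compile definition).
        case KvType::Q4_0:
#ifdef GGML_IQK_FA_ALL_QUANTS
            return true;
#else
            return false;
#endif
    }
    return false;
}
```

Compiled without either macro, `fa_kernel_built(KvType::Q6_0)` is true and `fa_kernel_built(KvType::Q4_0)` is false. Skipping the extra kernel instantiations in the default build is what would account for the shorter compile time the commit message reports.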

cmake

Merge mainline llama.cpp (#3)

2024-07-27 07:55:01 +02:00

include

Use Q8_K_128 for IQ1_S_R4 and IQ1_M_R4 matrix multiplications (#194)

2025-02-09 09:14:52 +02:00

src

FA: Add option to build all FA kernels

2025-02-09 18:50:50 +02:00

.gitignore

Merge mainline llama.cpp (#3)

2024-07-27 07:55:01 +02:00

CMakeLists.txt

FA: Add option to build all FA kernels

2025-02-09 18:50:50 +02:00