ik_llama.cpp

mirror of https://github.com/ikawrakow/ik_llama.cpp.git synced 2026-02-26 08:04:09 +00:00


Iwan Kawrakow 8f0d075f5e iq3_kt WIP: slowly improving

PPL(LLaMA-3.1-8B-Instruct, 8192) is now 6.7689 after shrinking the model
by 0.015 bpw by using iq4_k instead of q5_k for attn_v.

2024-11-21 08:16:41 +02:00

CMakeLists.txt        2024-07-27 07:55:01 +02:00
llama-grammar.cpp     2024-08-12 15:14:32 +02:00
llama-grammar.h       2024-07-27 07:55:01 +02:00
llama-impl.h          2024-09-28 17:59:47 +03:00
llama-sampling.cpp    2024-07-27 07:55:01 +02:00
llama-sampling.h      2024-07-27 07:55:01 +02:00
llama-vocab.cpp       2024-08-12 15:14:32 +02:00
llama-vocab.h         2024-08-12 15:14:32 +02:00
llama.cpp             2024-11-21 08:16:41 +02:00
unicode-data.cpp      2024-07-27 07:55:01 +02:00
unicode-data.h        2024-07-27 07:55:01 +02:00
unicode.cpp           2024-07-27 07:55:01 +02:00
unicode.h             2024-07-27 07:55:01 +02:00