ik_llama.cpp

mirror of https://github.com/ikawrakow/ik_llama.cpp.git synced 2026-02-25 23:54:10 +00:00

Files

Iwan Kawrakow 2e9b3ba92b Adapting iq2_bn to work without separate scale tensors

Why? It is becoming burdensome to maintain the special Bitnet
conversion in convert_hf_to_gguf.py, so I think it is better
to make iq1_bn and iq2_bn just work with the mainline
conversion script (which does not generate scales).

2024-10-24 12:48:53 +02:00

cmake

Merge mainline llama.cpp (#3)

2024-07-27 07:55:01 +02:00

include

Avoid rebuild of GGML graph for each token (#98)

2024-10-20 08:36:16 +02:00

src

Adapting iq2_bn to work without separate scale tensors

2024-10-24 12:48:53 +02:00

.gitignore

Merge mainline llama.cpp (#3)

2024-07-27 07:55:01 +02:00

CMakeLists.txt

Move to c++17 projectwide (#80)

2024-10-04 14:43:26 +03:00