ik_llama.cpp

mirror of https://github.com/ikawrakow/ik_llama.cpp.git synced 2026-02-25 15:44:10 +00:00

Files

Iwan Kawrakow 422e5768e4 Adding iq4_nl_x4

Looks very promising - I get PP-512(LLaMA-3.1-8B) = 230 t/s
on the Ryzen-7950X! This is faster than any other quant and
~40% faster than iq4_nl.

2024-11-30 09:21:04 +02:00

llama.h    Adding iq4_nl_x4    2024-11-30 09:21:04 +02:00