Files
ik_llama.cpp/ggml
Kawrakow 1eac9e8487 NEON implementation for trellis quants (#471)
* iq2_kt: NEON implementation

* iq3_kt: NEON implementation

* iq4_kt: not working NEON implementation

* iq4_kt: NEON implementation

Have to use f32 arithmetic else I get gibberish?
Correspondigly ridiculously slow.

* Cleanup

* iq4_kt: slightly faster TG on NEON

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2025-05-29 18:57:41 +03:00
..
2024-07-27 07:55:01 +02:00
2024-07-27 07:55:01 +02:00