ik_llama.cpp/ggml
Kawrakow 9f3d062ba7 CUDA: faster prompt processing for 4-bit quants (#713)
* Use __byte_perm in get_int_from_table_16

* Use get_int_from_table_16 everywhere for 4-bit quants

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2025-08-21 15:57:35 +03:00