Files
ik_llama.cpp/ggml
Kawrakow 7e79665a31 CUDA implementation for IQ1_S_R4 (#492)
* iq1_s_r4: CUDA dequantize

* iq1_s_r4: CUDA GEMV

* iq1_s_r4: MMQ on CUDA

Requires Turing or better (will fall back to dequantize+cuBLAS on older cards).

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2025-06-05 07:24:31 +03:00
..
2024-07-27 07:55:01 +02:00
2024-07-27 07:55:01 +02:00