ik_llama.cpp/ggml
Kawrakow 55a704b67a Fused Q and K fused_rms_norm for TG on CUDA (#882)
* Biased mmvq: minor optimization

* Fusing Q and K rms_norm for TG on CUDA

* Remove commented out code

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2025-10-31 14:41:28 +02:00