ik_llama.cpp/ggml
Kawrakow 636d97fefa Faster CPU prompt processing for Trellis quants and MoE models (#488)
* Also do the dequantize approach for mul_mat_id

* Also do the dequantize approach for iqk_moe_fused_up_gate

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2025-06-05 08:30:35 +03:00