ik_llama.cpp/ggml/src/ggml-cuda
Kawrakow 6f1a69352f Fuse experts bias in top_k_moe kernel (#1170)
* GLM-4.7-Flash support

* Model type

* Make FA work for mla != 0

* Fuse bias in top_k_moe kernel if present
2026-01-20 15:38:51 +02:00