ikawrakow / ik_llama.cpp
Mirror of https://github.com/ikawrakow/ik_llama.cpp.git, synced 2026-04-21 15:09:40 +00:00
Files at commit 3132dd368f44c39b1f98dcb998e9ef0c64c7392f

ik_llama.cpp / ggml

Latest commit 3132dd368f by Iwan Kawrakow, 2025-09-23 13:59:44 +03:00:
cuda: fused top_k+softmax as used in most MoE models
Name            Last commit                                               Date
..
cmake           Merge mainline llama.cpp (#3)                             2024-07-27 07:55:01 +02:00
include         Offload only activated experts to the GPU (#698)          2025-09-04 12:22:30 +02:00
src             cuda: fused top_k+softmax as used in most MoE models      2025-09-23 13:59:44 +03:00
.gitignore      Merge mainline llama.cpp (#3)                             2024-07-27 07:55:01 +02:00
CMakeLists.txt  Set default value of GGML_SCHED_MAX_COPIES to 1 (#751)    2025-09-02 07:04:39 +02:00
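
For context on the latest commit to src: "top_k+softmax" refers to the expert-routing step in mixture-of-experts models, where the k largest router logits per token are selected and softmax-normalized to produce routing weights; the commit fuses these two steps into one CUDA kernel. Below is a minimal CPU-side sketch of the unfused computation, assuming the common Mixtral-style convention (softmax applied after top-k selection); all names here are illustrative and are not the actual ggml or ik_llama.cpp API.

    // Sketch of the top_k+softmax pattern used for MoE expert routing.
    #include <algorithm>
    #include <cmath>
    #include <cstdio>
    #include <vector>

    struct ExpertWeight {
        int   expert; // index of the selected expert
        float weight; // normalized routing weight
    };

    // Select the k largest router logits and softmax-normalize them.
    std::vector<ExpertWeight> top_k_softmax(const std::vector<float> & logits, int k) {
        std::vector<int> idx(logits.size());
        for (size_t i = 0; i < idx.size(); ++i) idx[i] = (int) i;

        // partial sort: the k largest logits come first
        std::partial_sort(idx.begin(), idx.begin() + k, idx.end(),
            [&](int a, int b) { return logits[a] > logits[b]; });

        // softmax over the k selected logits, subtracting the max for stability
        float max_logit = logits[idx[0]];
        float sum = 0.0f;
        std::vector<ExpertWeight> out(k);
        for (int j = 0; j < k; ++j) {
            out[j] = { idx[j], std::exp(logits[idx[j]] - max_logit) };
            sum += out[j].weight;
        }
        for (auto & ew : out) ew.weight /= sum;
        return out;
    }

    int main() {
        // router logits for one token over 8 experts
        std::vector<float> logits = { 0.1f, 2.3f, -1.0f, 0.7f, 1.9f, -0.2f, 0.0f, 1.1f };
        for (const auto & ew : top_k_softmax(logits, /*k=*/2)) {
            std::printf("expert %d: weight %.4f\n", ew.expert, ew.weight);
        }
    }

On a GPU, doing selection and normalization in one kernel avoids materializing the full softmax over all experts and saves a kernel launch plus a round trip through global memory, which is presumably the motivation for the fused CUDA implementation referenced in the commit message.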