Default Branch

30381fc1fc · Faster hybrid inference when shared experts (#1191) · Updated 2026-01-26 05:22:05 +00:00

Branches

2de3a96510 · Avoid computing the attention reduce op for cohere2 · Updated 2025-12-24 10:14:58 +00:00 · 4147 behind / 4076 ahead

172f9dad4c · WIP: fix sm layer (MoE) · Updated 2025-12-21 16:12:03 +00:00 · 75 behind / 9 ahead

706341d15b · nccl: second attempt, not working · Updated 2025-12-21 05:58:50 +00:00 · 75 behind / 7 ahead

e28148d401 · WIP · Updated 2025-12-20 06:50:58 +00:00 · 75 behind / 6 ahead

64908da772 · cuda: set device to src device before p2p copy · Updated 2025-12-17 11:43:36 +00:00 · 76 behind / 1 ahead
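
The entry above names a standard CUDA multi-GPU pattern: the calling thread's current device should be the source device when peer access is enabled and a peer-to-peer copy is issued. Below is a minimal, self-contained sketch of that general pattern; it is an illustration, not code from this repository, and the device indices and buffer size are made up. Error checking is omitted for brevity.

    // Sketch of the pattern named above: make the *source* GPU the
    // current device before enabling peer access and issuing the
    // peer-to-peer copy. Illustrative only; assumes two visible GPUs.
    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        const int src_dev = 0, dst_dev = 1;   // assumed device indices
        const size_t nbytes = 1u << 20;

        float *src = nullptr, *dst = nullptr;

        cudaSetDevice(dst_dev);               // allocate destination buffer
        cudaMalloc(&dst, nbytes);

        cudaSetDevice(src_dev);               // allocate source buffer
        cudaMalloc(&src, nbytes);

        int can_access = 0;
        cudaDeviceCanAccessPeer(&can_access, src_dev, dst_dev);
        if (can_access) {
            // Must run while src_dev is current: it grants the current
            // device access to dst_dev's memory.
            cudaDeviceEnablePeerAccess(dst_dev, 0);
        }

        // The point of the commit title: switch to the source device
        // before launching the peer copy.
        cudaSetDevice(src_dev);
        cudaMemcpyPeerAsync(dst, dst_dev, src, src_dev, nbytes, 0);
        cudaDeviceSynchronize();

        printf("copied %zu bytes from GPU %d to GPU %d\n",
               nbytes, src_dev, dst_dev);

        cudaFree(src);                        // src_dev is still current
        cudaSetDevice(dst_dev);
        cudaFree(dst);
        return 0;
    }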

0864655a72 · Disable split scheduling with tensor overrides · Updated 2025-12-17 06:38:18 +00:00 · 4147 behind / 4076 ahead

5a731064e6 · Much better TG speed with split mode "graph" · Updated 2025-12-15 13:53:35 +00:00 · 81 behind / 1 ahead

664a529332 · Use actual active number of layers when preparing splits · Updated 2025-12-14 06:41:41 +00:00 · 83 behind / 1 ahead

f81c0b7fa0 · WIP · Updated 2025-12-13 17:43:17 +00:00 · 4147 behind / 4071 ahead

d82ed383ce · Fix sync logic · Updated 2025-12-13 17:39:42 +00:00 · 4147 behind / 4063 ahead

72af525c9f · Undo sync reduction · Updated 2025-12-13 15:57:07 +00:00 · 4147 behind / 4062 ahead

082545b3f0 · Do not use split mode graph scheduling if there are tensor overrides · Updated 2025-12-12 13:36:02 +00:00 · 4147 behind / 4061 ahead

50fbde85dc · Fix overflow in offset calculation in mmq · Updated 2025-12-12 13:22:02 +00:00 · 4147 behind / 4060 ahead
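
"Fix overflow in offset calculation in mmq" points at a classic bug in code addressing very large tensors: multiplying two 32-bit ints to form an offset wraps around before the result is widened. A generic sketch of the bug and the usual fix (widen an operand before multiplying) follows; the variable names and values are illustrative, not taken from the actual mmq kernels.

    #include <cstdint>
    #include <cstdio>

    int main() {
        int row   = 70000;   // illustrative: a row index in a large matrix
        int ncols = 65536;   // illustrative: elements per row

        // Bug pattern: the product is computed in 32-bit int and wraps
        // (formally, signed overflow is undefined behavior) before it
        // is stored in the wider type.
        int64_t bad = int64_t(row * ncols);

        // Fix: widen one operand first so the multiply happens in 64 bits.
        int64_t good = int64_t(row) * ncols;

        printf("bad  = %lld\ngood = %lld\n",
               (long long) bad, (long long) good);
        return 0;
    }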

643cccd2c8 · This is better · Updated 2025-12-12 06:23:39 +00:00 · 4147 behind / 4060 ahead

ca1e7070f6 · Be able to enable or disable P2P via command line argument · Updated 2025-12-11 17:46:54 +00:00 · 4147 behind / 4058 ahead

e094f32467 · Fix #1055 · Updated 2025-12-11 13:26:41 +00:00 · 4147 behind / 4057 ahead

b41b17943d · Fix the fix · Updated 2025-12-11 07:03:52 +00:00 · 4147 behind / 4054 ahead

c953b47266 · Be able to set a max. number of GPUs to be used in split mode graph · Updated 2025-12-11 06:21:42 +00:00 · 4147 behind / 4054 ahead

b37fafdc39 · Fix llama-bench - missing buffer override comparison operator · Updated 2025-12-11 06:18:45 +00:00 · 4147 behind / 4053 ahead

b0cc63bcdf · Another attempt for sm graph · Updated 2025-12-09 19:30:06 +00:00 · 97 behind / 3 ahead