Default Branch

30381fc1fc · Faster hybrid inference when shared experts (#1191) · Updated 2026-01-26 05:22:05 +00:00

Branches

5c1c0e2bad · Prevent using NCCL if graph reduce type is bf16 and arch < AMPERE · Updated 2026-01-19 09:25:20 +00:00 · 21 behind, 2 ahead

ae5c269371 · More models · Updated 2026-01-18 13:37:40 +00:00 · 22 behind, 4 ahead

fb5c340e17 · Copy reduce result to other GPUs if necessary · Updated 2026-01-18 07:00:06 +00:00 · 22 behind, 1 ahead

73b8fea90b · This finally works · Updated 2026-01-17 17:25:57 +00:00 · 25 behind, 2 ahead

02aa65009b · fix test build error · Updated 2026-01-17 16:04:42 +00:00 · 27 behind, 5 ahead

c2eed98296 · update description · Updated 2026-01-17 00:52:52 +00:00 · 27 behind, 2 ahead

c6c890e164 · WIP - still deadlocking · Updated 2026-01-16 15:07:23 +00:00 · 27 behind, 5 ahead

4730b3e1f0 · printf cleanup · Updated 2026-01-15 14:33:54 +00:00 · 27 behind, 4 ahead

e65782de67 · Fix experts/shared experts split · Updated 2026-01-14 13:26:09 +00:00 · 28 behind, 1 ahead

4fd797c863 · Make adding tensor overrides to llama-bench table optional · Updated 2026-01-13 08:55:38 +00:00 · 31 behind, 1 ahead

81c466835d · Add -sas, --scheduler-async to llama-bench · Updated 2026-01-13 08:21:44 +00:00 · 32 behind, 1 ahead

a50bd821ec · Also Qwen3VL-MoE · Updated 2026-01-12 16:52:15 +00:00 · 38 behind, 4 ahead

5d0123313a · All the others · Updated 2026-01-12 16:22:53 +00:00 · 40 behind, 16 ahead

905bca2e1c · Cleanup · Updated 2026-01-12 13:28:06 +00:00 · 40 behind, 13 ahead

738dc60b78 · We don't need these · Updated 2026-01-10 15:32:21 +00:00 · 40 behind, 0 ahead · Included

1ee36144a8 · WIP - something is wrong · Updated 2026-01-10 13:17:22 +00:00 · 41 behind, 1 ahead

d329029dde · Fix mla = 0 · Updated 2026-01-10 08:27:57 +00:00 · 42 behind, 1 ahead

39e57c1b57 · Update AUTHORS · Updated 2026-01-10 06:09:34 +00:00 · 4147 behind, 4105 ahead

58f3784821 · Fix split mode graph for GPT-OSS with partial offload · Updated 2026-01-09 16:57:30 +00:00 · 4147 behind, 4102 ahead

ae547b8502 · Fix assert when --max-gpu is less than available GPUs · Updated 2026-01-09 11:15:05 +00:00 · 4147 behind, 4102 ahead