Default Branch

30381fc1fc · Faster hybrid inference when shared experts (#1191) · Updated 2026-01-26 05:22:05 +00:00

Branches

Each entry lists the branch's latest commit and message, its last-updated time, and its divergence from the default branch (commits behind / commits ahead).

a2f5614529 · Try to split offloaded MoE up/gate up · Updated 2025-12-09 10:09:04 +00:00 · 97 behind / 3 ahead
ccf72a0e46 · Also this · Updated 2025-12-09 06:36:31 +00:00 · 97 behind / 2 ahead
c83d2fd335 · WIP · Updated 2025-12-08 15:44:53 +00:00 · 99 behind / 3 ahead
be8e7057b3 · Handle split cache (read) · Updated 2025-12-08 08:55:35 +00:00 · 98 behind / 2 ahead
0e683f24ad · Fix annoying compiler warnings · Updated 2025-12-06 08:57:50 +00:00 · 100 behind / 1 ahead
a4da6e298a · Automatically disable CUDA graphs for split mode "graph" · Updated 2025-12-05 17:00:58 +00:00 · 101 behind / 1 ahead
b18f658a7d · CUDA: set current device in compute_forward · Updated 2025-12-05 15:40:48 +00:00 · 103 behind / 1 ahead
ed8a3d8e3d · Don't split the output tensor · Updated 2025-12-05 13:16:11 +00:00 · 104 behind / 1 ahead
9264abfbaf · Fix debug build (#1037) · Updated 2025-12-05 13:06:22 +00:00 · 104 behind / 0 ahead · Included
c374b221b6 · Mistral3-large · Updated 2025-12-04 16:05:40 +00:00 · 4147 behind / 4042 ahead
6387a5800a · Minor · Updated 2025-12-04 05:52:05 +00:00 · 106 behind / 2 ahead
9c17d5f176 · WIP: Hadamard transforms for K-cache · Updated 2025-12-03 14:26:46 +00:00 · 107 behind / 1 ahead
ab19054a79 · Use standard attention for Ministral3 · Updated 2025-12-03 10:51:32 +00:00 · 109 behind / 1 ahead
c5f9a5c29a · Fix bug in ggml_cuda_op_scale_tensor · Updated 2025-12-03 10:28:26 +00:00 · 110 behind / 1 ahead
84129f7eb6 · Adding ministral3: this seems to work · Updated 2025-12-03 09:41:44 +00:00 · 4147 behind / 4037 ahead
dde8028336 · WIP: allocate graph · Updated 2025-12-03 07:54:53 +00:00 · 111 behind / 4 ahead
b415e734e5 · Fix also output · Updated 2025-12-03 04:53:44 +00:00 · 111 behind / 3 ahead
49ec5726d7 · Is this better for multi-GPU and split mode "graph"? · Updated 2025-12-02 08:44:46 +00:00 · 112 behind / 1 ahead
c4c266847f · Slightly better graph split strategy · Updated 2025-12-02 08:18:55 +00:00 · 112 behind / 1 ahead
864b496831 · Try to better distribute the splits · Updated 2025-12-01 13:18:56 +00:00 · 113 behind / 32 ahead
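
The divergence counts above can be recomputed from a local clone with plain git. A minimal sketch, assuming both refs are fetched locally; 30381fc1fc is the default-branch head shown above, and the counts will drift as the default branch advances:

```
# Symmetric-difference commit counts between the default-branch head
# and a work branch: the left number is commits reachable only from
# the default branch (how far the branch is behind), the right number
# is commits reachable only from the branch (how far it is ahead).
git rev-list --left-right --count 30381fc1fc...a2f5614529
# Expected to match the listing above: 97 behind, 3 ahead.
```

A branch whose right-hand count is 0 is fully contained in the default branch, which is presumably what the Included marker on 9264abfbaf indicates.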