ik_llama.cpp/ggml at ik/fix_add_bf16_turing - ik_llama.cpp - Public git mirror

ikawrakow/ik_llama.cpp

mirror of https://github.com/ikawrakow/ik_llama.cpp.git synced 2026-01-26 17:20:01 +00:00

Kawrakow 5c1c0e2bad Prevent using NCCL if graph reduce type is bf16 and arch < AMPERE

2026-01-19 09:25:20 +00:00

Merge mainline llama.cpp (#3)

2024-07-27 07:55:01 +02:00

server: improve speed of speculative decoding (#1119)

2026-01-10 08:01:22 +02:00

Prevent using NCCL if graph reduce type is bf16 and arch < AMPERE

2026-01-19 09:25:20 +00:00

.gitignore

Merge mainline llama.cpp (#3)

2024-07-27 07:55:01 +02:00

CMakeLists.txt

CUDA: compress-mode size (#1110)

2026-01-07 18:33:17 +02:00