Branches - ik_llama.cpp - Public git mirror

ikawrakow/ik_llama.cpp

mirror of https://github.com/ikawrakow/ik_llama.cpp.git synced 2026-01-26 09:09:50 +00:00

main

30381fc1fc · Faster hybrid inference when shared experts (#1191) · Updated 2026-01-26 05:22:05 +00:00

ik/faster_iq3_iq5_quantize de818b77d6 · iq3_k, iq5_k: faster quantization · Updated 2024-08-05 05:13:53 +00:00 · ikawrakow · 4147 3380
ik/faster_iq4k_quantize 30c002d22d · iq4_k: speedup quantization by a factor of ~2 · Updated 2024-08-03 16:32:43 +00:00 · ikawrakow · 4147 3379
ik/iq2_k 7b3b413fe0 · Add copyright notice · Updated 2024-07-31 13:06:32 +00:00 · ikawrakow · 4147 3378
ik/iq4_k b29f64ea70 · iq4_k: scalar dot product · Updated 2024-07-28 10:09:28 +00:00 · ikawrakow · 4147 3355
ik/fuse_mul_mat_scale 473e280500 · Fusing a mat mul op followed by scale op on the CPU · Updated 2024-07-27 07:45:56 +00:00 · ikawrakow · 4147 3349
ik/merge_July_26_2024 573e5007cd · Remove check · Updated 2024-07-26 15:00:26 +00:00 · ikawrakow · 4147 3350
ik/bitnet_token_embedding_gpu_2 ccdb948329 · Offload Bitnet token embeddings to the GPU - the right way · Updated 2024-07-26 10:50:41 +00:00 · ikawrakow · 4147 3346
ik/bitnet_token_embedding_gpu db6b0f6dab · Update README with the new CUDA/Metal performance · Updated 2024-07-26 07:06:22 +00:00 · ikawrakow · 4147 3346
ik/mul_mat_ext 86d94862ae · iqk_soft_max · Updated 2024-07-22 14:34:42 +00:00 · ikawrakow · 4147 3329
ik/new_iq1bn 7024ecfeb4 · iq1bn: faster AVX2 · Updated 2024-07-17 07:17:05 +00:00 · ikawrakow · 4147 3320
