Default Branch

69fdd041c1 · Remove forgotten unused code · Updated 2026-01-26 12:54:21 +00:00

Branches

38eb7fa499 · q6_0: this is slightly better · Updated 2024-10-02 15:07:55 +00:00    ikawrakow

4149
3451

a8e932b734 · Fused y*unary(x) op: Metal · Updated 2024-10-02 13:51:29 +00:00    ikawrakow

4149
3452

037bbd2d58 · q6_0: can now be used for kv-cache on Metal · Updated 2024-10-02 11:54:25 +00:00    ikawrakow

4149
3457

1fb3115412 · iq4_nl: faster quantization · Updated 2024-10-02 04:43:09 +00:00    ikawrakow

4149
3447

5b6999970e · Fix Q5_0 flash attention · Updated 2024-10-01 12:49:03 +00:00    ikawrakow

4149
3446

09789d017f · Be able to use IQ4_NL for KV cache on ARM_NEON · Updated 2024-10-01 11:43:33 +00:00    ikawrakow

4149
3445

f265260f23 · Merge remote-tracking branch 'origin/main' into ik/cuda_faster_iq4nl_kvcache · Updated 2024-10-01 09:26:53 +00:00    ikawrakow

4149
3445

a6b097c1b1 · Fix AVX2 · Updated 2024-10-01 07:54:58 +00:00    ikawrakow

4149
3443

cd1002670c · POC SVD: try involving the quantized weights. · Updated 2024-10-01 05:58:42 +00:00    ikawrakow

4149
3448

5f3f3bb09e · iqk_mul_mat: better srategy when nrc_y not divisible by ny · Updated 2024-10-01 05:12:29 +00:00    ikawrakow

4149
3441

d12d0e9b04 · Allow bf16 kv-cache · Updated 2024-09-29 05:42:33 +00:00    ikawrakow

4149
3440

c294485f45 · Time to fix replace_all · Updated 2024-09-28 14:43:54 +00:00    ikawrakow

4149
3439

147f9606d0 · CUDA non-contiguous RoPE · Updated 2024-09-28 11:37:28 +00:00    ikawrakow

4149
3438

05cb629007 · GGML_UNARY_OP_SWIGLU: cleanup · Updated 2024-09-28 10:36:27 +00:00    ikawrakow

4149
3441

a8f37b61ee · Better sub-3-bit quantization mixes with a qkv tensor · Updated 2024-09-28 05:09:42 +00:00    ikawrakow

4149
3436

d913611605 · Play with barriers · Updated 2024-09-25 16:04:11 +00:00    ikawrakow

4149
3437

0bade93228 · Update IQ1_TN and IQ2_TN bpw shown to user · Updated 2024-09-25 10:27:39 +00:00    ikawrakow

4149
3443

95d9f3c103 · Use fp32 for K*Q in Metal FA implementation · Updated 2024-09-25 10:04:10 +00:00    ikawrakow

4149
3434

75ac624a7a · Fix warnings in iqk_quantize.cpp · Updated 2024-09-17 11:22:37 +00:00    ikawrakow

4149
3435

5065dcd4a0 · Playing with hsums · Updated 2024-09-17 09:12:54 +00:00    ikawrakow

4149
3435