Default Branch

69fdd041c1 · Remove forgotten unused code · Updated 2026-01-26 12:54:21 +00:00

Branches

8957ff4963 · This fixes it · Updated 2024-10-24 10:16:06 +00:00 · ikawrakow · 4149 | 3473

8834177686 · Granite: avoid NaNs on CUDA by scaling Q before K*Q multiplication · Updated 2024-10-22 09:35:05 +00:00 · ikawrakow · 4149 | 3472

5f3e6faac8 · Enable q6_0 for flash attention · Updated 2024-10-21 13:30:10 +00:00 · ikawrakow · 4149 | 3470

599a2b7806 · Fix typo, which is not really a bug · Updated 2024-10-21 10:12:14 +00:00 · ikawrakow · 4149 | 3474

a3fe796f6c · Bitnet: make the scale tensors optional · Updated 2024-10-19 16:37:33 +00:00 · ikawrakow · 4149 | 3467

0e76d21b96 · Adding agray3's graph caching approach · Updated 2024-10-18 15:01:08 +00:00 · ikawrakow · 4149 | 3465

e732da1f57 · Attempt to blindly fix Windows build failure · Updated 2024-10-18 09:35:47 +00:00 · ikawrakow · 4149 | 3465

c4292bf2d9 · iq4_knn: Metal - predictably bad · Updated 2024-10-18 08:48:00 +00:00 · ikawrakow · 4149 | 3468

9612cd79d6 · iq4_kss: very slightly faster Metal dot product · Updated 2024-10-16 12:08:15 +00:00 · ikawrakow · 4149 | 3473

3e0c2519d3 · iq4_ks: faster dot product on Metal · Updated 2024-10-16 11:04:59 +00:00 · ikawrakow · 4149 | 3462

55f91a98f1 · iq3_k: slightly faster Metal dot product · Updated 2024-10-14 07:41:26 +00:00 · ikawrakow · 4149 | 3461

f74905d649 · iq2_k: optimize Metal dot product · Updated 2024-10-13 11:09:53 +00:00 · ikawrakow · 4149 | 3461

f9f15c27b6 · iq2_ks: faster Metal · Updated 2024-10-13 09:23:14 +00:00 · ikawrakow · 4149 | 3470

e441c897a4 · Better model info · Updated 2024-10-10 14:38:59 +00:00 · ikawrakow · 4149 | 3456

e734e888e1 · iq3_ks: AVX2 · Updated 2024-10-10 07:48:42 +00:00 · ikawrakow · 4149 | 3463

f61c37967a · iq3_kl: use iq4_ks instead of iq4_k/iq4_xs · Updated 2024-10-09 09:50:43 +00:00 · ikawrakow · 4149 | 3467

df2bd86a31 · WIP · Updated 2024-10-06 06:09:51 +00:00 · ikawrakow · 4149 | 3458

acaa4869af · Move scale fudge factors to quantization · Updated 2024-10-04 13:14:52 +00:00 · ikawrakow · 4149 | 3453

a553eb191a · Make the entire project c++17 · Updated 2024-10-04 11:23:21 +00:00 · ikawrakow · 4149 | 3453

ed477f1cdc · Do not quantize activations if not necessary also for MoE models · Updated 2024-10-04 08:11:02 +00:00 · ikawrakow · 4149 | 3452