Commit Graph

8 Commits

Author SHA1 Message Date
Kawrakow
2ef38b56df CPU optimizations 2026-02-24 16:47:38 +00:00
Kawrakow
dc44a37ca2 Simplify/improve CUDA delta-net 2026-02-24 16:47:38 +00:00
Kawrakow
28b31a66b2 Add command line argument for fused delta net 2026-02-24 16:47:38 +00:00
Kawrakow
a350f1b96f Revive fused delta-net 2026-02-24 16:47:38 +00:00
Kawrakow
38ca19d828 Minor delta-net tweak (#1308)
* Make sure we pick the reduced tensor from the right GPU

* Minor

* Minor delta-net tweak
2026-02-24 15:22:57 +01:00
Kawrakow
5dacb5355a Graph parallel for Qwen3-Next (#1292)
* WIP

* This works, but is slower than split mode layer
2026-02-23 07:58:00 +01:00
Kawrakow
13c3d83ce7 Qwen3.5-MoE support (#1288)
* WIP: loads and runs, but not correct

Very high PPL, empty TG.

* This appears to work
2026-02-21 08:33:06 +01:00
Kawrakow
04cf685e82 Factor out delta net (#1286)
* WIP: factor out delta net implementation

* WIP

* Use the standard FFN functions

* More standard attn for Qwen3-Next
2026-02-18 17:16:17 +01:00