Kawrakow
|
2ef38b56df
|
CPU optimizations
|
2026-02-24 16:47:38 +00:00 |
|
Kawrakow
|
dc44a37ca2
|
Simplify/improve CUDA delta-net
|
2026-02-24 16:47:38 +00:00 |
|
Kawrakow
|
28b31a66b2
|
Add command line argument for fused delta net
|
2026-02-24 16:47:38 +00:00 |
|
Kawrakow
|
a350f1b96f
|
Revive fused delta-net
|
2026-02-24 16:47:38 +00:00 |
|
Kawrakow
|
38ca19d828
|
Minor delta-net tweak (#1308)
* Make sure we pick the reduced tensor from the right GPU
* Minor
* Minor delta-net tweak
|
2026-02-24 15:22:57 +01:00 |
|
Kawrakow
|
5dacb5355a
|
Graph parallel for Qwen3-Next (#1292)
* WIP
* This works, but is slower than split mode layer
|
2026-02-23 07:58:00 +01:00 |
|
Kawrakow
|
13c3d83ce7
|
Qwen3.5-MoE support (#1288)
* WIP: loads and runs, but not correct
  Very high PPL, empty TG.
* This appears to work
|
2026-02-21 08:33:06 +01:00 |
|
Kawrakow
|
04cf685e82
|
Factor out delta net (#1286)
* WIP: factor out delta net implementation
* WIP
* Use the standard FFN functions
* More standard attn for Qwen3-Next
|
2026-02-18 17:16:17 +01:00 |
|