yurko | b5c9554a88 | common: add qwen3next fused-delta runtime flag | 2026-02-08 01:15:38 -08:00 |
yurko | bd0dd7804b | docs: reconcile qwen3next status and remaining upstream gaps | 2026-02-08 01:12:40 -08:00 |
yurko | a822db6f18 | qwen3next: add unified regression runner script | 2026-02-08 01:02:40 -08:00 |
yurko | 691df60037 | qwen3next: add absolute sanity guards to fused regression | 2026-02-08 00:54:14 -08:00 |
yurko | 55270b0f98 | qwen3next: integrate fused regression into eval harness | 2026-02-08 00:40:55 -08:00 |
yurko | 44db3947a1 | qwen3next: add fused-delta regression runner script | 2026-02-08 00:13:18 -08:00 |
yurko | 64099e71c0 | qwen3next: make fused delta safe by default and fix fused tensor layout | 2026-02-08 00:06:29 -08:00 |
yurko | 81e788e2f6 | docs: refresh qwen3next perf review and benchmark matrix | 2026-02-07 17:31:17 -08:00 |
yurko | 6db8dc86ca | qwen3next: split cpu/cuda eval builds and tune PP scheduling | 2026-02-06 19:28:17 -08:00 |
Yurko | e64b43392f | cuda: reduce qwen3next moe/ssm sync overhead and refresh eval | 2026-02-06 14:46:59 +00:00 |
yurko | c767cfa1d3 | docs: update qwen3next perf report for cuda MoE/SSM tuning | 2026-02-06 13:52:54 +00:00 |
yurko | 9fbb50481e | qwen3next: optimize broadcast sub and single-seq ssm conv | 2026-02-06 12:50:43 +00:00 |