muse-coder
|
91230dcca8
|
[FIX] Correct JIT kernel compilation on newer GPUs with outdated driver metadata. (#18496)
Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com>
|
2026-02-15 12:14:39 +08:00 |
|
Bhavneek Singh
|
1ce3420784
|
Model: Support IBM Granite (Dense/Mamba + MoE) (#18040)
|
2026-02-15 11:24:41 +08:00 |
|
Lianmin Zheng
|
b33769786f
|
[Auto Sync] Update grpc_request_manager.py, tokenizer_manag... (20260214) (#18838)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
|
2026-02-14 18:12:32 -08:00 |
|
Guangda Liu
|
190fa8246f
|
Fix model loading for DeepSeek-V3.2-AWQ (#16907)
Co-authored-by: Guangda Liu <bingps@users.noreply.github.com>
|
2026-02-15 09:39:53 +08:00 |
|
Lianmin Zheng
|
8b2020584c
|
[Auto Sync] Update test_deterministic.py (20260214) (#18839)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jiayi Yuan <34369239+jy-yuan@users.noreply.github.com>
|
2026-02-14 17:19:30 -08:00 |
|
Xiaoyu Zhang
|
4067d9487d
|
[diffusion] feat: opt vae decode with channels_last_3d (#18540)
|
2026-02-14 23:19:45 +08:00 |
|
Xiaoyu Zhang
|
c29394e3c8
|
[kernel slimming] Move fast_hadamard_transform to jit_kernel (#18475)
|
2026-02-14 23:06:21 +08:00 |
|
Kangyan-Zhou
|
ae95869292
|
Enable SGLANG_ENABLE_SPEC_V2 for nightly speculative decoding tests (#18719)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
|
2026-02-14 23:00:33 +08:00 |
|
Raayan Dhar
|
92cdd398cd
|
feat: Support mrope_section with rope_type: "yarn" (#13313)
Signed-off-by: Raayan Dhar raayan.dhar@gmail.com <raayan.dhar@gmail.com>
Signed-off-by: raayandhar <raayan.dhar@gmail.com>
|
2026-02-14 22:51:44 +08:00 |
|
Ke Bao
|
f51e9d9ca1
|
Add ci test for ring model (#18829)
|
2026-02-14 22:20:23 +08:00 |
|
ybyang
|
c8aa2a6534
|
Fix dsv32 encode_messages (#18126)
|
2026-02-14 16:44:13 +08:00 |
|
Johnsonms
|
34132d6da5
|
Kernel: optimize decoding metadata in NSA multi-spec backend with fused kernels (#17554)
|
2026-02-14 16:40:15 +08:00 |
|
Yuan Luo
|
fa0ef6e4f7
|
[VLM][LLM] Optimize fused_moe triton kernel tma (#18782)
Co-authored-by: luoyuan.luo <luoyuan.luo@antgroup.com>
|
2026-02-14 14:35:26 +08:00 |
|
JD
|
f6c18c3a85
|
Fix/partial gen from waiting queue miss metadata (#17610)
|
2026-02-13 19:04:08 -08:00 |
|
R0CKSTAR
|
45a4697d45
|
[diffusion][MUSA] fix: MUSA platform breakage caused by PR #13662 (#18456)
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
|
2026-02-14 11:00:39 +08:00 |
|
qmzznbxhl
|
066b0b70d9
|
Handle abort for retracted requests in disagg decode prealloc queue (#18705)
Co-authored-by: sunhailiang <sunhailiang@baidu.com>
Co-authored-by: Liangsheng Yin <lsyincs@gmail.com>
|
2026-02-13 18:39:39 -08:00 |
|
shuwenn
|
bd39de7d5e
|
[Env] centralize hicache vars in environ.py (#17204)
|
2026-02-13 18:02:31 -08:00 |
|
Liangsheng Yin
|
dcea74d63f
|
Add timeout abort kits for normal / eagle. (#18815)
|
2026-02-13 17:57:30 -08:00 |
|
Liangsheng Yin
|
4474fb98b4
|
[PD-Disagg] Fix double free when prebuilt batch is aborted. (#18822)
|
2026-02-13 17:46:35 -08:00 |
|
Leon Gao
|
ab0fb248fd
|
feat: add SGLANG_DISTRIBUTED_INIT_METHOD_OVERRIDE env var (#18743)
|
2026-02-14 09:37:33 +08:00 |
|
Minglei Zhu
|
8be18c655d
|
[Perf] refactor piecewise cuda graph support of Qwen3-Next (#17613)
|
2026-02-14 09:30:50 +08:00 |
|
shuwenn
|
3299c4f9c1
|
[CI] feat: add early exit to wait_for_server when process dies (#18602)
|
2026-02-13 16:46:09 -08:00 |
|
Mohammad Miadh Angkad
|
1be41e9036
|
[FlashInfer] Bump FlashInfer version from 0.6.2 to 0.6.3 (#18448)
|
2026-02-14 07:43:33 +08:00 |
|
JD
|
191d354f53
|
fix double-free kv cache for requests that have already finished and been freed during preemption (#18694)
|
2026-02-13 13:17:44 -08:00 |
|
Lianmin Zheng
|
008ea46af1
|
[Auto Sync] Update loader.py, weight_utils.py (20260213) (#18779)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Xiuyu Li <xiuyu@x.ai>
Co-authored-by: Cheng Wan <54331508+ch-wan@users.noreply.github.com>
|
2026-02-13 12:22:50 -08:00 |
|
Qi Jia
|
4c6afbeeaa
|
[bugfix] fix mamba slot leak when scheduling fails with radix cache (#15840) (#16067)
Co-authored-by: yizhang2077 <1109276519@qq.com>
|
2026-02-13 23:43:57 +08:00 |
|
dongjiyingdjy
|
8b4c364960
|
refactor context parallel state (#17213)
Co-authored-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
|
2026-02-13 23:18:17 +08:00 |
|
Linyu Wu
|
0012d6a4eb
|
[Kernel Slimming] Migrate GPTQ-Marlin repack kernel to JIT (#18543)
Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com>
|
2026-02-13 22:29:22 +08:00 |
|
Mick
|
37273408eb
|
[diffusion] chore: use batched P2P ops in VAE parallel decoding (#18728)
|
2026-02-13 22:11:20 +08:00 |
|
triple-mu
|
acc940d302
|
[diffusion] fix typo (#18790)
|
2026-02-13 21:59:39 +08:00 |
|
R0CKSTAR
|
07633349c9
|
[diffusion] fix: webui task_type check (#18462)
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
Co-authored-by: Mick <mickjagger19@icloud.com>
|
2026-02-13 21:19:16 +08:00 |
|
Mick
|
efdd676d56
|
[diffusion] refactor: merge redundant default_dtype and param_dtype parameters in FSDP loader (#18789)
|
2026-02-13 21:18:02 +08:00 |
|
Kaixi
|
98ad284ebf
|
Added cuda availability guard (#18480)
|
2026-02-13 20:18:34 +08:00 |
|
Ke Bao
|
a0ebaa6498
|
Cleanup debug log for Ring model (#18793)
|
2026-02-13 18:36:20 +08:00 |
|
Ke Bao
|
eacab2868a
|
Adjust mamba cache allocation (#18786)
|
2026-02-13 18:06:23 +08:00 |
|
Yinghai Lu
|
e4b2b57620
|
[schedule] Fix streaming return of customized_info (#18654)
|
2026-02-13 17:19:16 +08:00 |
|
Xinwei Qiang
|
356e338607
|
[diffusion] feat: support SparseVideoGen2 attention backend (#17507)
Co-authored-by: Mick <mickjagger19@icloud.com>
|
2026-02-13 16:20:46 +08:00 |
|
ant-yy
|
d97eb111a3
|
Support LingV2_5 model (#18598)
Co-authored-by: zhangkaihong.zkh <zhangkaihong.zkh@antgroup.com>
Co-authored-by: 有禾 <zhangdonghao.zdh@antgroup.com>
Co-authored-by: yudian0504 <138860534+yudian0504@users.noreply.github.com>
Co-authored-by: 悠扬 <youyang.zmy@antgroup.com>
Co-authored-by: xinxingyang <xinxing.yangxx@antgroup.com>
Co-authored-by: zmy460290 <zmy460290@antgroup.com>
|
2026-02-13 16:09:15 +08:00 |
|
Xiaoyu Zhang
|
013a199bc6
|
[CI] Skip cutedsl gdn performance test in jit_kernel ci (#18783)
|
2026-02-13 15:49:30 +08:00 |
|
Shangming Cai
|
1f39bf6523
|
[Bugfix] Add warnings when NSA indexer cache indice mismatch in PD module (#18727)
|
2026-02-13 15:20:05 +08:00 |
|
Liangsheng Yin
|
e6f7a372ef
|
Rename request timeout env vars for waiting/running stages (#18766)
|
2026-02-12 22:58:40 -08:00 |
|
xiaoye
|
5700b19cbf
|
[diffusion] feat: support tp for qwen-image-edit-2511 (#18619)
Co-authored-by: Mick <mickjagger19@icloud.com>
|
2026-02-13 13:04:29 +08:00 |
|
Liangsheng Yin
|
d29e331491
|
[Spec] Move forward timeout before verify to fix Eagle v1 filter mismatch (#18760)
|
2026-02-12 20:58:34 -08:00 |
|
pansicheng
|
7d4ae057ec
|
[Kernel] Add JIT rotary_embedding_kernel (#17934)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com>
Co-authored-by: root <root@zhikuan-A10x2.ea134>
|
2026-02-13 12:41:25 +08:00 |
|
Bhavneek Singh
|
32e0286829
|
[diffusion] fix: fix local model loading issue in bench_serving (#18687)
Co-authored-by: Bhavneek Singh <blazingbhavneek@Bhavneeks-MacBook-Air.local>
Co-authored-by: Mick <mickjagger19@icloud.com>
|
2026-02-13 11:57:52 +08:00 |
|
HuangJi
|
f4d80f9d42
|
[diffusion] feat: allows quality adjustment of generated images/videos (#17937)
|
2026-02-13 11:56:20 +08:00 |
|
Bingxu Chen
|
6555b2a71d
|
[diffusion] fix: fix ci on amd (#18716)
Co-authored-by: Mick <mickjagger19@icloud.com>
|
2026-02-13 11:51:24 +08:00 |
|
Lianmin Zheng
|
c56a5efbaa
|
[Auto Sync] Update grok.py (20260213) (#18765)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Cheng Wan <54331508+ch-wan@users.noreply.github.com>
|
2026-02-12 18:41:41 -08:00 |
|
Lianmin Zheng
|
d5f66fec15
|
Revert changes to weight_utils.py (#18759)
|
2026-02-12 17:15:16 -08:00 |
|
Alison Shao
|
dd77bd4651
|
Fix invalid import paths in glm_image.py (#18757)
|
2026-02-12 16:44:34 -08:00 |
|