Yuhao Yang
|
2f06867128
|
Optimize MHC pipeline: DeepGemm, fused norm, fused hc_head (#24775)
Co-authored-by: Cheng Wan <chwan@rice.edu>
Co-authored-by: Chunan Zeng <zcnrex@gmail.com>
|
2026-05-10 19:03:37 +08:00 |
|
Baizhou Zhang
|
ef5e9f8aba
|
[DSV4] Cherry pick missing commits from deepseek_v4 branch and enhance tests (#24793)
Co-authored-by: Xinyuan Tong <115166877+JustinTong0323@users.noreply.github.com>
Co-authored-by: yueming-yuan <yym022502@gmail.com>
|
2026-05-09 04:15:37 -07:00 |
|
Liangsheng Yin
|
ba625d5290
|
slash command rerun UX: emoji semantics + result writeback (#24802)
|
2026-05-09 03:19:24 -07:00 |
|
Alison Shao
|
aefd8e257f
|
Re-land #23109: rebase-required mode + fix for grep-no-match abort (#24180)
|
2026-05-08 15:28:57 -07:00 |
|
Alison Shao
|
094b90b1ec
|
ci: drop 1-gpu-h100-h200 shared label (#24495)
|
2026-05-06 01:02:31 -07:00 |
|
Liangsheng Yin
|
53df43d0a3
|
rerun-test: route deepep h200 suite to deepep runner (#24325)
|
2026-05-03 15:57:53 -07:00 |
|
Alison Shao
|
694ef516cb
|
Revert "[ci] split stage-c-test-4-gpu-b200 to enable a low-disk runner pool" (#24163)
Co-authored-by: Alison Shao <alisonshao@radixark.ai>
|
2026-04-30 15:57:19 -07:00 |
|
Kangyan-Zhou
|
d4040e7010
|
[CI] Broaden stage-b-test-1-gpu-large runner pool to H100 + H200 (#24080)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-04-29 12:18:10 -07:00 |
|
shuwenn
|
03147f66b8
|
ci: add /rerun-group to rerun all registered tests in a group (#24023)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-04-29 10:24:16 -07:00 |
|
Kangyan-Zhou
|
c689f774a4
|
[CI] /rerun-stage: fix workflow-run URL lookup for sgl-kernel PRs (#23510)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-04-22 17:48:38 -07:00 |
|
Kangyan-Zhou
|
14ac14287c
|
[CI] /rerun-stage: auto-include wheel build when PR modifies sgl-kernel/ (#23492)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-04-22 11:28:06 -07:00 |
|
Jia Guo
|
286fba2073
|
ci: use rerun_failed_jobs for skipped workflows in /rerun-failed-ci (#23008)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-21 23:59:15 -07:00 |
|
Alison Shao
|
04b1caf75b
|
ci: enable /rerun-test for multimodal gen PR tests (#22828)
|
2026-04-21 21:34:14 -07:00 |
|
Kangyan-Zhou
|
77fd86f89e
|
[ci] split stage-c-test-4-gpu-b200 to enable a low-disk runner pool (#23417)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-04-21 18:33:33 -07:00 |
|
Alison Shao
|
0e165ffbfc
|
ci: enable /rerun-test for nightly test suites (#22830)
|
2026-04-21 18:28:10 -07:00 |
|
Mick
|
e95c2e73bd
|
[diffusion] CI: refactor diffusion ci and reduce redundancy (#22810)
|
2026-04-15 10:12:29 +08:00 |
|
Jia Guo
|
bc16130a17
|
ci: skip full rerun when sgl-kernel wheel already built (#22534)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
|
2026-04-13 20:32:55 -07:00 |
|
Ratish P
|
cf5ad12612
|
[diffusion][CI]: route multimodal component accuracy through run_suite (#21960)
|
2026-04-10 23:06:03 +08:00 |
|
Liangsheng Yin
|
5118295f7b
|
[CI] Support CPU stage and auto-batch same-stage files in /rerun-test (#22081)
|
2026-04-03 15:56:54 -07:00 |
|
Mick
|
838f815e9f
|
[diffusion] CI: temporarily disable accuracy ci (#22031)
|
2026-04-03 17:39:29 +08:00 |
|
Prozac614
|
24997fe42c
|
[diffusion] CI: add initial nvfp4 ci test for b200 (#21767)
Co-authored-by: Mick <mickjagger19@icloud.com>
|
2026-04-02 11:31:08 +08:00 |
|
Ratish P
|
4f5b55e379
|
[diffusion][CI]: Add individual component accuracy CI for diffusion models (#18709)
Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com>
|
2026-04-01 21:51:36 +08:00 |
|
Ke Bao
|
acd37d8701
|
[CI] Fix rerun-test suite detection to skip commented registrations (#21753)
|
2026-03-31 18:00:53 +08:00 |
|
Ke Bao
|
2456889f98
|
Rename rerun-ut to rerun-test (#21747)
|
2026-03-31 17:31:55 +08:00 |
|
Liangsheng Yin
|
e1ee68d0fc
|
Release mm features on session close and support multiple /rerun-ut specs (#21501)
|
2026-03-26 18:31:29 -07:00 |
|
Liangsheng Yin
|
9dc266adb4
|
Fix concurrent /rerun-ut posting duplicate workflow URLs (#21495)
|
2026-03-26 16:26:00 -07:00 |
|
Lianmin Zheng
|
814202704b
|
ci: unify PR test suite naming (#21187)
|
2026-03-23 00:18:45 -07:00 |
|
Liangsheng Yin
|
d9f5c2179c
|
ci(slash-cmd): allow write-permission users to /rerun-ut on fork PRs (#21121)
|
2026-03-22 00:45:48 -07:00 |
|
Liangsheng Yin
|
1e97864d75
|
ci(slash-cmd): allow repo write-permission users to /rerun-ut (#21120)
|
2026-03-22 00:32:29 -07:00 |
|
Alison Shao
|
b7a1ae4fac
|
Fix /rerun-stage dispatch failure for non-AMD stages (#21076)
Co-authored-by: Alison Shao <alison.shao@Mac.attlocal.net>
|
2026-03-20 23:48:29 -07:00 |
|
Lianmin Zheng
|
2d7a262ca3
|
ci: rename 1/2-gpu-runner labels to 1/2-gpu-h100 (#21008)
|
2026-03-20 06:04:15 -07:00 |
|
Lianmin Zheng
|
c1da420799
|
ci: run Stage A CUDA tests as stage-a-test-small-1-gpu on 5090 (#20988)
|
2026-03-20 02:55:16 -07:00 |
|
Liangsheng Yin
|
5f1bfb0d28
|
[Security] Fix /rerun-ut bypassing run-ci gate for fork PRs (#20424)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-03-12 02:24:29 -07:00 |
|
Liangsheng Yin
|
c7ffbf25e9
|
[CI] Fix rerun-ut workflow: add DeepEP install, RDMA env, Blackwell detection (#19803)
|
2026-03-03 15:17:16 -08:00 |
|
Liangsheng Yin
|
1135e214b3
|
[CI] support /rerun-ut command in slash handler (#19800)
|
2026-03-03 14:10:49 -08:00 |
|
Michael
|
1b79934d34
|
[AMD] Fix AMD CI test of TestToolChoiceLfm2Moe (#19113)
Co-authored-by: michaelzhang-ai <michaelzhang-ai@users.noreply.github.com>
Co-authored-by: bingxche <Bingxu.Chen@amd.com>
Co-authored-by: yctseng0211 <yctseng@amd.com>
|
2026-02-27 10:18:15 -08:00 |
|
Alison Shao
|
2c856c6d27
|
Allow PR authors to use /rerun-failed-ci on their own PRs (#19496)
Co-authored-by: Alison Shao <alisonshao@MacBook-Pro-D2W773R9CD.local>
|
2026-02-27 10:14:57 -08:00 |
|
Ke Bao
|
a6c4b52ac5
|
Cleanup unused rerun stages (#18788)
|
2026-02-13 17:44:42 +08:00 |
|
Alison Shao
|
bedade1ef0
|
Merge stage-c-test-large-4-gpu suites into partitioned suites (#18325)
|
2026-02-06 15:32:33 -08:00 |
|
Alison Shao
|
a0bae4c343
|
Migrate 4-GPU/8-GPU workflow jobs to stage-c and add CI registry decorators (#17299)
|
2026-01-31 22:37:22 -08:00 |
|
Alison Shao
|
1f75c2af4d
|
Fix /tag-and-rerun-ci to do full rerun when PR has sgl-kernel changes (#17729)
|
2026-01-29 12:54:30 -08:00 |
|
YC Tseng
|
52bca42870
|
[AMD] CI - enable deepseekv3.2 on MI325-8gpu and merge perf/accuracy test suites into stage-b suites (#17633)
Co-authored-by: Bingxu Chen <Bingxu.Chen@amd.com>
|
2026-01-27 18:54:36 -08:00 |
|
Makcum888e
|
d1042e0d62
|
[Refactor] [CI] Remove redundant CI test runs step 2 (#17584)
|
2026-01-24 23:39:48 -08:00 |
|