449 Commits

Author SHA1 Message Date
Liangsheng Yin
1f2da824dd [Benchmark] Remove re-exports from bench_serving.py (#19130) 2026-02-21 14:30:30 -08:00
Alison Shao
f9c3def7fe Fix CI: add flashinfer --download-cubin to install dependencies (#18887)
Co-authored-by: Liangsheng Yin <lsyincs@gmail.com>
2026-02-16 13:50:10 -08:00
Douglas Yang
f1efb46bdd fix: adding performance logging for nightly diffusion (#18023) 2026-02-16 14:09:00 +08:00
SoluMilken
07a24f1a38 update pre-commit config (#18860) 2026-02-16 00:18:31 +08:00
Lianmin Zheng
b33769786f [Auto Sync] Update grpc_request_manager.py, tokenizer_manag... (20260214) (#18838)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2026-02-14 18:12:32 -08:00
shuwenn
3299c4f9c1 [CI] feat: add early exit to wait_for_server when process dies (#18602) 2026-02-13 16:46:09 -08:00
Kangyan-Zhou
eccf875d49 [CI] Revive 8-GPU trace upload in nightly test workflow (#18820)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 08:37:08 +08:00
Mohammad Miadh Angkad
1be41e9036 [FlashInfer] Bump FlashInfer version from 0.6.2 to 0.6.3 (#18448) 2026-02-14 07:43:33 +08:00
Kangyan-Zhou
710d873ba6 Update notified user in post_ci_failures_to_slack.py (#18817) 2026-02-14 06:48:56 +08:00
Ke Bao
a6c4b52ac5 Cleanup unused rerun stages (#18788) 2026-02-13 17:44:42 +08:00
Lianmin Zheng
9815ee934c [Auto Sync] Update weight_utils.py (20260212) (#18692)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Dan Zheng <dzheng@x.ai>
2026-02-12 16:26:05 -08:00
Kangyan-Zhou
1b8f68af57 Fix B200 installation issue (#18725) 2026-02-12 22:06:23 +08:00
Alison Shao
f20b1703ce [CI] Fix torchaudio/torchvision CUDA version mismatch (#18211) 2026-02-11 23:47:32 -08:00
YC Tseng
20554a0a4f [AMD] rocm 7.2 image release, PR test, Nightly Test (#17799)
Co-authored-by: Alan Kao <akao@amd.com>
Co-authored-by: bingxche <Bingxu.Chen@amd.com>
Co-authored-by: Michael <13900043+michaelzhang-ai@users.noreply.github.com>
2026-02-11 21:29:25 -08:00
Alison Shao
7eaf866846 [CI] Install python3-dev for Triton JIT compilation on fresh runners (#18644) 2026-02-11 16:28:57 -08:00
Alison Shao
dcc63dc545 [CI] Guard python3 call in install script for fresh runners (#18609) 2026-02-12 00:05:29 +08:00
Bingxu Chen
316f9cbb35 [AMD] add amd ci monitor (#17476)
Co-authored-by: michaelzhang-ai <michaelzhang-ai@users.noreply.github.com>
Co-authored-by: YC Tseng <yctseng@amd.com>
2026-02-09 09:04:54 -08:00
YC Tseng
28717e3d28 [AMD] CI - Fix AMD daily image release and install dependency (#18452)
Co-authored-by: Bingxu Chen <bingxche@amd.com>
2026-02-08 22:20:09 -08:00
Bingxu Chen
3f3c201243 [AMD] Update aiter to v0.1.10.post2 (#18423)
Co-authored-by: kkHuang-amd <wunhuang@amd.com>
Co-authored-by: YC Tseng <yctseng@amd.com>
2026-02-08 22:08:24 -08:00
Shangming Cai
52401bec1d chore: bump mooncake version to 0.3.9 (#18316)
Signed-off-by: Shangming Cai <csmthu@gmail.com>
2026-02-07 17:30:01 +08:00
Alison Shao
bedade1ef0 Merge stage-c-test-large-4-gpu suites into partitioned suites (#18325) 2026-02-06 15:32:33 -08:00
Zhaoyi Li
8e933e1914 AMD PD/D PR ci (#17183)
Co-authored-by: YC Tseng <yctseng@amd.com>
Co-authored-by: Bingxu Chen <bingxche@amd.com>
Co-authored-by: bingxche <Bingxu.Chen@amd.com>
2026-02-02 23:29:14 -08:00
sunxxuns
47592a23c7 [CI] Fix AMD CI by inlining dummy_grok config (#18044)
Co-authored-by: root <root@mi300x8-005.atl1.do.cpe.ice.amd.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-01 00:20:57 -08:00
Kangyan-Zhou
e5ac6229e1 Fix installation script for H200 runners (#18050) 2026-01-31 23:30:51 -08:00
Alison Shao
a0bae4c343 Migrate 4-GPU/8-GPU workflow jobs to stage-c and add CI registry decorators (#17299) 2026-01-31 22:37:22 -08:00
cswuyg
33c053c50c fix(benchmark): add missing args for speculative decoding benchmark (#17974) 2026-01-29 23:05:42 -08:00
Kangyan-Zhou
2cd2c3118d Add concurrency tracking to runner utilization report (#17963)
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 17:31:55 -08:00
Alison Shao
1f75c2af4d Fix /tag-and-rerun-ci to do full rerun when PR has sgl-kernel changes (#17729) 2026-01-29 12:54:30 -08:00
Kangyan-Zhou
c0b4dd68a2 Add a performance dashboard server and frontend for nightly CUDA tests (#17725) 2026-01-27 22:22:33 -08:00
YC Tseng
52bca42870 [AMD] CI - enable deepseekv3.2 on MI325-8gpu and merge perf/accuracy test suites into stage-b suites (#17633)
Co-authored-by: Bingxu Chen <Bingxu.Chen@amd.com>
2026-01-27 18:54:36 -08:00
Hubert Lu
93423ff780 [AMD] Deprecate ROCm 6.3 artifacts and standardize gfx942 on ROCm 7 (#17785) 2026-01-27 15:58:49 -08:00
Liangsheng Yin
8278ef0e68 Pass GPU ids to kill specified devices in script. (#17840)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-01-27 13:52:55 -08:00
monkeyLoveding
d578b41bad [NPU] Adapt cann 8.5: use sfa and lightning indexer op from cann and CI update (#17615)
Co-authored-by: Kelon <kelonlu@163.com>
2026-01-27 19:03:53 +08:00
Douglas Yang
8643fb2f52 fix: remove truncation for test and job names in ci failure monitor (#17765) 2026-01-26 09:46:33 -08:00
Makcum888e
bba6e38ff8 [NPU] Split pyproject npu from pyproject other (#17641) 2026-01-26 09:45:44 -08:00
Douglas Yang
51d139b867 fix: move nightly whl to cuda version folder (#17762) 2026-01-27 00:13:46 +08:00
shaharmor98
f6f1b6d000 Bump FI version (#17700)
Signed-off-by: Shahar Mor <smor@nvidia.com>
Co-authored-by: b8zhong <b8zhong@uwaterloo.ca>
2026-01-26 16:50:06 +08:00
Alison Shao
7b22b8ff8a Fix sgl-kernel install: fail instead of PyPI fallback when artifacts missing (#17728) 2026-01-26 11:46:49 +08:00
Kangyan-Zhou
344eeaee90 Upload nightly test metrics to GH artifacts (#17696) 2026-01-25 14:35:14 -08:00
Makcum888e
64d809937a revert row from https://github.com/sgl-project/sglang/pull/17584/ (#17701) 2026-01-25 12:17:47 +03:00
Makcum888e
d1042e0d62 [Refactore] [CI] Remove redundant CI test runs step 2 (#17584) 2026-01-24 23:39:48 -08:00
Alison Shao
b23470e95a Fix CI install failure when rerunning tests via workflow_dispatch (#17612) 2026-01-23 00:04:16 -08:00
YC Tseng
04a10c9bc2 [AMD] CI - migrate perf test and fix stage-b-test-1-gpu-amd (#17340)
Co-authored-by: Bingxu Chen <bingxche@amd.com>
Co-authored-by: bingxche <Bingxu.Chen@amd.com>
Co-authored-by: michaelzhang-ai <michaelzhang.ai@users.noreply.github.com>
2026-01-22 18:45:05 -08:00
Michael
a3addd6203 [AMD] Add DeepSeek-V3.2 and VLMs model in nightly tests (#17179)
Co-authored-by: michaelzhang-ai <michaelzhang-ai@users.noreply.github.com>
Co-authored-by: YC Tseng <yctseng@amd.com>
Co-authored-by: Bingxu Chen <bingxche@amd.com>
2026-01-19 20:31:56 -08:00
Alison Shao
fb88fb672e fix(ci): rate limit and permission errors in trace publishing (#17238) 2026-01-18 23:20:22 -08:00
Lianmin Zheng
fc4b932f4e Update code sync scripts (#17319) 2026-01-18 19:57:28 -08:00
Alison Shao
7edb06158e Add runner utilization report workflow (#17234) 2026-01-17 19:28:05 -08:00
fzyzcjy
a7b5f75d88 Support integration tests with Redis binary (#17045) 2026-01-17 11:59:04 +08:00
Douglas Yang
d2ec128bbf fix: ci failure monitor reorganization (#17165) 2026-01-16 13:25:13 -08:00
Alison Shao
b4fce9955a Add CI Coverage Overview workflow with detailed test listings (#16842) 2026-01-16 09:42:50 -08:00