Author | Commit | Message | Date
Liangsheng Yin | 6cc2eee50d | [misc] CI hygiene: enforce __main__ entry, drop silent-skipped tests, fix rerun-test protoc (#23305) | 2026-04-20 21:16:24 -07:00
Alison Shao | 6b19e8a452 | ci: reduce scheduled PR test from 4x to 3x daily (#23313) | 2026-04-20 20:53:13 -07:00
ishandhanani | 90d527195b | [CI] Fix nightly docker builds failing on root-owned workspace leftovers (#23279); Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> | 2026-04-20 11:56:33 -07:00
YC Yen-Ching Tseng | da62e90904 | [AMD] Fix multimodal timeout issue : rocm7.2 PR Test (#23247) | 2026-04-20 18:36:08 +08:00
YC Yen-Ching Tseng | cf4b84f839 | [AMD] Update AMD workflow name (#23245); Co-authored-by: bingxche <bingxche@amd.com> | 2026-04-20 18:18:24 +08:00
Kangyan-Zhou | 1ebe1c57ed | [CI] Partition stage-a-test-cpu into 4 matrix shards (#23208); Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> | 2026-04-19 22:07:37 -07:00
Alex Nails | 10e17cc55e | [gRPC] Native gRPC server: proto + Rust crate scaffold + server args (#22736) | 2026-04-20 12:39:35 +08:00
Kangyan-Zhou | 1d252803f5 | fix(ci): repair path filters regressed by #21482 (#23201); Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> | 2026-04-19 20:34:57 -07:00
Thomas | 3063d640dd | [CI] Exclude diffusion-specific paths from main_package filter (#23053); Co-authored-by: ronnie_zheng <zl19940307@163.com> | 2026-04-20 10:43:44 +08:00
Cheng Wan | ebcc2b3eec | ci: run weekly est_time update on Monday using p90 of last 15 runs (#23120); Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> | 2026-04-19 14:39:27 -07:00
YC Yen-Ching Tseng | 32b7777f6c | [AMD]Fix AMD multimodal-gen-test-2-gpu timeout by adding partition for standalone test (#23130) | 2026-04-19 23:16:18 +08:00
Baizhou Zhang | 6ecd6f84db | [CI] Add per-job uv venv isolation and upgrade CI version to Cuda 13 (#23119); Co-authored-by: Kangyan Zhou <zky314343421@gmail.com>; Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>; Co-authored-by: Alison Shao <a.shao@wustl.edu>; Co-authored-by: Mick <mickjagger19@icloud.com> | 2026-04-19 05:32:36 -07:00
Xiaoyu Zhang | 83c5119d01 | [diffusion] CI: fix ModelOpt B200 CI artifact coverage (#22955) | 2026-04-17 23:33:42 +08:00
YC Yen-Ching Tseng | f399997d2f | [AMD] mirror nightly images to local registry and prefer LAN pulls (#23073); Co-authored-by: bingxche <bingxche@amd.com> | 2026-04-17 19:49:26 +08:00
YC Yen-Ching Tseng | 8c13295842 | [AMD] fix AMD CI gate (#22974); Co-authored-by: bingxche <bingxche@amd.com> | 2026-04-17 18:32:26 +08:00
Alison Shao | 0052093178 | test(4-gpu-b200): split test_qwen35_models.py + bump partitions 5→6 (#22913) | 2026-04-16 18:51:59 -07:00
Makcum888e | e353630b57 | [Diffusion] [NPU] Fix multimodal gen CI (#22879) | 2026-04-17 04:09:44 +03:00
ishandhanani | f61c332cba | ci: log analyzer (#22859) | 2026-04-15 14:10:00 -07:00
Mick | e95c2e73bd | [diffusion] CI: refactor diffusion ci and reduce redundancy (#22810) | 2026-04-15 10:12:29 +08:00
Michael | 39c6bf730c | [AMD][CI] Add GLM-5-MXFP4 accuracy and perf nightly tests for MI35x (#21773) | 2026-04-14 18:55:36 -07:00
ishandhanani | 2c9e76d333 | ci: skip approval for nightly gb200 runs, keep for manual triggers (#22768) | 2026-04-14 16:34:57 -07:00
Michael | eab045b2b7 | [AMD] Add MiniMax-M2.7 accuracy and performance nightly tests (#22722); Co-authored-by: HaiShaw <hixiao@gmail.com> | 2026-04-14 00:30:11 -07:00
Sahithi Chigurupati | 7c1bde2e38 | [CI] Add optional image input to GB200 nightly workflow_dispatch (#22745); Co-authored-by: ishandhanani <82981111+ishandhanani@users.noreply.github.com> | 2026-04-13 23:57:15 -07:00
YC Yen-Ching Tseng | d44eb16ac6 | [AMD] Replace push trigger with scheduled runs and enable parallel stage execution (#22489); Co-authored-by: bingxche <bingxche@amd.com> | 2026-04-14 13:33:29 +08:00
Sahithi Chigurupati | ff61b2e470 | [CI] Add workflow_dispatch and environment gate to GB200 nightly pipeline (#22733) | 2026-04-13 17:08:18 -07:00
Baizhou Zhang | b441317aa4 | Revert "Upgrade CI default CUDA version from 12.9 to 13.0" (#22727) | 2026-04-13 14:39:24 -07:00
Alison Shao | 3f4fbc165d | Upgrade CI default CUDA version from 12.9 to 13.0 (#21441) | 2026-04-12 21:48:40 -07:00
Prozac614 | 45472d70cc | [diffusion] CI: dynamic load-balanced partitioning for diffusion CI (#15528); Co-authored-by: daiweitao <dwti614707404@163.com>; Co-authored-by: SGLang CI <ci@sglang.ai> | 2026-04-12 13:02:43 +08:00
Alison Shao | 870a21bf39 | [CI] Remove Slack bot from CI failure monitor (#21581); Co-authored-by: Alison Shao <alison.shao@Mac.attlocal.net> | 2026-04-11 20:34:48 -07:00
Baizhou Zhang | 2e70e4f4f6 | [CI] Little renaming of gb200 CI workflow (#22608) | 2026-04-11 17:52:42 -07:00
YC Yen-Ching Tseng | 3ce72252de | [AMD] Fix Timeout: stage-b-test-2-gpu-large-amd,stage-b-test-1-gpu-large-amd (#22228); Co-authored-by: HAI <hixiao@gmail.com> | 2026-04-10 22:55:44 -07:00
Sahithi Chigurupati | 451320596f | [CI] Add GB200 nightly perf regression pipeline (#22461) | 2026-04-10 15:12:24 -07:00
Cheng Wan | 3f39b3d811 | feat: add weekly workflow to update CI test est_time values (#22545); Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> | 2026-04-10 15:03:37 -07:00
Ratish P | cf5ad12612 | [diffusion][CI]: route multimodal component accuracy through run_suite (#21960) | 2026-04-10 23:06:03 +08:00
Alison Shao | b853e2c41b | [CI] Remove Slack notification from ci-auto-bisect workflow (#22483); Co-authored-by: Alison Shao <alison.shao@Mac.lan> | 2026-04-09 20:32:09 -07:00
ishandhanani | 3aaaf53f59 | [Docker] Fix CI docker target after Dockerfile restructure (#22478) | 2026-04-09 18:53:42 -07:00
Michael | ef6bfc1197 | [AMD] Add GLM-5.1-FP8 nightly accuracy and performance benchmarks for MI30x and MI35x (#22336) | 2026-04-08 22:57:43 -07:00
Liangsheng Yin | edfddda192 | Move runai model loader test to nightly suite (#22418) | 2026-04-08 21:39:32 -07:00
tfhddd | c431b11d8b | [CI] Use UV to improve pip install speed (#22029) | 2026-04-09 09:18:32 +08:00
Alison Shao | cf27b11498 | [CI] Increase stage-c-test-4-gpu-b200 partitions from 4 to 5 (#22395); Co-authored-by: Alison Shao <alison.shao@MacBook-Pro-D2W773R9CD.local> | 2026-04-08 16:36:27 -07:00
Kangyan-Zhou | cc8ea08b8b | [CI] Replace upload/download-artifact with job outputs in release-docker-runtime (#22388); Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> | 2026-04-08 16:13:33 -07:00
Xiaoyu Zhang | b5b2dbe05f | [Diffusion] Add diffusion NVFP4 scaled-mm correctness test (#22127); Co-authored-by: Mick <mickjagger19@icloud.com> | 2026-04-08 22:07:24 +08:00
Alex Nails | 931dbceadc | [CI] Set RUNAI_STREAMER_MEMORY_LIMIT=0 for stage-b-test-1-gpu-small (#22346); Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> | 2026-04-08 02:23:35 -07:00
Michael | db60a620db | [AMD] Add GLM-5-FP8 nightly performance benchmarks for MI30x and MI35x (#21710) | 2026-04-07 22:43:14 -07:00
Alison Shao | 86e4542f35 | Use dedicated runner label for deepep 8-GPU tests (#22309); Co-authored-by: Alison Shao <alison.shao@Mac.attlocal.net> | 2026-04-07 19:58:54 -07:00
Liangsheng Yin | 8c3d80eabe | Only upload CUDA coredumps on test failure (#22301) | 2026-04-07 18:07:28 -07:00
Liangsheng Yin | 0e2a0260a1 | Add fast-fail to multimodal-gen CI (#22284) | 2026-04-07 15:56:12 -07:00
Mick | e7bc23cdab | [diffusion] CI: fix consistency check (#22251) | 2026-04-07 23:43:18 +08:00
Michael | ba78f6e0ef | [AMD] Add Qwen3.5-397B FP8 nightly perf benchmarks for MI30x and MI35x (#21669) | 2026-04-06 23:46:00 -07:00
Prozac614 | ef2d4013d7 | [diffusion] CI: add consistency test (#15236); Co-authored-by: daiweitao <dwti614707404@163.com> | 2026-04-07 09:50:23 +08:00