39 Commits

Author SHA1 Message Date
Alison Shao
870a21bf39 [CI] Remove Slack bot from CI failure monitor (#21581)
Co-authored-by: Alison Shao <alison.shao@Mac.attlocal.net>
2026-04-11 20:34:48 -07:00
Kangyan-Zhou
596c34ee04 Update ci_auto_bisect.py to have streak 1 so that all failures will b… (#22161)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 10:39:19 -07:00
Kangyan-Zhou
edee9ae929 Update ci_auto_bisect.py to use correct model (#22142) 2026-04-04 23:57:52 -07:00
Kangyan-Zhou
8cbeacd783 feat: CI auto-bisect workflow for automated regression analysis (#22119)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 18:58:18 -07:00
Douglas Yang
1b45d81e91 fix: only showing recent runners from ci failure analysis (#21015) 2026-03-31 20:18:17 -07:00
Lianmin Zheng
193bbf9b66 chore(ci): remove deprecated CI Monitor workflow (#20993) 2026-03-20 00:05:22 -07:00
SoluMilken
07a24f1a38 update pre-commit config (#18860) 2026-02-16 00:18:31 +08:00
Kangyan-Zhou
710d873ba6 Update notified user in post_ci_failures_to_slack.py (#18817) 2026-02-14 06:48:56 +08:00
Douglas Yang
8643fb2f52 fix: remove truncation for test and job names in ci failure monitor (#17765) 2026-01-26 09:46:33 -08:00
Douglas Yang
d2ec128bbf fix: ci failure monitor reorganization (#17165) 2026-01-16 13:25:13 -08:00
Douglas Yang
a1ed247fd7 feature: add runner online count to failure monitor (#16408) 2026-01-04 13:24:04 -08:00
Douglas Yang
8b111b20c3 feature: improvements to CI failure monitor (#16272) 2026-01-02 20:09:41 -08:00
Douglas Yang
f2ccc44240 fix: improving format and design (#15791) 2025-12-25 13:13:27 -08:00
Douglas Yang
21cfebac65 fix: move ci-bot (#15154) 2025-12-14 22:47:19 -08:00
Douglas Yang
8c96fcda70 feature: ci failure monitor slack bot (#15110)
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-14 13:47:15 -08:00
Qiaolin Yu
729529190d [ci] Move dpsk-r1-fp4 b200 test to stage b (#15084) 2025-12-13 23:14:08 -08:00
Baizhou Zhang
ab3ffd1c8e Add nightly accuracy test for DeepSeek V3.2 (#14935) 2025-12-13 12:11:16 -08:00
Douglas Yang
ea91a720d5 feature: ci failure monitor improvements (#15055) 2025-12-13 10:47:52 -08:00
Douglas Yang
5c03aa3e9d Adding section for scheduled PR test runs on main (#14309) 2025-12-02 14:52:44 -08:00
Douglas Yang
253be18e52 Fix nonetype error for ci failure monitor (#14319) 2025-12-02 14:24:25 -08:00
alisonshao
aaa40a9b3b Fix pagination bug in CI monitor preventing performance-test-2-gpu data collection (#13781) 2025-11-23 22:02:30 +08:00
alisonshao
dab06b50ab Fix: CI monitor should not exit with error on regressions (#13694) 2025-11-21 10:01:48 -08:00
alisonshao
5a2c70396e Add nightly test CI monitor workflow (#13038) 2025-11-20 12:55:13 -08:00
Douglas Yang
8900f996aa CI Failure Monitor Improvements (#13558) 2025-11-18 22:55:38 -08:00
Douglas Yang
d7984f3125 Adding CI Monitor Improvements (#13462) 2025-11-17 18:42:35 -08:00
Liangsheng Yin
ab63f3c50b [1/N] CI refactor: introduce CI register. (#13345) 2025-11-17 12:21:20 +08:00
Douglas Yang
03a7e6f4db Add job and runner failure monitor workflow for CI (#13104) 2025-11-12 14:04:50 -08:00
Peng Zhang
012bfc4fdc [9/n] decouple quantization impl from vllm dependency - adjust ci (#12753) 2025-11-10 14:55:19 -08:00
alisonshao
fb9582c4e1 Add multi-GPU configurations to nightly-test.yml (#12585) 2025-11-04 16:46:30 -08:00
Kaixi Hou
c0d02cf4d1 [NVIDIA] Add CI workloads for GB200 (#12242) 2025-10-30 14:32:03 -07:00
Xiaoyu Zhang
d8fcbaa38d [CI Monitor] Fix ci_monitor perf analyzer bug (#12281) 2025-10-30 09:47:12 -07:00
Xiaoyu Zhang
d0cff78f54 [CI] Add ci monitor balance workflow (#11962) 2025-10-25 12:14:36 -07:00
Xiaoyu Zhang
984fbeb16b Revert "[CI Monitor] Ci monitor only deal with main branch in default" (#11846) 2025-10-19 22:06:40 -07:00
Xiaoyu Zhang
24ed3f32c0 fix(ci): Fix CI Monitor limit parameter and add CI Analysis to summary (#11832) 2025-10-19 18:08:34 -07:00
Xiaoyu Zhang
8e51049f56 [CI Monitor] Ci monitor only deal with main branch in default (#11538) 2025-10-13 13:50:04 -07:00
Xiaoyu Zhang
6806c4e63e [CI monitor] Improve CI analyzer: fix job failure tracking and add CUDA-focused filtering (#11505) 2025-10-13 13:31:09 +08:00
Xiaoyu Zhang
6f16bf9d9d [Ci Monitor] Auto uploaded performance data to sglang_ci_data repo (#10976) 2025-09-29 16:17:27 +08:00
Xiaoyu Zhang
2387c22b56 Ci monitor support performance (#10965) 2025-09-27 09:11:21 +08:00
Xiaoyu Zhang
c1f39013b7 [ci feature] add ci monitor (#10872) 2025-09-24 23:16:29 -07:00