190 Commits

Author SHA1 Message Date
mispa-ms
d8d9d32b29 [docker] Fix stray backslash dropping sgl-model-gateway COPY (#23097)
Signed-off-by: misunp <misunp@nvidia.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 13:44:05 -07:00
ishandhanani
6f6843c582 [Docker] Move Rust toolchain install to torch_deps stage (#23278) 2026-04-20 13:13:10 -07:00
Alex Nails
332ec5e5ee [release] install rust toolchain in main dockerfile (#23014) 2026-04-20 09:50:08 -07:00
Alexis MacAskill
e15401ee0e Add runai-model-streamer into Python packages installed in Dockerfile and fix NotADirectoryError Docker regression (#22537) 2026-04-14 16:25:41 -07:00
Mohammad Miadh Angkad
90ef8ce54d [Docker] Remove flashinfer cache copy (#22653) 2026-04-13 09:48:22 -07:00
Mohammad Miadh Angkad
701a0e0c25 [CI/Docker] Clean up redundant flashinfer cubin downloads (#22491) 2026-04-12 12:30:41 -07:00
ishandhanani
aa103eab8d [Docker] Optimize Dockerfile for BuildKit layer caching (#22160) 2026-04-09 15:34:57 -07:00
Kangyan-Zhou
9d905efa2c [Docker] Fix Trivy CVEs, cubin download 403s, and kernels command order (#22322)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 12:26:22 -07:00
sglang-bot
df3275bd6c chore: bump flashinfer version to 0.6.7.post3 (#22382)
Co-authored-by: sglang-bot <sglang-bot@users.noreply.github.com>
2026-04-08 14:49:45 -07:00
Rain Jiang
1a8eb890f6 Kernels community fa3 (#20796) 2026-04-07 12:48:44 -07:00
sglang-bot
46bf19cdab chore: bump flashinfer version to 0.6.7.post2 (#22097)
Co-authored-by: sglang-bot <sglang-bot@users.noreply.github.com>
2026-04-04 02:16:25 -07:00
sglang-bot
84118acf50 chore: bump sglang-kernel version to 0.4.1 (#22009)
Co-authored-by: sglang-bot <sglang-bot@users.noreply.github.com>
2026-04-03 13:58:35 -07:00
sglang-bot
ca3ba05a7a chore: bump flashinfer version to 0.6.7 (#21422)
Co-authored-by: sglang-bot <sglang-bot@users.noreply.github.com>
Co-authored-by: Baizhou Zhang <sobereddiezhang@gmail.com>
2026-03-31 21:18:16 -07:00
Kangyan-Zhou
ea6b22fb85 Fix CVEs in Docker image: pillow, linux-libc-dev, and broken sgl-model-gateway build (#21789)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 20:07:15 -07:00
Rain Jiang
cb1e63aba4 bump fa4 to official released fa4 pkg (#20303) 2026-03-17 17:22:56 -07:00
Xiaoyu Zhang
15097c5c3b Release sglang kernel 0.4.0 (#20440)
Co-authored-by: Baizhou Zhang <sobereddiezhang@gmail.com>
2026-03-16 20:34:58 +08:00
sglang-bot
93afe15b43 chore: bump flashinfer version to 0.6.6 (#20480)
Co-authored-by: sglang-bot <sglang-bot@users.noreply.github.com>
2026-03-14 13:05:10 -07:00
Rain Jiang
ab4b863546 fix ci by removing nvidia-cutlass-dsl-libs-base and force reinstall n… (#20380) 2026-03-11 13:37:33 -07:00
Mohammad Miadh Angkad
6822941514 [FlashInfer] Bump FlashInfer version from 0.6.3 to 0.6.4 (#19005) 2026-03-02 16:12:09 -08:00
Mohammad Miadh Angkad
1be41e9036 [FlashInfer] Bump FlashInfer version from 0.6.2 to 0.6.3 (#18448) 2026-02-14 07:43:33 +08:00
Shangming Cai
52401bec1d chore: bump mooncake version to 0.3.9 (#18316)
Signed-off-by: Shangming Cai <csmthu@gmail.com>
2026-02-07 17:30:01 +08:00
ishandhanani
8f8c1724ae docker: add patch to increase GPU deepep timeout (#18298) 2026-02-05 18:26:15 +08:00
ishandhanani
0a6925639b ci: improve docker for cu13 builds (#18194) 2026-02-03 11:09:38 -08:00
Mohammad Miadh Angkad
25508d11c0 [Docker] Remove hardcoded America/Los_Angeles timezone, default to UTC (#18121) 2026-02-02 23:22:15 -08:00
ZhenshengWu
71babdef51 Fix CUDA 12 dependency when importing Mooncake in official CUDA 13.x image (#17540)
Co-authored-by: wuzhensheng01 <wuzhensheng01@baidu.com>
2026-01-31 23:41:21 -08:00
shaharmor98
f6f1b6d000 Bump FI version (#17700)
Signed-off-by: Shahar Mor <smor@nvidia.com>
Co-authored-by: b8zhong <b8zhong@uwaterloo.ca>
2026-01-26 16:50:06 +08:00
Baizhou Zhang
0dfe46dafb [Docker] Install cudnn==9.16 for cuda 13 image to avoid check error (#17668) 2026-01-24 11:27:03 +08:00
ishandhanani
1e309030e3 update urllib3 and gpgv Dockerfile (#17439) 2026-01-20 14:47:20 -08:00
b8zhong
7dc3cbe7ca [Docker] Fix CUDA 13 installing wrong nvidia-nccl-cu13 due to nixl-cu13 not breaking system package (#17370) 2026-01-20 09:52:30 +08:00
Baizhou Zhang
a04675892e Update flashinfer to 0.6.1 (#15551) 2026-01-17 00:48:30 +08:00
sglang-bot
000ad42225 chore: bump sgl-kernel version to 0.3.21 (#17075)
Co-authored-by: sglang-bot <sglang-bot@users.noreply.github.com>
2026-01-15 12:41:17 +08:00
Baizhou Zhang
9fd2358cc2 Update Cutedsl version and pin cuda-python version (#16838) 2026-01-10 17:08:43 +08:00
Shangming Cai
4b14f622e1 [CI] Add PD Disaggregation aarch64 test (#16572)
Signed-off-by: Shangming Cai <csmthu@gmail.com>
2026-01-10 14:44:54 +08:00
Shangming Cai
0c4e155a3c chore: bump mooncake version to 0.3.8.post1 (#16792)
Signed-off-by: Shangming Cai <csmthu@gmail.com>
2026-01-09 18:42:27 +08:00
gongwei-130
65bed8382b Add google-cloud-storage into Dockerfile (#15343) 2026-01-07 16:54:45 -08:00
Baizhou Zhang
6ffe1fc02f [Fix]Pin mooncake version to 0.3.7.post2 in grace blackwell (#16502) 2026-01-06 14:11:40 +08:00
Liangsheng Yin
229938805f Make personal configs optional in SGLang's official docker image. (#16365) 2026-01-04 12:27:13 +08:00
Kangyan-Zhou
9c4eb46099 Add a new branch cut GH workflow, and adopt setuptools-scm for version control (#15985) 2025-12-29 13:51:21 -08:00
Shangming Cai
41addd2e08 chore: bump mooncake version to 0.3.8 (#15886)
Signed-off-by: Shangming Cai <csmthu@gmail.com>
2025-12-27 23:34:24 +08:00
sglang-bot
34013d9d5a chore: bump sgl-kernel version to 0.3.20 (#15590)
Co-authored-by: sglang-bot <sglang-bot@users.noreply.github.com>
2025-12-22 12:32:34 -08:00
Yineng Zhang
0861dca81f Revert "[misc] Upgrade cutedsl to 4.3.1 (#14857)" (#15293) 2025-12-16 16:31:32 -08:00
Baizhou Zhang
0261c4aff7 [misc] Upgrade cutedsl to 4.3.1 (#14857) 2025-12-16 12:11:56 -08:00
sglang-bot
5c8bd8b51b chore: bump SGLang version to 0.5.6.post2 (#14858)
Co-authored-by: sglang-bot <sglang-bot@users.noreply.github.com>
2025-12-11 12:29:52 -08:00
Binyao Jiang
ef3f8c97e1 Add ffmpeg into sglang docker - required by transformers multimodal V… (#14679) 2025-12-08 18:00:23 -08:00
sglang-bot
9a327bdfcf chore: bump SGLang version to 0.5.6.post1 (#14651) 2025-12-09 00:35:28 +08:00
sglang-bot
2de98010b5 chore: bump sgl-kernel version to 0.3.19 (#14649) 2025-12-08 22:53:08 +08:00
Xiaoyu Zhang
e5135b73f4 Add CUDA kernel size analysis tool for sgl-kernel optimization (#14544) 2025-12-07 15:29:41 +08:00
sglang-bot
d2b42477c7 chore: bump sgl-kernel version to 0.3.18.post3 (#14518) 2025-12-06 13:15:16 -08:00
Simo Lin
49dfa1d891 [model-gateway] change sgl-router to sgl-model-gateway (#14312) 2025-12-05 12:04:48 -08:00
ishandhanani
498ea41ca6 dockerfile: add runtime stage + ubuntu 24.04 (#13861) 2025-12-05 00:28:36 -08:00