Baizhou Zhang
|
b441317aa4
|
Revert "Upgrade CI default CUDA version from 12.9 to 13.0" (#22727)
|
2026-04-13 14:39:24 -07:00 |
|
Asish Kumar
|
39810762d2
|
fix: use describe mode for SGLang version detection (#22600)
Signed-off-by: Asish Kumar <officialasishkumar@gmail.com>
|
2026-04-13 09:45:45 -07:00 |
|
Alison Shao
|
3f4fbc165d
|
Upgrade CI default CUDA version from 12.9 to 13.0 (#21441)
|
2026-04-12 21:48:40 -07:00 |
|
sglang-bot
|
df3275bd6c
|
chore: bump flashinfer version to 0.6.7.post3 (#22382)
Co-authored-by: sglang-bot <sglang-bot@users.noreply.github.com>
|
2026-04-08 14:49:45 -07:00 |
|
Rain Jiang
|
1a8eb890f6
|
Kernels community fa3 (#20796)
|
2026-04-07 12:48:44 -07:00 |
|
Ke Bao
|
be42fbbbd7
|
Support HTTP2 server (#21700)
|
2026-04-08 00:42:52 +08:00 |
|
Kangyan-Zhou
|
93109cc89b
|
[Fix] Fix setuptools-scm version resolution for rc tags (#22165)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Baizhou Zhang <sobereddiezhang@gmail.com>
|
2026-04-05 16:55:32 -07:00 |
|
sglang-bot
|
46bf19cdab
|
chore: bump flashinfer version to 0.6.7.post2 (#22097)
Co-authored-by: sglang-bot <sglang-bot@users.noreply.github.com>
|
2026-04-04 02:16:25 -07:00 |
|
sglang-bot
|
84118acf50
|
chore: bump sglang-kernel version to 0.4.1 (#22009)
Co-authored-by: sglang-bot <sglang-bot@users.noreply.github.com>
|
2026-04-03 13:58:35 -07:00 |
|
Noa Neria
|
8d9145d97e
|
Direct model loading from object storage with Runai Model Streamer (#17948)
Signed-off-by: Noa Neria <noa@run.ai>
|
2026-04-01 18:41:22 -07:00 |
|
Alison Shao
|
1ac74e652e
|
[Misc] Fix comparator e2e tests: add polars dep + fix dp-attention test (#21804)
Co-authored-by: Alison Shao <alison.shao@mac.lan>
|
2026-04-01 15:44:35 -07:00 |
|
sglang-bot
|
ca3ba05a7a
|
chore: bump flashinfer version to 0.6.7 (#21422)
Co-authored-by: sglang-bot <sglang-bot@users.noreply.github.com>
Co-authored-by: Baizhou Zhang <sobereddiezhang@gmail.com>
|
2026-03-31 21:18:16 -07:00 |
|
Anant Sharma
|
f289d173aa
|
[Deps] Bump xgrammar to 0.1.32 (#21032)
|
2026-03-26 01:22:37 -07:00 |
|
Alison Shao
|
5297a3cb46
|
[CI] Rewrite killall_sglang as Python with CI/local dual mode (#21331)
Co-authored-by: Alison Shao <alison.shao@mac.lan>
Co-authored-by: Alison Shao <alison.shao@MacBook-Pro-D2W773R9CD.local>
Co-authored-by: hnyls2002 <lsyincs@gmail.com>
Co-authored-by: Liangsheng Yin <hnyls2002@gmail.com>
|
2026-03-24 23:54:01 -07:00 |
|
Xinyuan Tong
|
d1e95af282
|
Upgrade transformers==5.3.0 (#17784)
Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
Co-authored-by: Kangyan-Zhou <zky314343421@gmail.com>
Co-authored-by: Alison Shao <alisonshao@mac.lan>
Co-authored-by: Mick <mickjagger19@icloud.com>
|
2026-03-18 13:50:43 -07:00 |
|
Rain Jiang
|
cb1e63aba4
|
bump fa4 to official released fa4 pkg (#20303)
|
2026-03-17 17:22:56 -07:00 |
|
DefTruth
|
025691cd9e
|
[diffusion] chore: bump up cache-dit & support quant for diffusers backend (#20361)
|
2026-03-17 12:51:31 +08:00 |
|
Xiaoyu Zhang
|
15097c5c3b
|
Release sglang kernel 0.4.0 (#20440)
Co-authored-by: Baizhou Zhang <sobereddiezhang@gmail.com>
|
2026-03-16 20:34:58 +08:00 |
|
Ke Bao
|
e2be31824f
|
[CI] Add ut coverage tool (#20628)
|
2026-03-15 21:13:45 +08:00 |
|
sglang-bot
|
93afe15b43
|
chore: bump flashinfer version to 0.6.6 (#20480)
Co-authored-by: sglang-bot <sglang-bot@users.noreply.github.com>
|
2026-03-14 13:05:10 -07:00 |
|
Simo Lin
|
654fc02cf1
|
[gRPC] Extract gRPC servicer into standalone package (#20478)
Signed-off-by: Simo Lin <linsimo.mark@gmail.com>
|
2026-03-13 09:13:29 -07:00 |
|
Yuhao Yang
|
a57a44739f
|
[diffusion] deps: upgrade diffusers from 0.36.0 to 0.37.0 (#20318)
|
2026-03-12 19:17:28 +08:00 |
|
Rain Jiang
|
61b228239e
|
bump sgl-fa4 version to 4.0.5 to loose torch deps (#20378)
|
2026-03-11 13:08:09 -07:00 |
|
Xiaoyu Zhang
|
680d9d98e4
|
Fix cutedsl ci error (#20309)
Co-authored-by: Baizhou Zhang <sobereddiezhang@gmail.com>
|
2026-03-11 16:17:35 +08:00 |
|
Xinyuan Tong
|
4a757990a1
|
[VLM] Replace decord with torchcodec for video decoding (#20055)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: BakerBunker <17872844+BakerBunker@users.noreply.github.com>
|
2026-03-09 19:23:49 +08:00 |
|
Xinyu Zhang
|
b3cfad0a80
|
Add Ray actor support for scheduler process management (DP=1) (#17684)
Co-authored-by: Cursor <cursoragent@cursor.com>
|
2026-03-05 13:21:23 -08:00 |
|
Rain Jiang
|
472eef4071
|
fa4 cleanup (#19727)
|
2026-03-05 17:54:25 +08:00 |
|
Kangyan-Zhou
|
198381d9ce
|
Add SSL/TLS support for HTTP and gRPC servers (#18973)
Co-authored-by: guys@spotify.com
|
2026-03-04 19:27:16 -08:00 |
|
Jasonzhang517
|
d939e26585
|
[model gateway][0/N] router EPD support: add encoder grpc server backend support (#16552)
Co-authored-by: Zongyao Chen <ZongYao.Chen@linux.alibaba.com>
Co-authored-by: Zongyao Chen <solar1s@163.com>
|
2026-03-03 19:38:15 +08:00 |
|
Mohammad Miadh Angkad
|
6822941514
|
[FlashInfer] Bump FlashInfer version from 0.6.3 to 0.6.4 (#19005)
|
2026-03-02 16:12:09 -08:00 |
|
Prozac614
|
57c5c343d7
|
[diffusion] model: support Hunyuan3D-2 (#18170)
Co-authored-by: yingluosanqian <yingluosanqian@gmail.com>
Co-authored-by: daiweitao <dwti614707404@163.com>
Co-authored-by: Mick <mickjagger19@icloud.com>
|
2026-03-02 12:28:05 +08:00 |
|
DefTruth
|
78d6674c45
|
[diffusion] feat: support hybrid parallelism for diffusers backend (#19405)
|
2026-02-27 00:06:08 +08:00 |
|
Mick
|
241ee90164
|
[diffusion] chore: tiny fix pyproject.toml (#19256)
|
2026-02-25 11:57:53 +08:00 |
|
GMI Xiao Jin
|
fcfd964d7d
|
[diffusion] model: LTX-2 Support PR3 (#19151)
|
2026-02-24 16:55:28 +08:00 |
|
Mohammad Miadh Angkad
|
1be41e9036
|
[FlashInfer] Bump FlashInfer version from 0.6.2 to 0.6.3 (#18448)
|
2026-02-14 07:43:33 +08:00 |
|
Simo Lin
|
92c5749f41
|
refactor: replace local proto compilation with smg-grpc-proto package (#18682)
|
2026-02-12 05:29:24 -08:00 |
|
shaharmor98
|
c6aa1863be
|
Add Nemotron 3 Nano tests (#18119)
Signed-off-by: Shahar Mor <smor@nvidia.com>
|
2026-02-06 23:55:42 +08:00 |
|
linhaifeng
|
c1d5cc3b24
|
[Bugfix] fix a obvious logic error (#18254)
|
2026-02-04 13:59:58 -08:00 |
|
Mick
|
977096ae03
|
[diffusion] cli: introduce generic attention backend configuration in ServerArgs (#18036)
|
2026-02-02 09:47:40 +08:00 |
|
Baizhou Zhang
|
c7d53fa26a
|
Set torch url index in pyproject.toml (#16802)
|
2026-02-01 13:23:52 +08:00 |
|
Prozac614
|
3fcda00e8c
|
[CI] Fix CI timeouts by upgrading runai_model_streamer (related to #16937) (#17636)
|
2026-01-28 17:09:45 -08:00 |
|
shaharmor98
|
f6f1b6d000
|
Bump FI version (#17700)
Signed-off-by: Shahar Mor <smor@nvidia.com>
Co-authored-by: b8zhong <b8zhong@uwaterloo.ca>
|
2026-01-26 16:50:06 +08:00 |
|
Kangyan-Zhou
|
48f4340b14
|
Exclude some diffusion package for ARM in docker release (#17745)
|
2026-01-25 23:32:39 -08:00 |
|
Kangyan-Zhou
|
8d3e1ac0c8
|
Add an all type in pyproject.tml to include diffusion support (#17697)
|
2026-01-25 12:52:13 -08:00 |
|
Chi McIsaac
|
71482dd171
|
[diffusion] feat: enable passing Cache‑DiT config for diffusers backend (#16662)
Signed-off-by: Chi <chixie.mcisaac@gmail.com>
Signed-off-by: qimcis <chixie.mcisaac@gmail.com>
Co-authored-by: Mick <mickjagger19@icloud.com>
|
2026-01-22 13:13:34 +08:00 |
|
Baizhou Zhang
|
fafa171529
|
[hotfix] Fixes on cuda 13 docker image (#17541)
Co-authored-by: iforgetmyname <iforgetmyname@users.noreply.github>
|
2026-01-22 12:29:55 +08:00 |
|
Lianmin Zheng
|
b74a57a8d9
|
[Auto Sync] Update detokenizer_manager.py, io_struct.py, mu... (20260120) (#17442)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Wangfan Fu <wangfan@x.ai>
|
2026-01-21 14:48:32 -08:00 |
|
DarkSharpness
|
95f59c13fd
|
[Chore] include all jit files in building packages (#17493)
|
2026-01-21 14:48:02 -08:00 |
|
Jacob Gordon
|
cda43ffa4d
|
ci: avoids duplication of codespell config (#17519)
|
2026-01-21 12:02:37 -08:00 |
|
Baizhou Zhang
|
ea879c7739
|
[Minor] Correct sglang version when installing from source (#17315)
|
2026-01-18 19:36:16 -08:00 |
|