Liangsheng Yin
|
35870d55ac
|
Deepseek V4 (#23882)
Co-authored-by: Baizhou Zhang <sobereddiezhang@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: fzyzcjy <ch271828n@outlook.com>
Co-authored-by: ispobock <ispobaoke@gmail.com>
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
Co-authored-by: yueming-yuan <yym022502@gmail.com>
Co-authored-by: DarkSharpness <2040703891@qq.com>
Co-authored-by: Yuhao Yang <47235274+yhyang201@users.noreply.github.com>
Co-authored-by: yhyang201 <yhyang201@users.noreply.github.com>
Co-authored-by: yhyang201 <yhyang201@gmail.com>
Co-authored-by: Qiaolin Yu <90088090+qiaolin-yu@users.noreply.github.com>
Co-authored-by: Ethan (Yusheng) Su <11704492+yushengsu-thu@users.noreply.github.com>
Co-authored-by: Mingyi <27337995+wisclmy0611@users.noreply.github.com>
Co-authored-by: Cheng Wan <54331508+ch-wan@users.noreply.github.com>
Co-authored-by: Yihao Wang <42559837+againstentropy@users.noreply.github.com>
|
2026-05-07 18:32:21 -07:00 |
|
Baizhou Zhang
|
ecb786c8d7
|
[Kernel] Deprecate DeepGemm in sgl kernel and apply custom wheel sgl-deep-gemm (#24268)
|
2026-05-06 18:59:01 -07:00 |
|
Linzhang Li
|
952b3caf18
|
feat: use structural tags to enable strict tool calling and reasoning for more models (#21722)
Signed-off-by: Yuchuan <yuchuan.7streams@gmail.com>
Co-authored-by: Xinyuan Tong <115166877+JustinTong0323@users.noreply.github.com>
Co-authored-by: Ubospica <ubospica@gmail.com>
Co-authored-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
|
2026-05-04 02:30:28 -07:00 |
|
Brayden Zhong
|
88bb5dffe4
|
[Dependency] Upgrade to Torch 2.11.0 (#21247)
Co-authored-by: Kangyan Zhou <zky314343421@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Baizhou Zhang <sobereddiezhang@gmail.com>
Co-authored-by: b8zhong <b8zhong@users.noreply.github.com>
Co-authored-by: Mick <mickjagger19@icloud.com>
|
2026-05-02 12:25:36 -07:00 |
|
Kangyan-Zhou
|
cd27baaffd
|
[ci][cu13] Bump torch_memory_saver to 0.0.9.post1; restore manual tests (#23182)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-05-01 22:50:38 -07:00 |
|
AlonKejzman
|
66ea0aee7f
|
tokenizer: Add fastokens support (#23753)
|
2026-04-28 11:43:10 -07:00 |
|
Xinyuan Tong
|
e5198386bd
|
Upgrade transformers from 5.5.4 to 5.6.0 (#23525)
|
2026-04-26 22:33:54 -07:00 |
|
sglang-bot
|
9003f24e2b
|
chore: bump sglang-kernel version to 0.4.1.post1 (#23733)
Co-authored-by: sglang-bot <sglang-bot@users.noreply.github.com>
Co-authored-by: Kangyan Zhou <zky314343421@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-04-25 23:23:49 -07:00 |
|
sglang-bot
|
f3b88e080a
|
chore: bump flashinfer version to 0.6.8.post1 (#23281)
Co-authored-by: sglang-bot <sglang-bot@users.noreply.github.com>
|
2026-04-23 15:23:03 -07:00 |
|
Alex Nails
|
10e17cc55e
|
[gRPC] Native gRPC server: proto + Rust crate scaffold + server args (#22736)
|
2026-04-20 12:39:35 +08:00 |
|
Baizhou Zhang
|
6ecd6f84db
|
[CI] Add per-job uv venv isolation and upgrade CI version to Cuda 13 (#23119)
Co-authored-by: Kangyan Zhou <zky314343421@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Alison Shao <a.shao@wustl.edu>
Co-authored-by: Mick <mickjagger19@icloud.com>
|
2026-04-19 05:32:36 -07:00 |
|
Xinyuan Tong
|
34fef07a15
|
Upgrade transformers to 5.5.3 and refactor hf_transformers_utils into subpackage (#21569)
|
2026-04-15 20:03:44 -07:00 |
|
Baizhou Zhang
|
b441317aa4
|
Revert "Upgrade CI default CUDA version from 12.9 to 13.0" (#22727)
|
2026-04-13 14:39:24 -07:00 |
|
Asish Kumar
|
39810762d2
|
fix: use describe mode for SGLang version detection (#22600)
Signed-off-by: Asish Kumar <officialasishkumar@gmail.com>
|
2026-04-13 09:45:45 -07:00 |
|
Alison Shao
|
3f4fbc165d
|
Upgrade CI default CUDA version from 12.9 to 13.0 (#21441)
|
2026-04-12 21:48:40 -07:00 |
|
sglang-bot
|
df3275bd6c
|
chore: bump flashinfer version to 0.6.7.post3 (#22382)
Co-authored-by: sglang-bot <sglang-bot@users.noreply.github.com>
|
2026-04-08 14:49:45 -07:00 |
|
Rain Jiang
|
1a8eb890f6
|
Kernels community fa3 (#20796)
|
2026-04-07 12:48:44 -07:00 |
|
Ke Bao
|
be42fbbbd7
|
Support HTTP2 server (#21700)
|
2026-04-08 00:42:52 +08:00 |
|
Kangyan-Zhou
|
93109cc89b
|
[Fix] Fix setuptools-scm version resolution for rc tags (#22165)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Baizhou Zhang <sobereddiezhang@gmail.com>
|
2026-04-05 16:55:32 -07:00 |
|
sglang-bot
|
46bf19cdab
|
chore: bump flashinfer version to 0.6.7.post2 (#22097)
Co-authored-by: sglang-bot <sglang-bot@users.noreply.github.com>
|
2026-04-04 02:16:25 -07:00 |
|
sglang-bot
|
84118acf50
|
chore: bump sglang-kernel version to 0.4.1 (#22009)
Co-authored-by: sglang-bot <sglang-bot@users.noreply.github.com>
|
2026-04-03 13:58:35 -07:00 |
|
Noa Neria
|
8d9145d97e
|
Direct model loading from object storage with Runai Model Streamer (#17948)
Signed-off-by: Noa Neria <noa@run.ai>
|
2026-04-01 18:41:22 -07:00 |
|
Alison Shao
|
1ac74e652e
|
[Misc] Fix comparator e2e tests: add polars dep + fix dp-attention test (#21804)
Co-authored-by: Alison Shao <alison.shao@mac.lan>
|
2026-04-01 15:44:35 -07:00 |
|
sglang-bot
|
ca3ba05a7a
|
chore: bump flashinfer version to 0.6.7 (#21422)
Co-authored-by: sglang-bot <sglang-bot@users.noreply.github.com>
Co-authored-by: Baizhou Zhang <sobereddiezhang@gmail.com>
|
2026-03-31 21:18:16 -07:00 |
|
Anant Sharma
|
f289d173aa
|
[Deps] Bump xgrammar to 0.1.32 (#21032)
|
2026-03-26 01:22:37 -07:00 |
|
Alison Shao
|
5297a3cb46
|
[CI] Rewrite killall_sglang as Python with CI/local dual mode (#21331)
Co-authored-by: Alison Shao <alison.shao@mac.lan>
Co-authored-by: Alison Shao <alison.shao@MacBook-Pro-D2W773R9CD.local>
Co-authored-by: hnyls2002 <lsyincs@gmail.com>
Co-authored-by: Liangsheng Yin <hnyls2002@gmail.com>
|
2026-03-24 23:54:01 -07:00 |
|
Xinyuan Tong
|
d1e95af282
|
Upgrade transformers==5.3.0 (#17784)
Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
Co-authored-by: Kangyan-Zhou <zky314343421@gmail.com>
Co-authored-by: Alison Shao <alisonshao@mac.lan>
Co-authored-by: Mick <mickjagger19@icloud.com>
|
2026-03-18 13:50:43 -07:00 |
|
Rain Jiang
|
cb1e63aba4
|
bump fa4 to official released fa4 pkg (#20303)
|
2026-03-17 17:22:56 -07:00 |
|
DefTruth
|
025691cd9e
|
[diffusion] chore: bump up cache-dit & support quant for diffusers backend (#20361)
|
2026-03-17 12:51:31 +08:00 |
|
Xiaoyu Zhang
|
15097c5c3b
|
Release sglang kernel 0.4.0 (#20440)
Co-authored-by: Baizhou Zhang <sobereddiezhang@gmail.com>
|
2026-03-16 20:34:58 +08:00 |
|
Ke Bao
|
e2be31824f
|
[CI] Add ut coverage tool (#20628)
|
2026-03-15 21:13:45 +08:00 |
|
sglang-bot
|
93afe15b43
|
chore: bump flashinfer version to 0.6.6 (#20480)
Co-authored-by: sglang-bot <sglang-bot@users.noreply.github.com>
|
2026-03-14 13:05:10 -07:00 |
|
Simo Lin
|
654fc02cf1
|
[gRPC] Extract gRPC servicer into standalone package (#20478)
Signed-off-by: Simo Lin <linsimo.mark@gmail.com>
|
2026-03-13 09:13:29 -07:00 |
|
Yuhao Yang
|
a57a44739f
|
[diffusion] deps: upgrade diffusers from 0.36.0 to 0.37.0 (#20318)
|
2026-03-12 19:17:28 +08:00 |
|
Rain Jiang
|
61b228239e
|
bump sgl-fa4 version to 4.0.5 to loose torch deps (#20378)
|
2026-03-11 13:08:09 -07:00 |
|
Xiaoyu Zhang
|
680d9d98e4
|
Fix cutedsl ci error (#20309)
Co-authored-by: Baizhou Zhang <sobereddiezhang@gmail.com>
|
2026-03-11 16:17:35 +08:00 |
|
Xinyuan Tong
|
4a757990a1
|
[VLM] Replace decord with torchcodec for video decoding (#20055)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: BakerBunker <17872844+BakerBunker@users.noreply.github.com>
|
2026-03-09 19:23:49 +08:00 |
|
Xinyu Zhang
|
b3cfad0a80
|
Add Ray actor support for scheduler process management (DP=1) (#17684)
Co-authored-by: Cursor <cursoragent@cursor.com>
|
2026-03-05 13:21:23 -08:00 |
|
Rain Jiang
|
472eef4071
|
fa4 cleanup (#19727)
|
2026-03-05 17:54:25 +08:00 |
|
Kangyan-Zhou
|
198381d9ce
|
Add SSL/TLS support for HTTP and gRPC servers (#18973)
Co-authored-by: guys@spotify.com
|
2026-03-04 19:27:16 -08:00 |
|
Jasonzhang517
|
d939e26585
|
[model gateway][0/N] router EPD support: add encoder grpc server backend support (#16552)
Co-authored-by: Zongyao Chen <ZongYao.Chen@linux.alibaba.com>
Co-authored-by: Zongyao Chen <solar1s@163.com>
|
2026-03-03 19:38:15 +08:00 |
|
Mohammad Miadh Angkad
|
6822941514
|
[FlashInfer] Bump FlashInfer version from 0.6.3 to 0.6.4 (#19005)
|
2026-03-02 16:12:09 -08:00 |
|
Prozac614
|
57c5c343d7
|
[diffusion] model: support Hunyuan3D-2 (#18170)
Co-authored-by: yingluosanqian <yingluosanqian@gmail.com>
Co-authored-by: daiweitao <dwti614707404@163.com>
Co-authored-by: Mick <mickjagger19@icloud.com>
|
2026-03-02 12:28:05 +08:00 |
|
DefTruth
|
78d6674c45
|
[diffusion] feat: support hybrid parallelism for diffusers backend (#19405)
|
2026-02-27 00:06:08 +08:00 |
|
Mick
|
241ee90164
|
[diffusion] chore: tiny fix pyproject.toml (#19256)
|
2026-02-25 11:57:53 +08:00 |
|
GMI Xiao Jin
|
fcfd964d7d
|
[diffusion] model: LTX-2 Support PR3 (#19151)
|
2026-02-24 16:55:28 +08:00 |
|
Mohammad Miadh Angkad
|
1be41e9036
|
[FlashInfer] Bump FlashInfer version from 0.6.2 to 0.6.3 (#18448)
|
2026-02-14 07:43:33 +08:00 |
|
Simo Lin
|
92c5749f41
|
refactor: replace local proto compilation with smg-grpc-proto package (#18682)
|
2026-02-12 05:29:24 -08:00 |
|
shaharmor98
|
c6aa1863be
|
Add Nemotron 3 Nano tests (#18119)
Signed-off-by: Shahar Mor <smor@nvidia.com>
|
2026-02-06 23:55:42 +08:00 |
|
linhaifeng
|
c1d5cc3b24
|
[Bugfix] fix a obvious logic error (#18254)
|
2026-02-04 13:59:58 -08:00 |
|