Binyao Jiang
|
de430b6745
|
[Performance] Replace preprocess_video logic from GLM multimodal processor with transformer impl for speed up (up to 27% faster) and addressing OOM (up to 50x improvements) (#13487)
|
2025-11-24 18:17:13 -08:00 |
|
Zhi Yiliu
|
a95a38078b
|
[Fix] Fix uvloop get_event_loop() is not suitable for 0.22.x (#13612)
Signed-off-by: lzy <tomlzy213@gmail.com>
Co-authored-by: lzy <tomlzy213@gmail.com>
|
2025-11-25 01:20:00 +08:00 |
|
Baizhou Zhang
|
04b52fa8d6
|
[chore]Upgrade flashinfer to 0.5.3 (#13751)
|
2025-11-23 23:38:36 -08:00 |
|
Yuan Luo
|
f56b9b42e6
|
[Bugfix] Add jit kernel files in packaging (#13829)
Co-authored-by: luoyuan.luo <luoyuan.luo@antgroup.com>
Co-authored-by: Xu Yongfei <xuyongfei.xyf@antgroup.com>
|
2025-11-24 12:32:16 +08:00 |
|
Swipe4057
|
d5e0346847
|
xgrammar up version to 0.1.27 (#13650)
|
2025-11-24 10:53:45 +08:00 |
|
sglang-bot
|
bfaf0b8607
|
chore: bump sgl-kernel version to 0.3.17.post2 (#13570)
|
2025-11-19 14:02:57 -08:00 |
|
sglang-bot
|
7b2fb3d47c
|
chore: bump SGLang version to 0.5.5.post3 (#13366)
|
2025-11-16 17:55:38 -08:00 |
|
b8zhong
|
d5fa58c4dd
|
fix nightly docker build (#13386)
|
2025-11-16 11:21:09 -08:00 |
|
sglang-bot
|
1ca205f6da
|
chore: bump sgl-kernel version to 0.3.17.post1 (#13358)
|
2025-11-15 19:11:41 -08:00 |
|
Yineng Zhang
|
f8d3d80f63
|
chore: bump flashinfer v0.5.2 (#13242)
|
2025-11-14 02:47:09 -08:00 |
|
sglang-bot
|
ebaf86d441
|
chore: bump SGLang version to 0.5.5.post2 (#13129)
Include the critical fix https://github.com/sgl-project/sglang/pull/12915.
|
2025-11-12 20:35:20 +08:00 |
|
sglang-bot
|
303cc957e6
|
chore: bump SGLang version to 0.5.5.post1 (#13000)
|
2025-11-10 11:53:43 -08:00 |
|
sglang-bot
|
37c40a87a8
|
chore: bump sgl-kernel version to 0.3.17 (#12966)
|
2025-11-10 21:50:58 +08:00 |
|
R0CKSTAR
|
b07c5e4080
|
Pin uvloop to 0.21.0 (#12279)
Signed-off-by: Xiaodong Ye <yeahdongcn@gmail.com>
|
2025-11-07 03:33:31 +08:00 |
|
sglang-bot
|
0c006b8809
|
chore: bump SGLang version to 0.5.5 (#12739)
|
2025-11-07 00:46:19 +08:00 |
|
gongwei-130
|
97be66c358
|
fix sgl-kernel version (#12723)
|
2025-11-05 19:01:03 -08:00 |
|
Mick
|
7bc1dae095
|
WIP: initial multimodal-gen support (#12484)
Co-authored-by: yhyang201 <yhyang201@gmail.com>
Co-authored-by: yizhang2077 <1109276519@qq.com>
Co-authored-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
Co-authored-by: ispobock <ispobaoke@gmail.com>
Co-authored-by: JiLi <leege233@gmail.com>
Co-authored-by: CHEN Xi <78632976+RubiaCx@users.noreply.github.com>
Co-authored-by: laixin <xielx@shanghaitech.edu.cn>
Co-authored-by: SolitaryThinker <wlsaidhi@gmail.com>
Co-authored-by: jzhang38 <a1286225768@gmail.com>
Co-authored-by: BrianChen1129 <yongqichcd@gmail.com>
Co-authored-by: Kevin Lin <42618777+kevin314@users.noreply.github.com>
Co-authored-by: Edenzzzz <wtan45@wisc.edu>
Co-authored-by: rlsu9 <r3su@ucsd.edu>
Co-authored-by: Jinzhe Pan <48981407+eigensystem@users.noreply.github.com>
Co-authored-by: foreverpiano <pianoqwz@qq.com>
Co-authored-by: RandNMR73 <notomatthew31@gmail.com>
Co-authored-by: PorridgeSwim <yz3883@columbia.edu>
Co-authored-by: Jiali Chen <90408393+gary-chenjl@users.noreply.github.com>
|
2025-11-05 12:28:52 -08:00 |
|
sglang-bot
|
09938e1f82
|
chore: bump SGLang version to 0.5.4.post3 (#12639)
|
2025-11-04 18:32:11 -08:00 |
|
Baizhou Zhang
|
6e29446e45
|
[hotfix] Remove flashinfer-jit-cache from pyproject (#12530)
|
2025-11-02 22:11:05 -08:00 |
|
Yineng Zhang
|
0c3543d7d5
|
chore: upgrade flashinfer 0.5.0 (#12523)
Co-authored-by: Baizhou Zhang <sobereddiezhang@gmail.com>
|
2025-11-02 20:54:12 -08:00 |
|
sglang-bot
|
41c10e67fc
|
chore: bump SGLang version to 0.5.4.post2 (#12439)
|
2025-10-31 17:38:50 -07:00 |
|
Baizhou Zhang
|
587deb15a7
|
[hotfix] Fix pytest not found in CI (#12311)
|
2025-10-29 11:07:36 +08:00 |
|
ishandhanani
|
285a8e6986
|
docker: add CUDA13 support in dockerfile and update GDRCopy/NVSHMEM for blackwell support (#11517)
Co-authored-by: Baizhou Zhang <sobereddiezhang@gmail.com>
|
2025-10-27 22:00:54 -07:00 |
|
Xinyuan Tong
|
729f612dc6
|
Update openai package version to 2.6.1 (#12222)
|
2025-10-28 11:23:40 +08:00 |
|
sglang-bot
|
55d75e11bd
|
chore: bump SGLang version to 0.5.4.post1 (#12169)
|
2025-10-27 09:35:20 +08:00 |
|
Liangsheng Yin
|
8491c794ad
|
[misc] dependencies & environment flag (#12113)
|
2025-10-26 14:52:35 +08:00 |
|
Baizhou Zhang
|
4b0ac1d52a
|
Update sgl-kernel version to 0.3.16.post4 (#12125)
|
2025-10-25 14:33:33 -07:00 |
|
Muqi Li
|
b04cd3d487
|
Add 'gguf' to project dependencies (#12046)
|
2025-10-24 17:16:19 +08:00 |
|
sglang-bot
|
1053e1be17
|
chore: bump SGLang version to 0.5.4 (#12027)
Co-authored-by: sglang-bot <sglang-bot@users.noreply.github.com>
|
2025-10-23 18:01:40 -07:00 |
|
Teng Ma
|
96a5e4dd79
|
[Feature] Support loading weights from ckpt engine worker (#11755)
Signed-off-by: Yang Kaiyong <yangkaiyong.yky@antgroup.com>
Signed-off-by: Cruz Zhao <CruzZhao@linux.alibaba.com>
Signed-off-by: Xuchun Shang <xuchun.shang@gmail.com>
Co-authored-by: Yang Kaiyong <yangkaiyong.yky@antgroup.com>
Co-authored-by: Cruz Zhao <CruzZhao@linux.alibaba.com>
Co-authored-by: Xuchun Shang <xuchun.shang@gmail.com>
Co-authored-by: Shangming Cai <csmthu@gmail.com>
|
2025-10-23 09:23:30 -07:00 |
|
Chang Su
|
6ade6a02d4
|
[grpc] Support gRPC standard health check (#11955)
|
2025-10-22 16:59:09 -07:00 |
|
Zhiyu
|
80b2b3207a
|
Enable native ModelOpt quantization support (3/3) (#10154)
Signed-off-by: Zhiyu Cheng <zhiyuc@nvidia.com>
|
2025-10-21 21:44:29 -07:00 |
|
Yineng Zhang
|
9792b9d7e3
|
chore: upgrade flashinfer 0.4.1 (#11933)
|
2025-10-21 14:46:31 -07:00 |
|
Baizhou Zhang
|
ebff4ee648
|
Update sgl-kernel and remove fast hadamard dependency (#11844)
|
2025-10-21 13:13:54 -07:00 |
|
fzyzcjy
|
a7043c6f0d
|
Bump torch_memory_saver to avoid installing pre-release versions (#11797)
|
2025-10-18 01:20:42 -07:00 |
|
Lianmin Zheng
|
67e34c56d7
|
Fix install instructions and pyproject.tomls (#11781)
|
2025-10-18 01:08:01 -07:00 |
|
sglang-bot
|
85ebeecf06
|
chore: bump SGLang version to 0.5.3.post3 (#11693)
Co-authored-by: sglang-bot <sglang-bot@users.noreply.github.com>
|
2025-10-16 13:14:55 -07:00 |
|
sglang-bot
|
baf277a9bf
|
chore: bump SGLang version to 0.5.3.post2 (#11680)
Co-authored-by: sglang-bot <sglang-bot@users.noreply.github.com>
|
2025-10-15 16:49:14 -07:00 |
|
Sahithi Chigurupati
|
e9e120ac7a
|
fix: upgrade transformers to 4.57.1 (#11628)
Signed-off-by: Sahithi Chigurupati <chigurupati.sahithi@gmail.com>
Co-authored-by: zhyncs <me@zhyncs.com>
|
2025-10-14 18:35:05 -07:00 |
|
Johnny
|
cb8f3d90d3
|
[NVIDIA] update pyproject.toml to support cu130 option (#11521)
|
2025-10-13 13:03:31 -07:00 |
|
ai-jz
|
9cc1e065f1
|
[router][Fix] Include grpc reflection runtime dependency (#11419)
Co-authored-by: Chang Su <chang.s.su@oracle.com>
|
2025-10-13 09:32:42 -07:00 |
|
Lianmin Zheng
|
548a57b1f3
|
Fix port conflicts in CI (#11497)
|
2025-10-12 06:46:36 -07:00 |
|
sglang-bot
|
758b887ad1
|
chore: bump SGLang version to 0.5.3.post1 (#11324)
|
2025-10-09 15:19:59 -07:00 |
|
Yineng Zhang
|
44cb060785
|
chore: upgrade flashinfer 0.4.0 (#11364)
|
2025-10-09 14:17:54 -07:00 |
|
Lifu Huang
|
edefab0c64
|
[2/2] Support MHA prefill with FlashAttention 4. (#10937)
Co-authored-by: Hieu Pham <hyhieu@gmail.com>
|
2025-10-08 00:54:20 -07:00 |
|
DarkSharpness
|
832c84fba9
|
[Chore] Update xgrammar 0.1.24 -> 0.1.25 (#10710)
|
2025-10-07 18:22:28 -07:00 |
|
sglang-bot
|
a4a3d82393
|
chore: bump SGLang version to 0.5.3 (#11263)
|
2025-10-06 20:07:02 +08:00 |
|
sglang-bot
|
0b13cbb7c9
|
chore: bump SGLang version to 0.5.3rc2 (#11259)
Co-authored-by: sglang-bot <sglang-bot@users.noreply.github.com>
|
2025-10-06 01:12:10 -07:00 |
|
Lianmin Zheng
|
f8924ad74b
|
update sgl kernel version to 0.3.14.post1 (#11242)
|
2025-10-05 20:30:40 -07:00 |
|
fzyzcjy
|
2f80bd9f0e
|
Bump torch_memory_saver 0.0.9rc2 (#11252)
|
2025-10-05 20:26:20 -07:00 |
|