Commit Graph

413 Commits

Author SHA1 Message Date
HAI
0215d47007 [AMD] ROCm7.2: Add /sgl-workspace/aiter to PYTHONPATH (#18972) 2026-02-18 02:21:39 -08:00
Duyi-Wang
5ddc84e33e [AMD] MORI-EP inter kernel type switch (#18437)
Co-authored-by: HAI <hixiao@gmail.com>
2026-02-15 20:59:39 -08:00
chenxu214
4e162d4b1b change npu.dockerfile (#18835) 2026-02-15 20:43:15 +08:00
Mohammad Miadh Angkad
1be41e9036 [FlashInfer] Bump FlashInfer version from 0.6.2 to 0.6.3 (#18448) 2026-02-14 07:43:33 +08:00
HAI
f4417475b8 Build ROCm7.2 Image with latest AITER v0.1.10.post3 (#18741) 2026-02-12 14:30:13 -08:00
Thomas Wang
e20e6c28b9 [AMD] Fix accuracy issue when running TP4 dsv3 model with mtp (#18607)
Co-authored-by: YC Tseng <yctseng@amd.com>
Co-authored-by: kkHuang-amd <wunhuang@amd.com>
2026-02-12 01:13:16 -08:00
YC Tseng
20554a0a4f [AMD] rocm 7.2 image release, PR test, Nightly Test (#17799)
Co-authored-by: Alan Kao <akao@amd.com>
Co-authored-by: bingxche <Bingxu.Chen@amd.com>
Co-authored-by: Michael <13900043+michaelzhang-ai@users.noreply.github.com>
2026-02-11 21:29:25 -08:00
YC Tseng
d94d0af573 [AMD] Turn on aiter-prebuild (#18425)
Co-authored-by: bingxche <Bingxu.Chen@amd.com>
2026-02-10 00:49:35 -08:00
YC Tseng
28717e3d28 [AMD] CI - Fix AMD daily image release and install dependency (#18452)
Co-authored-by: Bingxu Chen <bingxche@amd.com>
2026-02-08 22:20:09 -08:00
Bingxu Chen
3f3c201243 [AMD] Update aiter to v0.1.10.post2 (#18423)
Co-authored-by: kkHuang-amd <wunhuang@amd.com>
Co-authored-by: YC Tseng <yctseng@amd.com>
2026-02-08 22:08:24 -08:00
Shangming Cai
52401bec1d chore: bump mooncake version to 0.3.9 (#18316)
Signed-off-by: Shangming Cai <csmthu@gmail.com>
2026-02-07 17:30:01 +08:00
ishandhanani
8f8c1724ae docker: add patch to increase GPU deepep timeout (#18298) 2026-02-05 18:26:15 +08:00
ishandhanani
0a6925639b ci: improve docker for cu13 builds (#18194) 2026-02-03 11:09:38 -08:00
Mohammad Miadh Angkad
25508d11c0 [Docker] Remove hardcoded America/Los_Angeles timezone, default to UTC (#18121) 2026-02-02 23:22:15 -08:00
YC Tseng
ea04bc1dd6 [AMD] Fix aiter version in rocm image (#18076) 2026-02-01 19:00:38 -08:00
ZhenshengWu
71babdef51 Fix CUDA 12 dependency when importing Mooncake in official CUDA 13.x image (#17540)
Co-authored-by: wuzhensheng01 <wuzhensheng01@baidu.com>
2026-01-31 23:41:21 -08:00
Zaili Wang
97593c9f41 [CPU] toml file update (#17861) 2026-01-31 13:16:06 -08:00
kk
ec76c390c9 Add ROCm + Mori docker build instructions in rocm.Dockerfile (#18018)
Co-authored-by: wunhuang <wunhuang@amd.com>
2026-01-30 19:12:07 -08:00
22dimensions
ee3058c6e8 [NPU] fix sgl-kernel-npu package url error in npu.Dockerfile (#18017)
Signed-off-by: 22dimensions <waitingwind@foxmail.com>
2026-01-31 11:06:58 +08:00
YC Tseng
b7b1ba329e [AMD] fix pip sglang version (#17950) 2026-01-29 17:19:57 -08:00
22dimensions
cfa09d311c [NPU] add support for npu x86_64 image release (#14127)
Signed-off-by: 22dimensions <waitingwind@foxmail.com>
2026-01-29 21:23:56 +08:00
RoyWang
30adf78f82 [diffusion]: align sglang diffusion AMD pyproject_other.toml diffusion dependency with pyproject.toml (#16225)
Co-authored-by: roywang <roywang@amd.com>
2026-01-29 01:50:57 -08:00
Niko Ma
cbf90d70ff [PD] Support KV transfer with MORI-IO (#14626)
Co-authored-by: cwortman-amd <cwortman@amd.com>
2026-01-28 23:22:41 -08:00
Hubert Lu
93423ff780 [AMD] Deprecate ROCm 6.3 artifacts and standardize gfx942 on ROCm 7 (#17785) 2026-01-27 15:58:49 -08:00
monkeyLoveding
d578b41bad [NPU] Adapt cann 8.5: use sfa and lightning indexer op from cann and CI update (#17615)
Co-authored-by: Kelon <kelonlu@163.com>
2026-01-27 19:03:53 +08:00
Hubert Lu
df42f4d386 [AMD] Update dsv3.2 AMD GPU docs and unify ROCm TileLang build (#17783)
Co-authored-by: wufann <715544327@qq.com>
2026-01-26 21:10:32 -08:00
Makcum888e
bba6e38ff8 [NPU] Split pyproject npu from pyproject other (#17641) 2026-01-26 09:45:44 -08:00
shaharmor98
f6f1b6d000 Bump FI version (#17700)
Signed-off-by: Shahar Mor <smor@nvidia.com>
Co-authored-by: b8zhong <b8zhong@uwaterloo.ca>
2026-01-26 16:50:06 +08:00
Baizhou Zhang
0dfe46dafb [Docker] Install cudnn==9.16 for cuda 13 image to avoid check error (#17668) 2026-01-24 11:27:03 +08:00
wufann
a921029b97 [AMD] Support ds3.2 on gfx942 platform (#17504)
Co-authored-by: Hubert Lu <55214931+hubertlu-tw@users.noreply.github.com>
2026-01-22 13:57:08 -08:00
Zaili Wang
672eb37534 [CPU][Fix CI] Solidate torch version for sgl-kernel-cpu and fix device orientation error (#17460) 2026-01-22 14:04:50 +08:00
ishandhanani
1e309030e3 update urllib3 and gpgv Dockerfile (#17439) 2026-01-20 14:47:20 -08:00
b8zhong
7dc3cbe7ca [Docker] Fix CUDA 13 installing wrong nvidia-nccl-cu13 due to nixl-cu13 not breaking system package (#17370) 2026-01-20 09:52:30 +08:00
Baizhou Zhang
a04675892e Update flashinfer to 0.6.1 (#15551) 2026-01-17 00:48:30 +08:00
R0CKSTAR
a1dd3d48ac [diffusion] hardware: support diffusion (single GPU, 3/N) (#17105)
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2026-01-16 17:01:09 +08:00
sglang-bot
000ad42225 chore: bump sgl-kernel version to 0.3.21 (#17075)
Co-authored-by: sglang-bot <sglang-bot@users.noreply.github.com>
2026-01-15 12:41:17 +08:00
Hubert Lu
8716589826 [AMD][Diffusion] support timestep embedding kernel for AMD GPUs (#16766) 2026-01-12 22:17:07 -08:00
James
ae0baefb94 [NPU] upgrade npu mf_apater plugin (#15853) 2026-01-13 09:02:10 +08:00
Baizhou Zhang
9fd2358cc2 Update Cutedsl version and pin cuda-python version (#16838) 2026-01-10 17:08:43 +08:00
Shangming Cai
4b14f622e1 [CI] Add PD Disaggregation aarch64 test (#16572)
Signed-off-by: Shangming Cai <csmthu@gmail.com>
2026-01-10 14:44:54 +08:00
Shangming Cai
0c4e155a3c chore: bump mooncake version to 0.3.8.post1 (#16792)
Signed-off-by: Shangming Cai <csmthu@gmail.com>
2026-01-09 18:42:27 +08:00
gongwei-130
65bed8382b Add google-cloud-storage into Dockerfile (#15343) 2026-01-07 16:54:45 -08:00
Thomas Wang
820e97d6c9 Upgrade aiter version (#16619) 2026-01-06 22:20:42 -08:00
Baizhou Zhang
6ffe1fc02f [Fix]Pin mooncake version to 0.3.7.post2 in grace blackwell (#16502) 2026-01-06 14:11:40 +08:00
Alison Shao
f8411ded6e ci: migrate 1-GPU model tests to test/registered/models/ (#16414) 2026-01-04 18:08:01 -08:00
Liangsheng Yin
229938805f Make personal configs optional in SGLang's official docker image. (#16365) 2026-01-04 12:27:13 +08:00
Kangyan-Zhou
9c4eb46099 Add a new branch cut GH workflow, and adopt setuptools-scm for version control (#15985) 2025-12-29 13:51:21 -08:00
Shangming Cai
41addd2e08 chore: bump mooncake version to 0.3.8 (#15886)
Signed-off-by: Shangming Cai <csmthu@gmail.com>
2025-12-27 23:34:24 +08:00
sglang-bot
34013d9d5a chore: bump sgl-kernel version to 0.3.20 (#15590)
Co-authored-by: sglang-bot <sglang-bot@users.noreply.github.com>
2025-12-22 12:32:34 -08:00
Yuzhen Zhou
4bf06635fc [diffusion] multi-platform: support diffusion on amd and fix encoder loading on MI325 (#13760)
Co-authored-by: Sabre Shao <sabre.shao@amd.com>
Co-authored-by: Yusheng (Ethan) Su <yushengsu.thu@gmail.com>
Co-authored-by: Hubert Lu <Hubert.Lu@amd.com>
Co-authored-by: xsun <sunxiao04@gmail.com>
2025-12-19 15:38:46 +08:00