169 Commits

Author SHA1 Message Date
Jiaxin(Jackson) Deng
c4db64c16b Add Lychee Doc Links Check to Local and CI (#19742)
Co-authored-by: Zijie Xia <zijie_xia@icloud.com>
Co-authored-by: Zijie Xia <zijiexia@users.noreply.github.com>
Co-authored-by: zijiexia <37504505+zijiexia@users.noreply.github.com>
2026-03-24 13:48:26 -07:00
Xiaoyu Zhang
be7a0311a0 [Diffusion] Fix and validate diffusion skills benchmarking/profiling workflow (#20528) 2026-03-13 21:11:37 +08:00
Xinyuan Tong
4a757990a1 [VLM] Replace decord with torchcodec for video decoding (#20055)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: BakerBunker <17872844+BakerBunker@users.noreply.github.com>
2026-03-09 19:23:49 +08:00
Julian Huang
a55f658835 [Misc] Normalize --host parameter to use plain hostname without scheme (#19309)
Co-authored-by: 墨楼 <huangzhilin.hzl@antgroup.com>
Co-authored-by: Liangsheng Yin <lsyincs@gmail.com>
Co-authored-by: Liangsheng Yin <hnyls2002@gmail.com>
2026-02-25 00:37:24 -08:00
SoluMilken
07a24f1a38 update pre-commit config (#18860) 2026-02-16 00:18:31 +08:00
shuwenn
3299c4f9c1 [CI] feat: add early exit to wait_for_server when process dies (#18602) 2026-02-13 16:46:09 -08:00
shuwenn
de94d793ad feat: support qwen3(-VL) rerank scoring&chat template (#16403)
Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
Co-authored-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
2026-01-15 00:45:46 +08:00
Liangsheng Yin
a435f55d18 Tiny print launch command with shlex (#16010) 2025-12-29 11:26:46 +08:00
ゆり
186a56f6e2 fix(monitoring): update Grafana dashboard metrics prefix from sglang: to sglang_ (#15758) 2025-12-24 10:43:51 -08:00
Xinyuan Tong
47cdb65a45 fix: update argument extraction in R1 chat template (#15547)
Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
2025-12-21 09:18:49 +08:00
Yineng Zhang
ef1ab2302a [Auto Sync] Update tool_chat_template_deepseekv31.jinja (20251210) (#14837)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jue Wang <zjuwangjue@gmail.com>
2025-12-10 10:56:24 -08:00
Lianmin Zheng
bc3d2a85af [Minor] update docs (#14212) 2025-12-01 02:33:58 -08:00
Baizhou Zhang
808b6dfdea [Minor] Fix lint (#13938) 2025-11-25 10:57:23 -08:00
Simo Lin
4852aa054c [misc] add llama3.1 chat template (#13935) 2025-11-25 09:31:54 -08:00
Liangsheng Yin
196b940aed [3/N] CI refactor: move some manually triggered tests. (#13448) 2025-11-19 23:06:53 +08:00
Kangyan-Zhou
ea89a3a0c5 Fixes validation errors for Wan-AI models which store model weights in subdirectories (#13461) 2025-11-17 15:33:02 -08:00
Sirut Buasai
a63f433b6f extend sagemaker.Dockerfile serve script to allow all sglang serve flags (#13173) 2025-11-17 13:14:17 -08:00
Mattheliu
c3bb348dad [Docs] fix dead links in multiple documentation pages (#12764) 2025-11-06 10:49:32 -08:00
Kangyan-Zhou
7e28c67d19 Fix DeepSeek chat templates to handle tool call arguments type checking (#11700) (#12123) 2025-10-30 16:39:25 +08:00
Teng Ma
96a5e4dd79 [Feature] Support loading weights from ckpt engine worker (#11755)
Signed-off-by: Yang Kaiyong <yangkaiyong.yky@antgroup.com>
Signed-off-by: Cruz Zhao <CruzZhao@linux.alibaba.com>
Signed-off-by: Xuchun Shang <xuchun.shang@gmail.com>
Co-authored-by: Yang Kaiyong <yangkaiyong.yky@antgroup.com>
Co-authored-by: Cruz Zhao <CruzZhao@linux.alibaba.com>
Co-authored-by: Xuchun Shang <xuchun.shang@gmail.com>
Co-authored-by: Shangming Cai <csmthu@gmail.com>
2025-10-23 09:23:30 -07:00
Zhiyu
80b2b3207a Enable native ModelOpt quantization support (3/3) (#10154)
Signed-off-by: Zhiyu Cheng <zhiyuc@nvidia.com>
2025-10-21 21:44:29 -07:00
b8zhong
d0a64c7e2c vlm: enforce pybase64 for image and str encode/decode (#10700) 2025-10-21 19:05:32 +08:00
Neelabh Sinha
852c0578fd [FEATURE] Add OpenAI-Compatible LoRA Adapter Selection (#11570) 2025-10-21 15:44:33 +08:00
Kindyaa
c44e985dc2 feat(example/fastapi): support --startup-timeout using Qwen3-Next-80B-A3B-Instruct as example (#11710)
Co-authored-by: chenan01 <chenan01@cheche-MacBook-Pro.local>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-10-19 02:50:34 +08:00
Xu Wenqing
85c1f79377 Add DeepSeek-V3.2 Tool Call Template (#11063)
Signed-off-by: 许文卿 <xwq391974@alibaba-inc.com>
2025-10-04 18:53:49 -07:00
fzyzcjy
fdc4e1e570 Tiny move files to utils folder (#11166) 2025-10-03 22:40:06 +08:00
Chang Su
c1815a99b7 model support: Sarashina2VisionForCausalLM (#10632) 2025-09-18 17:30:38 -07:00
Feng Su
4c21b09074 [Feature] Sglang Tracing: Fine-Grained Tracking for Request Latency - Part 1 (#9962)
Signed-off-by: Feng Su <sufeng@linux.alibaba.com>
Signed-off-by: Huaixin Chang <changhuaixin@linux.alibaba.com>
Signed-off-by: Peng Wang <rocking@linux.alibaba.com>
2025-09-15 02:08:02 +08:00
Yiming
2cd94dd07e tool-call(dsv3): Fixed a parse problem when there are multiple function definitions in tool_calls (#10209) 2025-09-09 15:47:28 +08:00
Grace Ho
73179b764a nsys profile output kernel classifier (#9314)
Signed-off-by: Grace Ho <grho@nvidia.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Yineng Zhang <me@zhyncs.com>
2025-09-03 16:22:33 -07:00
南京小汤包
cc9a31c662 Update tool_chat_template_deepseekv31.jinja (#9895) 2025-09-02 20:29:21 -07:00
Lianmin Zheng
60e37f8028 Move parsers under a single folder (#9912) 2025-09-02 18:25:04 -07:00
wangyu
9f81d741a2 fix: fix MLA for ShardedModelLoader/RemoteModelLoader (#6287)
Signed-off-by: wangyu <wangyu.steph@bytedance.com>
2025-08-28 16:10:09 -07:00
wangyu
a38c149758 feat(draft_model): support draft_model for RemoteModelLoader (#6407)
Signed-off-by: wangyu <wangyu.steph@bytedance.com>
2025-08-28 16:09:52 -07:00
Xu Wenqing
b9683be653 Support DeepSeek-V3.1 tool call (#9446)
Signed-off-by: 许文卿 <xwq391974@alibaba-inc.com>
Co-authored-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
2025-08-26 20:22:19 -07:00
Chang Su
c9dd70fbde tool-call(dsv3): Improve deepseek-v3 chat template and tool_choice = required (#9525) 2025-08-23 01:46:56 -07:00
PGFLMG
b7cd743038 [Feat] QWen-1M context support[2/2]: Update block sparse attention backend (#5949) 2025-08-06 23:49:36 -07:00
yi wang
5963e50503 [bugfix] Fix 2 minor bugs in the hicache storage layer (#8404) 2025-07-31 05:47:14 +00:00
Jinn
ab74f8f09d Remove batches api in docs & example (#7400) 2025-06-20 19:46:31 -07:00
Ata Fatahi
1ab6be1b26 Purge VerlEngine (#7326)
Signed-off-by: Ata Fatahi <immrata@gmail.com>
2025-06-19 23:47:21 -07:00
kyle-pena-kuzco
b56de8f943 Open AI API hidden states (#6716) 2025-06-10 14:37:29 -07:00
Chao Yang
4fac524b14 update llama4 chat template and pythonic parser (#6679)
Co-authored-by: Chang Su <chang.s.su@oracle.com>
2025-05-30 17:01:22 -07:00
Xu Wenqing
62cac2c43a Update DeepSeek-R1-0528 function call chat template (#6765)
Signed-off-by: 许文卿 <xwq391974@alibaba-inc.com>
2025-05-30 00:42:57 -07:00
Xu Wenqing
f4d4f93928 Add DeepSeek-R1-0528 function call chat template (#6725)
Signed-off-by: 许文卿 <xwq391974@alibaba-inc.com>
2025-05-29 00:05:07 -07:00
Lifu Huang
3cf1473a09 Use monotonic clock for interval measurement (#6211)
Signed-off-by: Lifu Huang <lifu.hlf@gmail.com>
2025-05-17 16:49:18 -07:00
Kiv Chen
5380cd7ea3 model(vlm): pixtral (#5084) 2025-05-13 00:16:10 -07:00
Lianmin Zheng
e8e18dcdcc Revert "fix some typos" (#6244) 2025-05-12 12:53:26 -07:00
applesaucethebun
d738ab52f8 fix some typos (#6209)
Co-authored-by: Brayden Zhong <b8zhong@uwaterloo.ca>
2025-05-13 01:42:38 +08:00
mlmz
69276f619a doc: fix the erroneous documents and example codes about Alibaba-NLP/gme-Qwen2-VL-2B-Instruct (#6199) 2025-05-11 08:22:11 -07:00
applesaucethebun
2ce8793519 Add typo checker in pre-commit (#6179)
Co-authored-by: Brayden Zhong <b8zhong@uwaterloo.ca>
2025-05-11 12:55:00 +08:00