sglang

Mirror of https://github.com/kvcache-ai/sglang.git, synced 2026-06-30 11:48:01 +00:00

Author	SHA1	Message	Date
Jiaxin(Jackson) Deng	c4db64c16b	Add Lychee Doc Links Check to Local and CI (#19742 ) Co-authored-by: Zijie Xia <zijie_xia@icloud.com> Co-authored-by: Zijie Xia <zijiexia@users.noreply.github.com> Co-authored-by: zijiexia <37504505+zijiexia@users.noreply.github.com>	2026-03-24 13:48:26 -07:00
Xiaoyu Zhang	be7a0311a0	[Diffusion] Fix and validate diffusion skills benchmarking/profiling workflow (#20528 )	2026-03-13 21:11:37 +08:00
Xinyuan Tong	4a757990a1	[VLM] Replace decord with torchcodec for video decoding (#20055 ) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: BakerBunker <17872844+BakerBunker@users.noreply.github.com>	2026-03-09 19:23:49 +08:00
Julian Huang	a55f658835	[Misc] Normalize `--host` parameter to use plain hostname without scheme (#19309 ) Co-authored-by: 墨楼 <huangzhilin.hzl@antgroup.com> Co-authored-by: Liangsheng Yin <lsyincs@gmail.com> Co-authored-by: Liangsheng Yin <hnyls2002@gmail.com>	2026-02-25 00:37:24 -08:00
SoluMilken	07a24f1a38	update pre-commit config (#18860 )	2026-02-16 00:18:31 +08:00
shuwenn	3299c4f9c1	[CI] feat: add early exit to wait_for_server when process dies (#18602 )	2026-02-13 16:46:09 -08:00
shuwenn	de94d793ad	feat: support qwen3(-VL) rerank scoring&chat template (#16403 ) Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com> Co-authored-by: Xinyuan Tong <xinyuantong.cs@gmail.com>	2026-01-15 00:45:46 +08:00
Liangsheng Yin	a435f55d18	Tiny print launch command with `shlex` (#16010 )	2025-12-29 11:26:46 +08:00
ゆり	186a56f6e2	fix(monitoring): update Grafana dashboard metrics prefix from sglang: to sglang_ (#15758 )	2025-12-24 10:43:51 -08:00
Xinyuan Tong	47cdb65a45	fix: update argument extraction in R1 chat template (#15547 ) Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>	2025-12-21 09:18:49 +08:00
Yineng Zhang	ef1ab2302a	[Auto Sync] Update tool_chat_template_deepseekv31.jinja (20251210) (#14837 ) Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Jue Wang <zjuwangjue@gmail.com>	2025-12-10 10:56:24 -08:00
Lianmin Zheng	bc3d2a85af	[Minor] update docs (#14212 )	2025-12-01 02:33:58 -08:00
Baizhou Zhang	808b6dfdea	[Minor] Fix lint (#13938 )	2025-11-25 10:57:23 -08:00
Simo Lin	4852aa054c	[misc] add llama3.1 chat template (#13935 )	2025-11-25 09:31:54 -08:00
Liangsheng Yin	196b940aed	[3/N] CI refactor: move some manually triggered tests. (#13448 )	2025-11-19 23:06:53 +08:00
Kangyan-Zhou	ea89a3a0c5	Fixes validation errors for Wan-AI models which store model weights in subdirectories (#13461 )	2025-11-17 15:33:02 -08:00
Sirut Buasai	a63f433b6f	extend sagemaker.Dockerfile serve script to allow all sglang serve flags (#13173 )	2025-11-17 13:14:17 -08:00
Mattheliu	c3bb348dad	[Docs] fix dead links in multiple documentation pages (#12764 )	2025-11-06 10:49:32 -08:00
Kangyan-Zhou	7e28c67d19	Fix DeepSeek chat templates to handle tool call arguments type checking (#11700 ) (#12123 )	2025-10-30 16:39:25 +08:00
Teng Ma	96a5e4dd79	[Feature] Support loading weights from ckpt engine worker (#11755 ) Signed-off-by: Yang Kaiyong <yangkaiyong.yky@antgroup.com> Signed-off-by: Cruz Zhao <CruzZhao@linux.alibaba.com> Signed-off-by: Xuchun Shang <xuchun.shang@gmail.com> Co-authored-by: Yang Kaiyong <yangkaiyong.yky@antgroup.com> Co-authored-by: Cruz Zhao <CruzZhao@linux.alibaba.com> Co-authored-by: Xuchun Shang <xuchun.shang@gmail.com> Co-authored-by: Shangming Cai <csmthu@gmail.com>	2025-10-23 09:23:30 -07:00
Zhiyu	80b2b3207a	Enable native ModelOpt quantization support (3/3) (#10154 ) Signed-off-by: Zhiyu Cheng <zhiyuc@nvidia.com>	2025-10-21 21:44:29 -07:00
b8zhong	d0a64c7e2c	vlm: enforce pybase64 for image and str encode/decode (#10700 )	2025-10-21 19:05:32 +08:00
Neelabh Sinha	852c0578fd	[FEATURE] Add OpenAI-Compatible LoRA Adapter Selection (#11570 )	2025-10-21 15:44:33 +08:00
Kindyaa	c44e985dc2	feat(example/fastapi): support --startup-timeout using Qwen3-Next-80B-A3B-Instruct as example (#11710 ) Co-authored-by: chenan01 <chenan01@cheche-MacBook-Pro.local> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-10-19 02:50:34 +08:00
Xu Wenqing	85c1f79377	Add DeepSeek-V3.2 Tool Call Template (#11063 ) Signed-off-by: 许文卿 <xwq391974@alibaba-inc.com>	2025-10-04 18:53:49 -07:00
fzyzcjy	fdc4e1e570	Tiny move files to utils folder (#11166 )	2025-10-03 22:40:06 +08:00
Chang Su	c1815a99b7	model support: Sarashina2VisionForCausalLM (#10632 )	2025-09-18 17:30:38 -07:00
Feng Su	4c21b09074	[Feature] Sglang Tracing: Fine-Grained Tracking for Request Latency - Part 1 (#9962 ) Signed-off-by: Feng Su <sufeng@linux.alibaba.com> Signed-off-by: Huaixin Chang <changhuaixin@linux.alibaba.com> Signed-off-by: Peng Wang <rocking@linux.alibaba.com>	2025-09-15 02:08:02 +08:00
Yiming	2cd94dd07e	tool-call(dsv3): Fixed a parse problem when there are multiple function definitions in tool_calls (#10209 )	2025-09-09 15:47:28 +08:00
Grace Ho	73179b764a	nsys profile output kernel classifier (#9314 ) Signed-off-by: Grace Ho <grho@nvidia.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Yineng Zhang <me@zhyncs.com>	2025-09-03 16:22:33 -07:00
南京小汤包	cc9a31c662	Update tool_chat_template_deepseekv31.jinja (#9895 )	2025-09-02 20:29:21 -07:00
Lianmin Zheng	60e37f8028	Move parsers under a single folder (#9912 )	2025-09-02 18:25:04 -07:00
wangyu	9f81d741a2	fix: fix MLA for ShardedModelLoader/RemoteModelLoader (#6287 ) Signed-off-by: wangyu <wangyu.steph@bytedance.com>	2025-08-28 16:10:09 -07:00
wangyu	a38c149758	feat(draft_model): support draft_model for RemoteModelLoader (#6407 ) Signed-off-by: wangyu <wangyu.steph@bytedance.com>	2025-08-28 16:09:52 -07:00
Xu Wenqing	b9683be653	Support DeepSeek-V3.1 tool call (#9446 ) Signed-off-by: 许文卿 <xwq391974@alibaba-inc.com> Co-authored-by: Xinyuan Tong <xinyuantong.cs@gmail.com>	2025-08-26 20:22:19 -07:00
Chang Su	c9dd70fbde	tool-call(dsv3): Improve deepseek-v3 chat template and tool_choice = `required` (#9525 )	2025-08-23 01:46:56 -07:00
PGFLMG	b7cd743038	[Feat] QWen-1M context support[2/2]: Update block sparse attention backend (#5949 )	2025-08-06 23:49:36 -07:00
yi wang	5963e50503	[bugfix] Fix 2 minor bugs in the hicache storage layer (#8404 )	2025-07-31 05:47:14 +00:00
Jinn	ab74f8f09d	Remove batches api in docs & example (#7400 )	2025-06-20 19:46:31 -07:00
Ata Fatahi	1ab6be1b26	Purge VerlEngine (#7326 ) Signed-off-by: Ata Fatahi <immrata@gmail.com>	2025-06-19 23:47:21 -07:00
kyle-pena-kuzco	b56de8f943	Open AI API hidden states (#6716 )	2025-06-10 14:37:29 -07:00
Chao Yang	4fac524b14	update llama4 chat template and pythonic parser (#6679 ) Co-authored-by: Chang Su <chang.s.su@oracle.com>	2025-05-30 17:01:22 -07:00
Xu Wenqing	62cac2c43a	Update DeepSeek-R1-0528 function call chat template (#6765 ) Signed-off-by: 许文卿 <xwq391974@alibaba-inc.com>	2025-05-30 00:42:57 -07:00
Xu Wenqing	f4d4f93928	Add DeepSeek-R1-0528 function call chat template (#6725 ) Signed-off-by: 许文卿 <xwq391974@alibaba-inc.com>	2025-05-29 00:05:07 -07:00
Lifu Huang	3cf1473a09	Use monotonic clock for interval measurement (#6211 ) Signed-off-by: Lifu Huang <lifu.hlf@gmail.com>	2025-05-17 16:49:18 -07:00
Kiv Chen	5380cd7ea3	model(vlm): pixtral (#5084 )	2025-05-13 00:16:10 -07:00
Lianmin Zheng	e8e18dcdcc	Revert "fix some typos" (#6244 )	2025-05-12 12:53:26 -07:00
applesaucethebun	d738ab52f8	fix some typos (#6209 ) Co-authored-by: Brayden Zhong <b8zhong@uwaterloo.ca>	2025-05-13 01:42:38 +08:00
mlmz	69276f619a	doc: fix the erroneous documents and example codes about Alibaba-NLP/gme-Qwen2-VL-2B-Instruct (#6199 )	2025-05-11 08:22:11 -07:00
applesaucethebun	2ce8793519	Add typo checker in pre-commit (#6179 ) Co-authored-by: Brayden Zhong <b8zhong@uwaterloo.ca>	2025-05-11 12:55:00 +08:00

169 Commits