sglang

mirror of https://github.com/kvcache-ai/sglang.git synced 2026-06-30 11:48:01 +00:00

Author	SHA1	Message	Date
Xinyuan Tong	6d03861476	support Hy3 preview (#23533 ) Co-authored-by: pengmeng <pengmeng@tencent.com> Co-authored-by: Qiaolin-Yu <liin1211@outlook.com> Co-authored-by: chengvjiang <chengvjiang@tencent.com> Co-authored-by: russellfeng <russellfeng@tencent.com>	2026-04-24 12:03:24 -07:00
Mohammad Miadh Angkad	bcc0c65aa8	[DSA] Hopper FP8 FlashMLA KV padding (#22372 )	2026-04-12 02:19:17 -07:00
Zhangheng	3d3a32c0b9	[HiSparse]: Add readme docs for HiSparse Feature (#22238 )	2026-04-07 00:39:24 -07:00
Mohammad Miadh Angkad	b311db2e49	[Doc] Fix and improve DeepSeek V3.2/GLM-5 documentation (#22179 )	2026-04-05 23:26:42 -07:00
Baizhou Zhang	106baedbfb	[Doc] Update GLM-5 instructions in sglang documentation (#21716 )	2026-04-05 03:13:07 -07:00
David Cheung	ed427e1299	Migrate all callers from /get_server_info to /server_info (#21463 )	2026-04-01 21:17:50 -07:00
Артем Савкин	27071e0a43	[NPU] Update quantization&CI documentation (#21100 ) Co-authored-by: Tamir Baydasov <41994229+TamirBaydasov@users.noreply.github.com>	2026-03-28 21:42:21 +03:00
SevenJ	2e65c27b29	Api add flush cache timeout (#21413 ) Signed-off-by: root <wenjun7j@gmail.com>	2026-03-26 14:44:37 -07:00
Jiaxin(Jackson) Deng	c4db64c16b	Add Lychee Doc Links Check to Local and CI (#19742 ) Co-authored-by: Zijie Xia <zijie_xia@icloud.com> Co-authored-by: Zijie Xia <zijiexia@users.noreply.github.com> Co-authored-by: zijiexia <37504505+zijiexia@users.noreply.github.com>	2026-03-24 13:48:26 -07:00
Mook	2720ea2667	[Typo] Fix H200 doc links pointing to H20 section in deepseek_v3.md (#20383 )	2026-03-11 13:35:20 -07:00
shuwenn	5a11ae19c1	[CI] fix: notebook ci often OOM (#20199 )	2026-03-09 22:32:41 -07:00
shuwenn	7bd3dd9270	fix: image URL in notebook to use raw.githubusercontent.com (#20100 )	2026-03-07 13:28:20 -08:00
Baidu-AIAK	6851613b93	[Bugfix] For cp: Fixed hang problem in prefix cache and kvcache support fp8 in-seq-split mode (#19656 ) Co-authored-by: vincent <vincent@vincentdeMacBook-Pro.local>	2026-03-03 19:19:46 -08:00
Michael	6b8e62f94f	[AMD] [Qwen 3.5 Day 0] Add Qwen 3.5 nightly accuracy tests (#19479 )	2026-03-02 19:42:42 -08:00
Michael	403195d59d	[AMD] [MiniMax-M2.5 Day 0] Add MiniMax-M2.5 nightly accuracy test (#19443 )	2026-02-27 02:39:33 -08:00
赵晨阳	e239f8aa85	Remove error dllm and diffusion doc in basic_useage (#19105 )	2026-02-20 20:28:00 -08:00
Rain Jiang	0ffd0a3995	Nsa trtllm mla sparse fp8 support with Deepseek v3.2 NVFP4 (#18389 )	2026-02-16 09:29:54 +08:00
SoluMilken	07a24f1a38	update pre-commit config (#18860 )	2026-02-16 00:18:31 +08:00
shuwenn	3299c4f9c1	[CI] feat: add early exit to wait_for_server when process dies (#18602 )	2026-02-13 16:46:09 -08:00
dongjiyingdjy	8b4c364960	refactor context parallel state (#17213 ) Co-authored-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>	2026-02-13 23:18:17 +08:00
qianyue76	f06ab17a73	[diffusion] docs: consolidate diffusion documentation into docs (#18095 ) Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: JiaxinD <djx2048@gmail.com>	2026-02-11 16:55:07 -08:00
Baizhou Zhang	947927bdb5	[V3.2] Change default CP token split method to `--round-robin-split` (#18613 )	2026-02-11 20:14:35 +08:00
Rishit Shivam	c850a8a41a	[Docs] Add Falcon H1, Hunyuan-Large, Qwen3-Omni support and update Diffusion usage (#17888 ) Co-authored-by: Rishitshivam <164783543+Rishitshivam@users.noreply.github.com> Co-authored-by: Ratish P <114130421+Ratish1@users.noreply.github.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Adarsh Shirawalmath <114558126+adarshxs@users.noreply.github.com> Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com>	2026-02-06 13:17:51 -08:00
rinbaro	de6a03260f	[docs] fix misspellings & typos (#18276 )	2026-02-05 03:35:29 +00:00
sglang-bot	c971852ffc	docs: move deepseek_ocr to popular model usage and add cookbook reference (#18120 )	2026-02-02 05:45:41 -08:00
baonudesifeizhai	84ab611af8	model: support DeepSeek-OCR-2 (#17897 )	2026-01-30 09:49:51 +08:00
Baizhou Zhang	1d942e4eef	[DeepSeek] Update tests and document for DeepSeek V3.2 NVFP4 checkpoint (#17657 )	2026-01-27 22:10:57 +08:00
Hubert Lu	df42f4d386	[AMD] Update dsv3.2 AMD GPU docs and unify ROCm TileLang build (#17783 ) Co-authored-by: wufann <715544327@qq.com>	2026-01-26 21:10:32 -08:00
Mansoor	bdaa3de075	Add return routed experts to the completions and chat/completions endpoints (#17434 )	2026-01-23 12:12:36 -08:00
Yi Zhong	458fe5a337	[docs] Show user the fastAPI docs available (#17510 ) Signed-off-by: vincentzed <207368749+vincentzed@users.noreply.github.com>	2026-01-21 14:26:25 +00:00
b8zhong	3d72944fb8	[Doc] Add tip on how to use Spec V2 (#15455 )	2026-01-16 05:30:18 +08:00
Guy Stone	cd23c2f0a3	[Docs] add v1/score api to native api documentation (#16568 )	2026-01-15 12:29:40 -05:00
ybyang	2122fea3c4	Update deepseekV32 Cp doc (#17054 )	2026-01-14 11:19:26 +08:00
ybyang	aab640c99f	add doc for dsv32 cp+pp (#16916 )	2026-01-12 19:14:07 +08:00
hlu1	aeb480c11f	Add top-p to run_eval.py (#16844 )	2026-01-10 17:10:37 +08:00
Ke Bao	3aa11ca722	Remove hybrid_kvcache_ratio in server args (#16399 )	2026-01-06 13:13:13 +08:00
Baizhou Zhang	f07e76b229	Multiple refactors of DeepSeek V32 and context parallel (#16305 )	2026-01-03 02:21:22 +08:00
Yongfei Xu	0d244116d2	[DeepSeek v3.2] opt Context Parallelism: support fused moe, multi batch and fp8 kvcache (#13959 )	2026-01-02 23:49:14 +08:00
Roger Young	5c64a20da7	Update MiniMax-M2 ToolCall and add MiniMax-M2.1 in Docs (#15538 ) Co-authored-by: xuebi <xuebi@minimaxi.com> Co-authored-by: Xinyuan Tong <115166877+JustinTong0323@users.noreply.github.com>	2025-12-23 15:11:52 -08:00
mlmz	1f1f05a85e	vlm: refactor engine vlm params and support processor output as input (#14091 ) Co-authored-by: Mick <mickjagger19@icloud.com> Co-authored-by: zhaochenyang20 <zhaochenyang20@gmail.com> Co-authored-by: Xinyuan Tong <115166877+JustinTong0323@users.noreply.github.com> Co-authored-by: BenYao21 <cyao22@asu.edu> Co-authored-by: minleminzui <minleminzui@gmail.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: 赵晨阳 <zhaochen20@outlook.com>	2025-12-20 18:31:24 +08:00
Yuxuan Zhang	b82c7a0ae7	[GLM-4.7] GLM-4.7 Tool Parser and Doc Update (#15333 )	2025-12-19 20:30:44 -08:00
Yi Zhang	9d4f066fb9	Add doc for qwen3 next (#15337 )	2025-12-17 17:53:07 -08:00
b8zhong	d20699a33c	[Deepseek V3.2] Support Overlap Spec + NSA (#15307 ) Co-authored-by: Brayden Zhong <b8zhong@users.noreply.github.com>	2025-12-17 13:35:39 -08:00
Ashton Chew	2bdbaef18e	[DeepSeekV3.2] Add pure TP+MTP test (#15088 ) Co-authored-by: Baizhou Zhang <sobereddiezhang@gmail.com>	2025-12-16 21:48:12 -08:00
Alison Shao	31d48d7f6f	Add Ollama-compatible API endpoints + Smart Router (#14376 ) Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Xinyuan Tong <115166877+JustinTong0323@users.noreply.github.com>	2025-12-16 20:43:38 -08:00
almaslof	d0f756aec9	[docs] Fix kernel name (#14887 )	2025-12-11 10:48:16 -05:00
Binyao Jiang	cf0478d602	[Glm46v] Bug fix for accuracy drop and unable to launch server (#14585 ) Co-authored-by: yhyang201 <yhyang201@gmail.com> Co-authored-by: zRzRzRzRzRzRzR <2448370773@qq.com> Co-authored-by: Minglei Zhu <mingleizhu1122@gmail.com>	2025-12-07 23:45:02 -08:00
George Armstrong	91c9c14c28	DOC update nemo-skills in docs (#14555 ) Signed-off-by: George Armstrong <georgea@nvidia.com> Co-authored-by: Baizhou Zhang <sobereddiezhang@gmail.com>	2025-12-06 19:03:08 -08:00
Lee Nau	5f6f550af8	Update DeepSeek V3 docs to use B200 (#14447 )	2025-12-06 17:22:11 -08:00
Baizhou Zhang	42fcf5438f	Revert "tiny remove deprecated endpoint call" (#14533 )	2025-12-05 23:48:54 -08:00

101 Commits