100 Commits

Author SHA1 Message Date
b8zhong
ec7b2c16d9 tiny remove deprecated endpoint call (#13607) 2025-12-05 09:54:49 -08:00
Baizhou Zhang
7e78825d5a [Tiny]Small fixes in deepseek v32 doc (#14372)
Co-authored-by: Xinyuan Tong <115166877+JustinTong0323@users.noreply.github.com>
2025-12-03 11:35:40 -08:00
VDV1985
dc1635023f [NPU][Doc] updated installation guide for Ascend NPU (#13585)
Co-authored-by: Howeee <15935120809@163.com>
Co-authored-by: ronnie_zheng <zl19940307@163.com>
2025-12-03 16:58:49 +03:00
b8zhong
65c8568c4a sync attention, deepseek doc (#14335)
Co-authored-by: Brayden Zhong <b8zhong@users.noreply.github.com>
2025-12-02 21:19:40 -08:00
Baizhou Zhang
4bcc5879af [Doc] Fix DeepSeek V32 Doc (#14336) 2025-12-02 21:06:55 -08:00
Baizhou Zhang
922054079c [Doc] Update DeepSeek-V3.2 document (#14321) 2025-12-02 18:19:39 -08:00
b8zhong
e6420100ee sync attention doc and ep doc to doctree (#14257)
Co-authored-by: Brayden Zhong <b8zhong@users.noreply.github.com>
2025-12-01 21:15:22 -08:00
Lianmin Zheng
bc3d2a85af [Minor] update docs (#14212) 2025-12-01 02:33:58 -08:00
YAMY
decb48965d [DeepSeekV3.2] Enable pure TP & Partial DP Attention (#13646) 2025-11-30 15:59:23 -08:00
Jan Bernlöhr
fcccaf9001 Add Llama4 attention backend auto-selection (#13421)
Signed-off-by: jbernloehr <jbernloehr@nvidia.com>
2025-11-25 11:54:21 -08:00
Peiqi Yin
a90435c059 Fix typo in docs (#13709) 2025-11-23 10:49:49 +08:00
hlu1
7291c72e57 [Deepseek V3.2] Change indexer weights_proj to fp32 (#13459) 2025-11-20 12:24:10 -08:00
Liangsheng Yin
196b940aed [3/N] CI refactor: move some manually triggered tests. (#13448) 2025-11-19 23:06:53 +08:00
Binyao Jiang
90c18a16cb [GLM4.6v] Required changes for bumping up to transformer 5.x (#13229) 2025-11-18 10:58:00 +08:00
lixiaolx
d368c7451a (1/n)support context parallel with deepseekv3.2-DSA (#12065) 2025-11-16 20:12:25 -08:00
Lianmin Zheng
7e626d12b7 Update docs (#13391)
Co-authored-by: sglang-bot <sglangbot@gmail.com>
2025-11-16 19:36:33 -08:00
Zesen SenmiaoORZ
fd3be107bb [Doc] Add item for repetition punishment (#13260) 2025-11-14 11:15:56 -08:00
Mick
ddfcb7c8ab minor: fix notebook bug with new model_info fields added for warmup (#13005) 2025-11-11 00:46:12 +08:00
Adarsh Shirawalmath
583bb1804e [Docs] Add docs for Qwen3-VL image and video support (#12554)
Co-authored-by: Ubuntu <azureuser@athena.w2cgneqjjboeneyk2w5mje3jyf.bx.internal.cloudapp.net>
2025-11-10 12:16:04 +08:00
YAMY
190002c613 [Docs][DeepseekV3.2] Update deepseekv3.2 docs for mha short seq prefill (#12868) 2025-11-08 00:11:02 -08:00
Mattheliu
c3bb348dad [Docs] fix dead links in multiple documentation pages (#12764) 2025-11-06 10:49:32 -08:00
Baizhou Zhang
621dfb8886 Import flash_mla from sgl-kernel (#12135) 2025-10-29 23:54:21 -07:00
hlu1
0ee831dee0 Update deepseek_v32.md (#12296) 2025-10-28 14:52:38 -07:00
ybyang
9c6e25d2a6 doc for logit_bias (#12188) 2025-10-28 10:32:12 -07:00
Baizhou Zhang
97828878d8 [Doc] Small update of DeepSeek v3.2 document (#12138) 2025-10-25 20:34:05 -07:00
Baizhou Zhang
bcecf27e7c [Doc] Fix format for deepseek v3.2 document (#12130) 2025-10-25 15:07:50 -07:00
Baizhou Zhang
729b242934 [Doc] Add documentation for DeepSeek V3.2 (#11877)
Co-authored-by: Xinyuan Tong <115166877+JustinTong0323@users.noreply.github.com>
Co-authored-by: ybyang <ybyang7@iflytek.com>
2025-10-24 19:06:22 -07:00
ybyang
dbb16bedd5 Support Thinking Budget (via custom_logit_processor for OpenAI API) [Fix #6572] (#11416)
Signed-off-by: ybyang <ybyang7@iflytek.com>
Co-authored-by: YorkSu <york_su@qq.com>
2025-10-21 16:27:56 +08:00
Neelabh Sinha
852c0578fd [FEATURE] Add OpenAI-Compatible LoRA Adapter Selection (#11570) 2025-10-21 15:44:33 +08:00
Liangsheng Yin
acc2327bbd Move deep gemm related arguments to sglang.srt.environ (#11547) 2025-10-14 00:34:35 +08:00
Glen Liu
47c606d3dc [Feature] support regex strings as a stopping condition (#10635) 2025-10-12 10:53:15 +08:00
Adarsh Shirawalmath
7c3f07dbcb [Feature] Add /tokenize and /detokenize OpenAI compatible endpoints (#9545) 2025-10-08 12:38:48 +08:00
Xinyuan Tong
c4d77774e1 update sampling_params documentation with defaults (#11315) 2025-10-07 18:36:26 -07:00
Philip Kiely - Baseten
7f028b07c4 Fix formatting in long code blocks (#10528) 2025-09-16 12:02:05 -07:00
Vincent Zhong
0b14159fc4 Add reasoning examples for GPT-OSS in Markdown examples (#9626) 2025-09-15 11:27:40 +08:00
Yi Zhang
760b788a58 add qwen3-next doc (#10327) 2025-09-11 14:29:11 -07:00
geray
ba066ca02f Update link for EAGLE speculative decoding (#10191) 2025-09-09 11:09:50 +08:00
eigen
b0fcbb74d0 [DOC]: some minor updates (#10134) 2025-09-07 14:58:15 -07:00
Huapeng Zhou
75ee00112d [Doc] Fix SGLang tool parser doc (#9886) 2025-09-04 21:52:53 +08:00
Chayenne
9b08d975a0 [docs] Refactor, remove compiled results and add gpt-oss (#9613)
Co-authored-by: zhaochenyang20 <zhaochenyang20@gmail.com>
2025-08-25 15:27:06 -07:00
Xinyuan Tong
ca4b86c564 fix: Update OpenAI client base URL in documentation (#9576) 2025-08-24 23:06:57 -07:00
Xiaotong Jiang
80425e59bb [doc] deepseekv31 support (#9544) 2025-08-23 16:54:58 -07:00
Xinyuan Tong
fedfe91c1a [Docs] Add doc and quick demo for gpt-oss responses api & buildin tools (#9497)
Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
2025-08-21 23:51:52 -07:00
Xinyuan Tong
13ec8d427e [Docs]Update reasoning parser doc & fix outdated link (#9492)
Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
2025-08-21 22:08:28 -07:00
Xinyuan Tong
0b3a5b1151 Update reasoning parser doc (#9468)
Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
2025-08-21 17:25:30 -07:00
Xinyuan Tong
e8449ab515 Add deepseek v3.1 thinking parser support and update docs (#9464)
Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
2025-08-21 15:09:40 -07:00
Chengxing Xie
c1c7dc4534 feat: Add model version tracking with API endpoints and response metadata (#8795) 2025-08-14 12:13:46 -07:00
li chaoran
2ecbd8b8bf [feat] add ascend readme and docker release (#8700)
Signed-off-by: mywaaagh_admin <pkwarcraft@gmail.com>
Signed-off-by: lichaoran <pkwarcraft@gmail.com>
Co-authored-by: Even Zhou <even.y.zhou@outlook.com>
Co-authored-by: ronnie_zheng <zl19940307@163.com>
2025-08-12 13:25:42 -07:00
Lianmin Zheng
2e8e7e353b Improve docs and developer guide (#9044) 2025-08-10 21:05:18 -07:00
Lianmin Zheng
2449a0afe2 Refactor the docs (#9031) 2025-08-10 19:49:45 -07:00