Xinyuan Tong
|
6d03861476
|
support Hy3 preview (#23533)
Co-authored-by: pengmeng <pengmeng@tencent.com>
Co-authored-by: Qiaolin-Yu <liin1211@outlook.com>
Co-authored-by: chengvjiang <chengvjiang@tencent.com>
Co-authored-by: russellfeng <russellfeng@tencent.com>
|
2026-04-24 12:03:24 -07:00 |
|
Mohammad Miadh Angkad
|
bcc0c65aa8
|
[DSA] Hopper FP8 FlashMLA KV padding (#22372)
|
2026-04-12 02:19:17 -07:00 |
|
Zhangheng
|
3d3a32c0b9
|
[HiSparse]: Add readme docs for HiSparse Feature (#22238)
|
2026-04-07 00:39:24 -07:00 |
|
Mohammad Miadh Angkad
|
b311db2e49
|
[Doc] Fix and improve DeepSeek V3.2/GLM-5 documentation (#22179)
|
2026-04-05 23:26:42 -07:00 |
|
Baizhou Zhang
|
106baedbfb
|
[Doc] Update GLM-5 instructions in sglang documentation (#21716)
|
2026-04-05 03:13:07 -07:00 |
|
David Cheung
|
ed427e1299
|
Migrate all callers from /get_server_info to /server_info (#21463)
|
2026-04-01 21:17:50 -07:00 |
|
Артем Савкин
|
27071e0a43
|
[NPU] Update quantization&CI documentation (#21100)
Co-authored-by: Tamir Baydasov <41994229+TamirBaydasov@users.noreply.github.com>
|
2026-03-28 21:42:21 +03:00 |
|
SevenJ
|
2e65c27b29
|
Api add flush cache timeout (#21413)
Signed-off-by: root <wenjun7j@gmail.com>
|
2026-03-26 14:44:37 -07:00 |
|
Jiaxin(Jackson) Deng
|
c4db64c16b
|
Add Lychee Doc Links Check to Local and CI (#19742)
Co-authored-by: Zijie Xia <zijie_xia@icloud.com>
Co-authored-by: Zijie Xia <zijiexia@users.noreply.github.com>
Co-authored-by: zijiexia <37504505+zijiexia@users.noreply.github.com>
|
2026-03-24 13:48:26 -07:00 |
|
Mook
|
2720ea2667
|
[Typo] Fix H200 doc links pointing to H20 section in deepseek_v3.md (#20383)
|
2026-03-11 13:35:20 -07:00 |
|
shuwenn
|
5a11ae19c1
|
[CI] fix: notebook ci often OOM (#20199)
|
2026-03-09 22:32:41 -07:00 |
|
shuwenn
|
7bd3dd9270
|
fix: image URL in notebook to use raw.githubusercontent.com (#20100)
|
2026-03-07 13:28:20 -08:00 |
|
Baidu-AIAK
|
6851613b93
|
[Bugfix] For cp: Fixed hang problem in prefix cache and kvcache support fp8 in-seq-split mode (#19656)
Co-authored-by: vincent <vincent@vincentdeMacBook-Pro.local>
|
2026-03-03 19:19:46 -08:00 |
|
Michael
|
6b8e62f94f
|
[AMD] [Qwen 3.5 Day 0] Add Qwen 3.5 nightly accuracy tests (#19479)
|
2026-03-02 19:42:42 -08:00 |
|
Michael
|
403195d59d
|
[AMD] [MiniMax-M2.5 Day 0] Add MiniMax-M2.5 nightly accuracy test (#19443)
|
2026-02-27 02:39:33 -08:00 |
|
赵晨阳
|
e239f8aa85
|
Remove error dllm and diffusion doc in basic_useage (#19105)
|
2026-02-20 20:28:00 -08:00 |
|
Rain Jiang
|
0ffd0a3995
|
Nsa trtllm mla sparse fp8 support with Deepseek v3.2 NVFP4 (#18389)
|
2026-02-16 09:29:54 +08:00 |
|
SoluMilken
|
07a24f1a38
|
update pre-commit config (#18860)
|
2026-02-16 00:18:31 +08:00 |
|
shuwenn
|
3299c4f9c1
|
[CI] feat: add early exit to wait_for_server when process dies (#18602)
|
2026-02-13 16:46:09 -08:00 |
|
dongjiyingdjy
|
8b4c364960
|
refactor context parallel state (#17213)
Co-authored-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
|
2026-02-13 23:18:17 +08:00 |
|
qianyue76
|
f06ab17a73
|
[diffusion] docs: consolidate diffusion documentation into docs (#18095)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: JiaxinD <djx2048@gmail.com>
|
2026-02-11 16:55:07 -08:00 |
|
Baizhou Zhang
|
947927bdb5
|
[V3.2] Change default CP token split method to --round-robin-split (#18613)
|
2026-02-11 20:14:35 +08:00 |
|
Rishit Shivam
|
c850a8a41a
|
[Docs] Add Falcon H1, Hunyuan-Large, Qwen3-Omni support and update Diffusion usage (#17888)
Co-authored-by: Rishitshivam <164783543+Rishitshivam@users.noreply.github.com>
Co-authored-by: Ratish P <114130421+Ratish1@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Adarsh Shirawalmath <114558126+adarshxs@users.noreply.github.com>
Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com>
|
2026-02-06 13:17:51 -08:00 |
|
rinbaro
|
de6a03260f
|
[docs] fix misspellings & typos (#18276)
|
2026-02-05 03:35:29 +00:00 |
|
sglang-bot
|
c971852ffc
|
docs: move deepseek_ocr to popular model usage and add cookbook reference (#18120)
|
2026-02-02 05:45:41 -08:00 |
|
baonudesifeizhai
|
84ab611af8
|
model: support DeepSeek-OCR-2 (#17897)
|
2026-01-30 09:49:51 +08:00 |
|
Baizhou Zhang
|
1d942e4eef
|
[DeepSeek] Update tests and document for DeepSeek V3.2 NVFP4 checkpoint (#17657)
|
2026-01-27 22:10:57 +08:00 |
|
Hubert Lu
|
df42f4d386
|
[AMD] Update dsv3.2 AMD GPU docs and unify ROCm TileLang build (#17783)
Co-authored-by: wufann <715544327@qq.com>
|
2026-01-26 21:10:32 -08:00 |
|
Mansoor
|
bdaa3de075
|
Add return routed experts to the completions and chat/completions endpoints (#17434)
|
2026-01-23 12:12:36 -08:00 |
|
Yi Zhong
|
458fe5a337
|
[docs] Show user the fastAPI docs available (#17510)
Signed-off-by: vincentzed <207368749+vincentzed@users.noreply.github.com>
|
2026-01-21 14:26:25 +00:00 |
|
b8zhong
|
3d72944fb8
|
[Doc] Add tip on how to use Spec V2 (#15455)
|
2026-01-16 05:30:18 +08:00 |
|
Guy Stone
|
cd23c2f0a3
|
[Docs] add v1/score api to native api documentation (#16568)
|
2026-01-15 12:29:40 -05:00 |
|
ybyang
|
2122fea3c4
|
Update deepseekV32 Cp doc (#17054)
|
2026-01-14 11:19:26 +08:00 |
|
ybyang
|
aab640c99f
|
add doc for dsv32 cp+pp (#16916)
|
2026-01-12 19:14:07 +08:00 |
|
hlu1
|
aeb480c11f
|
Add top-p to run_eval.py (#16844)
|
2026-01-10 17:10:37 +08:00 |
|
Ke Bao
|
3aa11ca722
|
Remove hybrid_kvcache_ratio in server args (#16399)
|
2026-01-06 13:13:13 +08:00 |
|
Baizhou Zhang
|
f07e76b229
|
Multiple refactors of DeepSeek V32 and context parallel (#16305)
|
2026-01-03 02:21:22 +08:00 |
|
Yongfei Xu
|
0d244116d2
|
[DeepSeek v3.2] opt Context Parallelism: support fused moe, multi batch and fp8 kvcache (#13959)
|
2026-01-02 23:49:14 +08:00 |
|
Roger Young
|
5c64a20da7
|
Update MiniMax-M2 ToolCall and add MiniMax-M2.1 in Docs (#15538)
Co-authored-by: xuebi <xuebi@minimaxi.com>
Co-authored-by: Xinyuan Tong <115166877+JustinTong0323@users.noreply.github.com>
|
2025-12-23 15:11:52 -08:00 |
|
mlmz
|
1f1f05a85e
|
vlm: refactor engine vlm params and support processor output as input (#14091)
Co-authored-by: Mick <mickjagger19@icloud.com>
Co-authored-by: zhaochenyang20 <zhaochenyang20@gmail.com>
Co-authored-by: Xinyuan Tong <115166877+JustinTong0323@users.noreply.github.com>
Co-authored-by: BenYao21 <cyao22@asu.edu>
Co-authored-by: minleminzui <minleminzui@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: 赵晨阳 <zhaochen20@outlook.com>
|
2025-12-20 18:31:24 +08:00 |
|
Yuxuan Zhang
|
b82c7a0ae7
|
[GLM-4.7] GLM-4.7 Tool Parser and Doc Update (#15333)
|
2025-12-19 20:30:44 -08:00 |
|
Yi Zhang
|
9d4f066fb9
|
Add doc for qwen3 next (#15337)
|
2025-12-17 17:53:07 -08:00 |
|
b8zhong
|
d20699a33c
|
[Deepseek V3.2] Support Overlap Spec + NSA (#15307)
Co-authored-by: Brayden Zhong <b8zhong@users.noreply.github.com>
|
2025-12-17 13:35:39 -08:00 |
|
Ashton Chew
|
2bdbaef18e
|
[DeepSeekV3.2] Add pure TP+MTP test (#15088)
Co-authored-by: Baizhou Zhang <sobereddiezhang@gmail.com>
|
2025-12-16 21:48:12 -08:00 |
|
Alison Shao
|
31d48d7f6f
|
Add Ollama-compatible API endpoints + Smart Router (#14376)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Xinyuan Tong <115166877+JustinTong0323@users.noreply.github.com>
|
2025-12-16 20:43:38 -08:00 |
|
almaslof
|
d0f756aec9
|
[docs] Fix kernel name (#14887)
|
2025-12-11 10:48:16 -05:00 |
|
Binyao Jiang
|
cf0478d602
|
[Glm46v] Bug fix for accuracy drop and unable to launch server (#14585)
Co-authored-by: yhyang201 <yhyang201@gmail.com>
Co-authored-by: zRzRzRzRzRzRzR <2448370773@qq.com>
Co-authored-by: Minglei Zhu <mingleizhu1122@gmail.com>
|
2025-12-07 23:45:02 -08:00 |
|
George Armstrong
|
91c9c14c28
|
DOC update nemo-skills in docs (#14555)
Signed-off-by: George Armstrong <georgea@nvidia.com>
Co-authored-by: Baizhou Zhang <sobereddiezhang@gmail.com>
|
2025-12-06 19:03:08 -08:00 |
|
Lee Nau
|
5f6f550af8
|
Update DeepSeek V3 docs to use B200 (#14447)
|
2025-12-06 17:22:11 -08:00 |
|
Baizhou Zhang
|
42fcf5438f
|
Revert "tiny remove deprecated endpoint call" (#14533)
|
2025-12-05 23:48:54 -08:00 |
|