Jiaxin(Jackson) Deng
|
c4db64c16b
|
Add Lychee Doc Links Check to Local and CI (#19742)
Co-authored-by: Zijie Xia <zijie_xia@icloud.com>
Co-authored-by: Zijie Xia <zijiexia@users.noreply.github.com>
Co-authored-by: zijiexia <37504505+zijiexia@users.noreply.github.com>
|
2026-03-24 13:48:26 -07:00 |
|
Xiaoyu Zhang
|
be7a0311a0
|
[Diffusion] Fix and validate diffusion skills benchmarking/profiling workflow (#20528)
|
2026-03-13 21:11:37 +08:00 |
|
Xinyuan Tong
|
4a757990a1
|
[VLM] Replace decord with torchcodec for video decoding (#20055)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: BakerBunker <17872844+BakerBunker@users.noreply.github.com>
|
2026-03-09 19:23:49 +08:00 |
|
Julian Huang
|
a55f658835
|
[Misc] Normalize --host parameter to use plain hostname without scheme (#19309)
Co-authored-by: 墨楼 <huangzhilin.hzl@antgroup.com>
Co-authored-by: Liangsheng Yin <lsyincs@gmail.com>
Co-authored-by: Liangsheng Yin <hnyls2002@gmail.com>
|
2026-02-25 00:37:24 -08:00 |
|
SoluMilken
|
07a24f1a38
|
update pre-commit config (#18860)
|
2026-02-16 00:18:31 +08:00 |
|
shuwenn
|
3299c4f9c1
|
[CI] feat: add early exit to wait_for_server when process dies (#18602)
|
2026-02-13 16:46:09 -08:00 |
|
shuwenn
|
de94d793ad
|
feat: support qwen3(-VL) rerank scoring&chat template (#16403)
Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
Co-authored-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
|
2026-01-15 00:45:46 +08:00 |
|
Liangsheng Yin
|
a435f55d18
|
Tiny print launch command with shlex (#16010)
|
2025-12-29 11:26:46 +08:00 |
|
ゆり
|
186a56f6e2
|
fix(monitoring): update Grafana dashboard metrics prefix from sglang: to sglang_ (#15758)
|
2025-12-24 10:43:51 -08:00 |
|
Xinyuan Tong
|
47cdb65a45
|
fix: update argument extraction in R1 chat template (#15547)
Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
|
2025-12-21 09:18:49 +08:00 |
|
Yineng Zhang
|
ef1ab2302a
|
[Auto Sync] Update tool_chat_template_deepseekv31.jinja (20251210) (#14837)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jue Wang <zjuwangjue@gmail.com>
|
2025-12-10 10:56:24 -08:00 |
|
Lianmin Zheng
|
bc3d2a85af
|
[Minor] update docs (#14212)
|
2025-12-01 02:33:58 -08:00 |
|
Baizhou Zhang
|
808b6dfdea
|
[Minor] Fix lint (#13938)
|
2025-11-25 10:57:23 -08:00 |
|
Simo Lin
|
4852aa054c
|
[misc] add llama3.1 chat template (#13935)
|
2025-11-25 09:31:54 -08:00 |
|
Liangsheng Yin
|
196b940aed
|
[3/N] CI refactor: move some manually triggered tests. (#13448)
|
2025-11-19 23:06:53 +08:00 |
|
Kangyan-Zhou
|
ea89a3a0c5
|
Fixes validation errors for Wan-AI models which store model weights in subdirectories (#13461)
|
2025-11-17 15:33:02 -08:00 |
|
Sirut Buasai
|
a63f433b6f
|
extend sagemaker.Dockerfile serve script to allow all sglang serve flags (#13173)
|
2025-11-17 13:14:17 -08:00 |
|
Mattheliu
|
c3bb348dad
|
[Docs] fix dead links in multiple documentation pages (#12764)
|
2025-11-06 10:49:32 -08:00 |
|
Kangyan-Zhou
|
7e28c67d19
|
Fix DeepSeek chat templates to handle tool call arguments type checking (#11700) (#12123)
|
2025-10-30 16:39:25 +08:00 |
|
Teng Ma
|
96a5e4dd79
|
[Feature] Support loading weights from ckpt engine worker (#11755)
Signed-off-by: Yang Kaiyong <yangkaiyong.yky@antgroup.com>
Signed-off-by: Cruz Zhao <CruzZhao@linux.alibaba.com>
Signed-off-by: Xuchun Shang <xuchun.shang@gmail.com>
Co-authored-by: Yang Kaiyong <yangkaiyong.yky@antgroup.com>
Co-authored-by: Cruz Zhao <CruzZhao@linux.alibaba.com>
Co-authored-by: Xuchun Shang <xuchun.shang@gmail.com>
Co-authored-by: Shangming Cai <csmthu@gmail.com>
|
2025-10-23 09:23:30 -07:00 |
|
Zhiyu
|
80b2b3207a
|
Enable native ModelOpt quantization support (3/3) (#10154)
Signed-off-by: Zhiyu Cheng <zhiyuc@nvidia.com>
|
2025-10-21 21:44:29 -07:00 |
|
b8zhong
|
d0a64c7e2c
|
vlm: enforce pybase64 for image and str encode/decode (#10700)
|
2025-10-21 19:05:32 +08:00 |
|
Neelabh Sinha
|
852c0578fd
|
[FEATURE] Add OpenAI-Compatible LoRA Adapter Selection (#11570)
|
2025-10-21 15:44:33 +08:00 |
|
Kindyaa
|
c44e985dc2
|
feat(example/fastapi): support --startup-timeout using Qwen3-Next-80B-A3B-Instruct as example (#11710)
Co-authored-by: chenan01 <chenan01@cheche-MacBook-Pro.local>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-10-19 02:50:34 +08:00 |
|
Xu Wenqing
|
85c1f79377
|
Add DeepSeek-V3.2 Tool Call Template (#11063)
Signed-off-by: 许文卿 <xwq391974@alibaba-inc.com>
|
2025-10-04 18:53:49 -07:00 |
|
fzyzcjy
|
fdc4e1e570
|
Tiny move files to utils folder (#11166)
|
2025-10-03 22:40:06 +08:00 |
|
Chang Su
|
c1815a99b7
|
model support: Sarashina2VisionForCausalLM (#10632)
|
2025-09-18 17:30:38 -07:00 |
|
Feng Su
|
4c21b09074
|
[Feature] Sglang Tracing: Fine-Grained Tracking for Request Latency - Part 1 (#9962)
Signed-off-by: Feng Su <sufeng@linux.alibaba.com>
Signed-off-by: Huaixin Chang <changhuaixin@linux.alibaba.com>
Signed-off-by: Peng Wang <rocking@linux.alibaba.com>
|
2025-09-15 02:08:02 +08:00 |
|
Yiming
|
2cd94dd07e
|
tool-call(dsv3): Fixed a parse problem when there are multiple function definitions in tool_calls (#10209)
|
2025-09-09 15:47:28 +08:00 |
|
Grace Ho
|
73179b764a
|
nsys profile output kernel classifier (#9314)
Signed-off-by: Grace Ho <grho@nvidia.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Yineng Zhang <me@zhyncs.com>
|
2025-09-03 16:22:33 -07:00 |
|
南京小汤包
|
cc9a31c662
|
Update tool_chat_template_deepseekv31.jinja (#9895)
|
2025-09-02 20:29:21 -07:00 |
|
Lianmin Zheng
|
60e37f8028
|
Move parsers under a single folder (#9912)
|
2025-09-02 18:25:04 -07:00 |
|
wangyu
|
9f81d741a2
|
fix: fix MLA for ShardedModelLoader/RemoteModelLoader (#6287)
Signed-off-by: wangyu <wangyu.steph@bytedance.com>
|
2025-08-28 16:10:09 -07:00 |
|
wangyu
|
a38c149758
|
feat(draft_model): support draft_model for RemoteModelLoader (#6407)
Signed-off-by: wangyu <wangyu.steph@bytedance.com>
|
2025-08-28 16:09:52 -07:00 |
|
Xu Wenqing
|
b9683be653
|
Support DeepSeek-V3.1 tool call (#9446)
Signed-off-by: 许文卿 <xwq391974@alibaba-inc.com>
Co-authored-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
|
2025-08-26 20:22:19 -07:00 |
|
Chang Su
|
c9dd70fbde
|
tool-call(dsv3): Improve deepseek-v3 chat template and tool_choice = required (#9525)
|
2025-08-23 01:46:56 -07:00 |
|
PGFLMG
|
b7cd743038
|
[Feat] QWen-1M context support[2/2]: Update block sparse attention backend (#5949)
|
2025-08-06 23:49:36 -07:00 |
|
yi wang
|
5963e50503
|
[bugfix] Fix 2 minor bugs in the hicache storage layer (#8404)
|
2025-07-31 05:47:14 +00:00 |
|
Jinn
|
ab74f8f09d
|
Remove batches api in docs & example (#7400)
|
2025-06-20 19:46:31 -07:00 |
|
Ata Fatahi
|
1ab6be1b26
|
Purge VerlEngine (#7326)
Signed-off-by: Ata Fatahi <immrata@gmail.com>
|
2025-06-19 23:47:21 -07:00 |
|
kyle-pena-kuzco
|
b56de8f943
|
Open AI API hidden states (#6716)
|
2025-06-10 14:37:29 -07:00 |
|
Chao Yang
|
4fac524b14
|
update llama4 chat template and pythonic parser (#6679)
Co-authored-by: Chang Su <chang.s.su@oracle.com>
|
2025-05-30 17:01:22 -07:00 |
|
Xu Wenqing
|
62cac2c43a
|
Update DeepSeek-R1-0528 function call chat template (#6765)
Signed-off-by: 许文卿 <xwq391974@alibaba-inc.com>
|
2025-05-30 00:42:57 -07:00 |
|
Xu Wenqing
|
f4d4f93928
|
Add DeepSeek-R1-0528 function call chat template (#6725)
Signed-off-by: 许文卿 <xwq391974@alibaba-inc.com>
|
2025-05-29 00:05:07 -07:00 |
|
Lifu Huang
|
3cf1473a09
|
Use monotonic clock for interval measurement (#6211)
Signed-off-by: Lifu Huang <lifu.hlf@gmail.com>
|
2025-05-17 16:49:18 -07:00 |
|
Kiv Chen
|
5380cd7ea3
|
model(vlm): pixtral (#5084)
|
2025-05-13 00:16:10 -07:00 |
|
Lianmin Zheng
|
e8e18dcdcc
|
Revert "fix some typos" (#6244)
|
2025-05-12 12:53:26 -07:00 |
|
applesaucethebun
|
d738ab52f8
|
fix some typos (#6209)
Co-authored-by: Brayden Zhong <b8zhong@uwaterloo.ca>
|
2025-05-13 01:42:38 +08:00 |
|
mlmz
|
69276f619a
|
doc: fix the erroneous documents and example codes about Alibaba-NLP/gme-Qwen2-VL-2B-Instruct (#6199)
|
2025-05-11 08:22:11 -07:00 |
|
applesaucethebun
|
2ce8793519
|
Add typo checker in pre-commit (#6179)
Co-authored-by: Brayden Zhong <b8zhong@uwaterloo.ca>
|
2025-05-11 12:55:00 +08:00 |
|