kvcache-ai/sglang
mirror of https://github.com/kvcache-ai/sglang.git synced 2026-06-30 03:37:51 +00:00
support_multi_protocol
sglang/benchmark/kernels
RunningLeon 335dbd60b4 Support Intern-S2-Preview (#24875)
2026-05-10 22:17:30 +08:00
Directory | Last commit | Date
all_reduce | [AMD][No-Merge] Simplify fused allreduce + RMSNorm and remove hidden_dim allowlist (#21986) | 2026-04-11 23:47:08 -07:00
decoding_attention_triton | Fix benchmark import for should_use_tensor_core (#17232) | 2026-01-16 17:48:36 -05:00
deepep | Add CLI args to conveniently support tuning more models (#12922) | 2026-03-12 23:10:55 -07:00
deepseek | Fix Python 3.11 f-string lint error in deepgemm Blackwell benchmark (#22108) | 2026-04-04 21:15:22 +08:00
elementwise | [Benchmark] use flashinfer bench_gpu_time instead of triton do_bench (#20305) | 2026-03-12 04:04:30 +00:00
flashinfer_allreduce_fusion | [kernel slimming] Clean many useless sgl-kernel deprecated kernels (#20277) | 2026-03-14 16:45:54 +08:00
fused_moe_triton | Support Intern-S2-Preview (#24875) | 2026-05-10 22:17:30 +08:00
lora_csgmv | Add offline auto-tuning for LoRA CSGMV kernel (#20391) | 2026-04-10 13:10:43 -07:00
quantization | feat: tiny improve fp8_gemm tune usage (#23912) | 2026-04-28 07:47:46 -04:00
scheduler_batch | [Benchmark] use flashinfer bench_gpu_time instead of triton do_bench (#20305) | 2026-03-12 04:04:30 +00:00
sliding_window_attention_triton | [Benchmark] use flashinfer bench_gpu_time instead of triton do_bench (#20305) | 2026-03-12 04:04:30 +00:00