sglang/benchmark/kernels at cc63c99f112f781605017db018fced686cdc94de - sglang - Public git mirror

kvcache-ai/sglang

mirror of https://github.com/kvcache-ai/sglang.git synced 2026-07-03 13:57:04 +00:00

satyamk7054 355127c2e9 Fix benchmark_sglang_fused_moe_triton.py (#18940)

Co-authored-by: Satyam Kumar <satyamk@linkedin.com>

2026-02-17 17:25:37 -05:00

[1/N] Optimize All Reduce - Benchmark different AR operations (#13797)

2026-01-26 22:44:13 +08:00

decoding_attention_triton

Fix benchmark import for should_use_tensor_core (#17232)

2026-01-16 17:48:36 -05:00

fix(deepep): resolve benchmark failure on 4×IB-card setup by aligning tuning config with DeepEP commit bdd119f8 (#11965)

2025-10-22 21:20:54 -07:00

[NVIDIA] Add fp8 gemm benchmark on blackwell (#13528)

2025-11-19 19:35:00 -08:00

[sgl-kernel] Optimize concat_mla_k kernel (#10543)

2025-09-28 23:04:22 +08:00

flashinfer_allreduce_fusion

[benchmark] add flashinfer_allreduce_fusion benchmark (#9937)

2025-09-03 16:31:01 +08:00

fused_moe_triton

Fix benchmark_sglang_fused_moe_triton.py (#18940)

2026-02-17 17:25:37 -05:00

Refactor tuning block wise kernel and opt Qwen/Qwen3-VL-32B-Instruct-FP8 (#14141)

2025-12-08 09:24:58 +08:00

scheduler_batch

[test] add ut and bm for get_last_loc (#6746)

2025-05-29 11:47:21 -07:00

sliding_window_attention_triton

Optimize triton swa kernel by skipping computation (#8860)

2025-08-06 21:37:50 +08:00