sglang

mirror of https://github.com/kvcache-ai/sglang.git synced 2026-06-30 19:57:52 +00:00

Author	SHA1	Message	Date
Polisetty V R K Jyothendra Varma	f0303fd07e	[Intel GPU] Enable DeepSeek R1 inference on XPU (#18461) Signed-off-by: P V R K Jyothendra Varma <polisetty.v.r.k.jyothendra.varma@intel.com>	2026-03-29 22:35:59 -07:00
cs-cat	22e378af86	Fix result writer in tuning_block_wise_kernel.py, and add FP8 kernel config for L40 (#20368) Signed-off-by: cs-cat <118669451+cs-cat@users.noreply.github.com>	2026-03-20 09:28:54 +08:00
Mook	abc672e717	[Benchmark] use flashinfer bench_gpu_time instead of triton do_bench (#20305)	2026-03-12 04:04:30 +00:00
Xiaoyu Zhang	03b835e7d1	Refactor tuning block wise kernel and opt Qwen/Qwen3-VL-32B-Instruct-FP8 (#14141)	2025-12-08 09:24:58 +08:00
Shu Wang	6664083522	Replace [silu_and_mul_]scaled_fp4_group_quant by Flashinfer equivalent (#12376)	2025-11-13 00:26:00 -08:00
Cheng Wan	5b214b50b6	[Refactor] move `deep_gemm_wrapper` out of `quantization` (#11784)	2025-10-17 18:57:54 -07:00
Kaixi Hou	5c34b4f1c7	[NVIDIA] [2/N] Optimize `silu_and_mul_scaled_fp4_grouped_quant` perf (#9556)	2025-08-29 17:17:03 -07:00
Lifu Huang	6e2da51561	Replace time.time() to time.perf_counter() for benchmarking. (#6178) Signed-off-by: Lifu Huang <lifu.hlf@gmail.com>	2025-05-11 14:32:49 -07:00
Zhaoyi Li	c555d794f7	Minor update for ROCm variable style (#5562)	2025-04-19 23:45:27 -07:00
laixin	b0df5d240b	Tuning Script for Feature DeepSeek V3/R1 INT8 Quantization (block-wise) (#3922) Co-authored-by: sleepcoo <sleepcoo@gmail.com>	2025-02-27 10:59:46 +00:00
Xiaoyu Zhang	c38f3aed24	support multi-gpu block-gemm tuning (#3639)	2025-02-18 00:00:35 +08:00
yigex	fdf04a1426	[ROCm] Add ROCm tuning config to block gemm and Re-tune for AMD Radeon Graphics (#3418) Co-authored-by: Bruce Xue <yigex@xilinx.com> Co-authored-by: HAI <hixiao@gmail.com>	2025-02-10 23:55:04 -08:00
Yineng Zhang	7b020cca2d	add tuning block wise fp8 (#3242) Co-authored-by: HandH1998 <007aabbcc411@gmail.com>	2025-02-01 03:58:18 +08:00
Ke Bao	85b2e05770	Add int8 quant kernel (#2848)	2025-01-13 13:16:58 +08:00

14 Commits