Commit Graph

6 Commits

Author SHA1 Message Date
Polisetty V R K Jyothendra Varma
f0303fd07e [Intel GPU] Enable DeepSeek R1 inference on XPU (#18461)
Signed-off-by: P V R K Jyothendra Varma <polisetty.v.r.k.jyothendra.varma@intel.com>
2026-03-29 22:35:59 -07:00
cs-cat
22e378af86 Fix result writer in tuning_block_wise_kernel.py, and add FP8 kernel config for L40 (#20368)
Signed-off-by: cs-cat <118669451+cs-cat@users.noreply.github.com>
2026-03-20 09:28:54 +08:00
Xiaoyu Zhang
03b835e7d1 Refactor tuning block wise kernel and opt Qwen/Qwen3-VL-32B-Instruct-FP8 (#14141)
2025-12-08 09:24:58 +08:00
Lifu Huang
6e2da51561 Replace time.time() to time.perf_counter() for benchmarking. (#6178)
Signed-off-by: Lifu Huang <lifu.hlf@gmail.com>
2025-05-11 14:32:49 -07:00
Zhaoyi Li
c555d794f7 Minor update for ROCm variable style (#5562)
2025-04-19 23:45:27 -07:00
laixin
b0df5d240b Tuning Script for Feature DeepSeek V3/R1 INT8 Quantization (block-wise) (#3922)
Co-authored-by: sleepcoo <sleepcoo@gmail.com>
2025-02-27 10:59:46 +00:00