8 Commits

Author SHA1 Message Date
jianan-gu
2ab141547d [CPU] Add apply_routed_scaling_factor_on_output support for biased_grouped_topk fusion (#22413)
Co-authored-by: Ma Mingfei <mingfei.ma@intel.com>
2026-04-10 15:16:05 +08:00
Ma Mingfei
6da8f5f69e fix topk softmax performance issue (#14702)
2026-03-29 23:43:16 -07:00
jianan-gu
336dc4579e [CPU] Optimize Qwen3-next model on CPU (#12525)
Co-authored-by: Ma Mingfei <mingfei.ma@intel.com>
Co-authored-by: Fan Yin <1106310035@qq.com>
2026-01-29 22:03:58 -08:00
jianan-gu
6e6009fb6b [CPU] Fix TP padding case with weight block size (#8243)
2025-11-07 03:24:48 +08:00
Chunyuan WU
08f8f49016 [CPU][sgl-kernel] biased_grouped_topk: fix correction_bias dtype to float32 (#8212)
Co-authored-by: jianan-gu <jianan.gu@intel.com>
Co-authored-by: YanbingJiang <yanbing.jiang@intel.com>
2025-08-04 18:28:31 -07:00
YanbingJiang
fcde67b016 CPU: map changes from developing branch in sgl-kernel (#6833)
Co-authored-by: mingfeima <mingfei.ma@intel.com>
2025-06-10 01:08:15 -07:00
jianan-gu
ff00895c46 Add CPU optimized kernels for topk and rope fusions (#6456)
2025-06-02 17:37:34 -07:00
Ma Mingfei
a73c4df438 Add optimized native kernels in sgl-kernel (#5150)
Co-authored-by: Chunyuan WU <chunyuan.wu@intel.com>
Co-authored-by: YanbingJiang <yanbing.jiang@intel.com>
Co-authored-by: blzheng <beilei.zheng@intel.com>
2025-04-08 09:37:46 -07:00