| Author | Commit | Message | Date |
|---|---|---|---|
| Chunyuan WU | 6c89214584 | [CPU][sgl-kernel] extend_attention_cpu and flash_attn_varlen_func: fix nan for large seq (#22434); Co-authored-by: Ma Mingfei <mingfei.ma@intel.com> | 2026-04-17 13:01:01 +08:00 |
| Ma Mingfei | 88f7759402 | [CPU] optimize flash_attn_varlen_func (#15708) | 2026-01-29 22:07:05 -08:00 |
| Xuan Liao | c233e9d7a9 | [CPU] Support chunk_gated_delta_rule kernel for Qwen3-Next (#12441) | 2025-12-03 17:03:48 +08:00 |
| YanbingJiang | b044400dd3 | Support non-contiguous query input for extend/decode attention (#7462) | 2025-07-02 19:59:45 -07:00 |
| YanbingJiang | fcde67b016 | CPU: map changes from developing branch in sgl-kernel (#6833); Co-authored-by: mingfeima <mingfei.ma@intel.com> | 2025-06-10 01:08:15 -07:00 |
| applesaucethebun | 2ce8793519 | Add typo checker in pre-commit (#6179); Co-authored-by: Brayden Zhong <b8zhong@uwaterloo.ca> | 2025-05-11 12:55:00 +08:00 |
| Ma Mingfei | a73c4df438 | Add optimized native kernels in sgl-kernel (#5150); Co-authored-by: Chunyuan WU <chunyuan.wu@intel.com>; Co-authored-by: YanbingJiang <yanbing.jiang@intel.com>; Co-authored-by: blzheng <beilei.zheng@intel.com> | 2025-04-08 09:37:46 -07:00 |