ROCm/composable_kernel
mirror of https://github.com/ROCm/composable_kernel.git synced 2026-05-24 06:44:36 +00:00
2,429 Commits 754 Branches 38 Tags
7d10353fda730fdf95c7b47d9a51faa6a22c1942
Commit Graph

10 Commits

Author | SHA1 | Message | Date
Qianfeng Zhang | fb09061b0c | Add norm_dist parameter for hstu example to select either normal or uniform distribution to initialize data | 2025-08-12 03:06:35 +00:00
Qianfeng Zhang | 1404336200 | Update HstuBlockMaskWithLocal::GetTileRangeAlongX, add comments and test cases for causal == false | 2025-08-10 06:10:14 +00:00
Qianfeng Zhang | 971d0d98d4 | Update to support min_full_attn_seqlen be bigger than max_uih_len | 2025-08-08 09:26:55 +00:00
Qianfeng Zhang | f27d8cefb7 | Add attn_scale MakeKargs() parameter support and update in example, reference codes | 2025-08-03 03:37:28 +00:00
Qianfeng Zhang | 3483af0516 | Fix added case in test_hstu_attention.sh | 2025-07-25 15:12:05 +00:00
Qianfeng Zhang | 29d3dc9662 | Update in GetTileRangeAlongX to consider for non-causal+local_size>0 situation and add test case to test_hstu_attention.sh | 2025-07-25 14:56:13 +00:00
Qianfeng Zhang | 43a97681b8 | Add three scripts for verification of jagged causal cases | 2025-07-25 11:20:46 +00:00
Qianfeng Zhang | c87a217475 | Update to test_ck_hstu_mask.sh and test_pytorch_hstu_mask.py to align their testings | 2025-06-22 16:26:53 +00:00
Qianfeng Zhang | 09ac14604c | Align the -seqlens=xxx in the mattn0_full0 and mattn256_full256 scripts with the required benchmarks | 2025-06-18 16:02:04 +00:00
Qianfeng Zhang | 9e6a24010a | Move all test and bench scripts to folder scripts | 2025-06-06 08:22:38 +00:00