Commit Graph

86 Commits

Author
SHA1 Message Date
Philip Maybank
42b2e3bc40 change Atrribute to Attribute globally 2025-07-28 15:43:08 -04:00
Clement Lin
c1d4adce86 Change the warmup number of add example 2025-07-28 14:54:51 -04:00
AviralGoelAMD
511f5eb24e clang format 2025-07-28 14:54:51 -04:00
AviralGoelAMD
67cb075ba4 fixed function and struct names 2025-07-28 14:54:51 -04:00
AviralGoelAMD
35ecfc1b5a clang format 2025-07-28 14:54:51 -04:00
AviralGoelAMD
1056a980bc commented the 01_add example 2025-07-28 14:54:51 -04:00
AviralGoelAMD
b1bada8304 added explanation comments 2025-07-28 14:54:51 -04:00
AviralGoelAMD
9edef9e351 added a 1d vector elementwise example 2025-07-28 14:54:51 -04:00
bobofang11235
cf34ed5f3d Add cache aware option 2025-07-28 14:54:51 -04:00
ClementLinCF
4bfa4f6839 Update README.md 2025-07-28 14:54:51 -04:00
MHYang
3357f3541a Add QK swizzle option 2025-07-28 14:54:51 -04:00
MHYang
9e9a63c918 Fix o_spans 2025-07-28 14:54:51 -04:00
ClementLinCF
be9516a756 Update README.md 2025-07-28 14:54:51 -04:00
Clement Lin
43cd04aa47 Fix typo 2025-07-28 14:54:51 -04:00
Clement Lin
4b242084ab Add new instances 2025-07-28 14:54:51 -04:00
MHYang
62452bd550 Fix unexpected errors 2025-07-28 14:54:51 -04:00
MHYang
10e08b520d Remove unused flag 2025-07-28 14:54:51 -04:00
MHYang
54ddd4c47c Fix generate.py 2025-07-28 14:54:51 -04:00
MHYang
8f939db88f Fix clang-format 2025-07-28 14:54:51 -04:00
MHYang
fcdfbcb6a7 Implement prefetch and instruction schedule 2025-07-28 14:54:51 -04:00
Clement Lin
4fccb261b8 Add the warp gemm option 2025-07-28 14:54:51 -04:00
Clement Lin
ac5b7cbf63 Add more warp gemm policies for FA 2025-07-28 14:54:51 -04:00
Clement Lin
41d6c5731e Remove unused code 2025-07-28 14:54:51 -04:00
YC Lin
b52a27d8b7 [GEMM] Add define macro for unused a/b blk window 2025-07-28 14:54:51 -04:00
Clement Lin
addf290f8e Add codegen instances 2025-07-28 14:54:51 -04:00
    The following examples have been tested for 04_codegen:

    ./bin/codegen_basic_flash_attention_fwd 1 1 64 4096 4096 256 256
    ./bin/codegen_basic_flash_attention_fwd 1 1 64 4096 4096 64 64
    ./bin/codegen_basic_flash_attention_fwd 1 1 64 4096 4096 32 32
    ./bin/codegen_basic_flash_attention_fwd 1 1 64 4096 4096 128 128
    ./bin/codegen_basic_flash_attention_fwd 1 1 64 2048 2048 128 128
    ./bin/codegen_basic_flash_attention_fwd 1 1 64 512 512 128 128
BoboFang
9262b002f5 Add MakeBlock2TileMap in 04_codegen_flash_attention_fwd 2025-07-28 14:54:51 -04:00
BoboFang
6da6d4af91 Change the permission of FA CMakeList.txt to 644 2025-07-28 14:54:51 -04:00
BoboFang
104572c1f3 Fix error after clang-format 2025-07-28 14:54:51 -04:00
BoboFang
aff041233d Run clang-format in toy_example 2025-07-28 14:54:51 -04:00
MHYang
0d8693776e Initialize instruction schedule 2025-07-28 14:54:51 -04:00
BoboFang
879edeadf1 Add cache-aware in flash attention 2025-07-28 14:54:51 -04:00
MHYang
fef9588f98 Merge fix for bank conflict into codegen FA 2025-07-28 14:54:51 -04:00
MHYang
ba264fe432 Fix bank conflict 2025-07-28 14:54:51 -04:00
YC Lin
1f5398de4a [GEMM] Fix bWarpTile issue and remove redundant pipeline in BlockGemmPipeline 2025-07-28 14:54:51 -04:00
MHYang
4a264eb9ed Fix register spilling and K0 tile size issues 2025-07-28 14:54:51 -04:00
YC Lin
eb737b8f82 [GEMM] Fix num_loop issues 2025-07-28 14:54:51 -04:00
Clement Lin
f71b2c7e55 Add generate.py for codegen 2025-07-28 14:54:51 -04:00
YC Lin
9e020272c4 [GEMM] Remove redundant GetBlockGemm 2025-07-28 14:54:51 -04:00
YC Lin
4e6a792b82 [GEMM] Implement local prefetch and refactor block gemm pipeline 2025-07-28 14:54:51 -04:00
Clement Lin
cbc660acc7 Refactor flash_attention_fwd_traits_ for codegen 2025-07-28 14:54:51 -04:00
YC Lin
f30071289c [GEMM] Merge universal_block_gemm into block_gemm 2025-07-28 14:54:51 -04:00
mhYang
26b73c0ed1 Fix flash attention 1 tile case 2025-07-28 14:54:51 -04:00
Clement Lin
e614acfdd8 Refactor FlashAttnArgs usage for codegen 2025-07-28 14:54:51 -04:00
Clement Lin
0cc5130818 Add codegen test example 2025-07-28 14:54:51 -04:00
YC Lin
5b1c397806 [GEMM] Refactor GetStaticLdsSize and remove GetSmemSize 2025-07-28 14:54:51 -04:00
Clement Lin
2499b8d401 Fix indentation 2025-07-28 14:54:51 -04:00
Clement Lin
d98fb3e0b5 Remove unused code 2025-07-28 14:54:51 -04:00
YC Lin
ae275aa105 [GEMM] Refactor block gemm, pipeline, and policy of instruction schedule opt 2025-07-28 14:54:51 -04:00
YC Lin
6113ca8062 [Add] Add build option for generating assembly 2025-07-28 14:54:51 -04:00
YC Lin
97a960042b [GEMM] Refactor block gemm and pipeline policy of instruction schedule 2025-07-28 14:54:51 -04:00