mirror of
https://github.com/ROCm/composable_kernel.git
synced 2026-05-02 04:31:25 +00:00
Add new MFMA 16x16x16x2 example for GEMM with PADDING_K_FIRST optimization
- Introduced new subdirectory for MFMA 16x16x16x2 implementation.
- Added CMake configuration and source files for the new example.
- Implemented block GEMM and pipeline strategies to optimize performance.
- Included necessary policies and tensor distribution for efficient memory access.
- Updated the main GEMM kernel to support the new configuration.
@@ -8,3 +8,5 @@ include_directories(AFTER
add_subdirectory(01_naive_gemm)
add_subdirectory(02_padding_k_first)
add_subdirectory(03_mfma_16x16x16)
add_subdirectory(04_mfma_16x16x16x2)
add_subdirectory(05_xor_bank_conflict_free)