composable_kernel/tutorial/ck_tile
AviralGoelAMD 2e3a716d72 Add new MFMA 16x16x16x2 example for GEMM with PADDING_K_FIRST optimization
- Introduced new subdirectory for MFMA 16x16x16x2 implementation.
- Added CMake configuration and source files for the new example.
- Implemented block GEMM and pipeline strategies to optimize performance.
- Included the policies and tensor distributions needed for efficient memory access.
- Updated the main GEMM kernel to support the new configuration.
2026-02-03 23:06:07 +00:00