composable_kernel/device_operation/include/gemm_specialization.hpp at c254e5abd2b01b9d5a2ba3fe4531e178623396d0 - composable_kernel - Public git mirror

ROCm/composable_kernel

mirror of https://github.com/ROCm/composable_kernel.git synced 2026-05-05 14:11:29 +00:00

zjing14 20a672d0b8 Add small tile size for fp16/fp32 and NN layout (#80)

* add DeviceGemmSplitKXdl

* add file device_gemm_splitk_xdl.hpp

* set c matrix zero

* using atomic

* add all tuning parameter to f32 mkkn

* grid size change to 720

* add tuning parameter for NT

* add tuning parameter for TN

* add tuning parameter for TT

* add m=96 tuning parameter

* add lost config

* debug

* fix sweep

* add failed tuning params

* fixed sweep logic

* clean

* add padding to M/N for irr tile size

* clean code

* add element wise operation

* fixed MPerBlock=96

* remove macro for split-k switch

* add test

* add new line at the end of device_gemm_xdl_instance.hpp

* remove step hack

* separate split-k instance files

* add tuning parameters

* change desired grid size to parameters

* remove slice length

* add desiredgridsize parameter to ckProfiler

* add missing file device_gemm_xdl_splitk_instance.hpp

* change desired grid size to kbatch

* format

* format

* clean up

* add selection of device_instances

* clean code

* clean code

* add small tile size in fp16 nn

* test for rocm 4.5

* merge develop

* clean

* clean

* clean

* remove unused code

* add padding switch to device_gemm_xdl

* add padding switch for ksplit fp32

* clean

* clean

* add files

* rename

* Update profiler.cpp

* format

Co-authored-by: ltqin <letaoqin@amd.com>
Co-authored-by: ltqin <letao.qin@amd.com>
Co-authored-by: Chao Liu <chao.liu2@amd.com>

2022-02-11 15:49:06 -06:00

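#ifndef GEMM_SPECIALIZATION
#define GEMM_SPECIALIZATION

namespace ck {
namespace tensor_operation {
namespace device {

// Selects how a device GEMM instance handles its problem dimensions.
enum GemmSpecialization_t
{
    Default,   // no M/N padding
    MNPadding, // pad the M and N dimensions (for irregular tile sizes)
};

} // namespace device
} // namespace tensor_operation
} // namespace ck
#endif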
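
The header only declares the enum. The following is a minimal sketch of how such a value might be used as a compile-time switch, assuming that `MNPadding` means rounding a dimension up to a multiple of the block tile size, as the commit message's "add padding to M/N for irr tile size" suggests. The `example` namespace and the `round_up` and `padded_extent` helpers are hypothetical and are not part of composable_kernel. The enum is repeated here only so that the snippet compiles on its own.

```cpp
#include <cassert>
#include <cstdint>

// Repeated from gemm_specialization.hpp so this sketch is self-contained.
namespace ck {
namespace tensor_operation {
namespace device {
enum GemmSpecialization_t
{
    Default,
    MNPadding,
};
} // namespace device
} // namespace tensor_operation
} // namespace ck

namespace example { // hypothetical namespace, not part of the library

using ck::tensor_operation::device::GemmSpecialization_t;

// Hypothetical helper: round `x` up to the next multiple of `tile`.
constexpr std::int64_t round_up(std::int64_t x, std::int64_t tile)
{
    return ((x + tile - 1) / tile) * tile;
}

// Hypothetical helper: the extent a kernel would see for one dimension.
// With MNPadding the extent is padded to a multiple of the per-block tile;
// with Default it is passed through unchanged.
template <GemmSpecialization_t Spec>
constexpr std::int64_t padded_extent(std::int64_t extent, std::int64_t per_block)
{
    return Spec == GemmSpecialization_t::MNPadding ? round_up(extent, per_block)
                                                   : extent;
}

// Usage: M = 100 with an assumed MPerBlock of 96.
inline void check_examples()
{
    using namespace ck::tensor_operation::device;
    assert(padded_extent<Default>(100, 96) == 100);   // unchanged
    assert(padded_extent<MNPadding>(100, 96) == 192); // two tiles of 96
    assert(padded_extent<MNPadding>(96, 96) == 96);   // already aligned
}

} // namespace example
```

Passing the specialization as a non-type template parameter lets the compiler resolve the padding branch at compile time, so a `Default` instance carries no padding logic at runtime.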