composable_kernel/include/ck/tensor_operation/gpu/device
Jianfeng Yan (cb87b049de): refactored deviceBatchedGemm; removed GridwiseBatchedGemm; added fp32 and int8 to profiler (#120)
changed long_index_t to index_t when computing memory offset

uncommented other ops in profiler

added test for batched_gemm
2022-03-21 16:45:14 -05:00