enable type conversion in ThreadwiseGenericTensorSliceCopy_v2r1 and BlockwiseGenericTensorSliceCopy_v2 [ROCm/composable_kernel commit: 9aaeacc82b]
9aaeacc82b
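
For illustration only, here is a minimal sketch of what "type conversion" during a slice copy might look like: each element is read as the source type and written as the destination type. The function name `copy_with_type_conversion`, its parameters, and the flat 1-D layout are hypothetical and are not taken from composable_kernel. The real ThreadwiseGenericTensorSliceCopy_v2r1 works on tensor descriptors and coordinates, which this sketch leaves out.

```cpp
#include <cstddef>

// Hypothetical helper, NOT the composable_kernel API.
// Copies n elements from src to dst, converting each element
// from SrcData to DstData with an explicit static_cast.
template <typename SrcData, typename DstData>
void copy_with_type_conversion(const SrcData* src, DstData* dst, std::size_t n)
{
    for(std::size_t i = 0; i < n; ++i)
    {
        // Assumption: the conversion is a plain element-wise cast.
        dst[i] = static_cast<DstData>(src[i]);
    }
}
```

Taking the source and destination element types as separate template parameters is what lets a single copy routine handle buffers of different precision. With only one shared data type, no conversion would be possible.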