Chao Liu
|
1c962a13ee
|
device_implicit_gemm_convolution_1_chwn_csrk_khwn: use tensor copy (instead of pointwise) for writing output, 3x3 increased from 78% to 84%, 5x5 from 80% to 84%
[ROCm/composable_kernel commit: a65ef90308]
|
2019-02-19 11:47:46 -06:00 |
|