* Fix transform and instances for grouped conv bwd data
* Add instances for small K and small C
* Remove workaround after fix
* Fix interface tests
[ROCm/composable_kernel commit: 595d23be14]
* Support bf16/f32/f16 and NHWGC layout for conv2d_bwd_data
* Add interface test
* Apply clang-format
* Comment fixes
* Add friendlier error message
[ROCm/composable_kernel commit: 63388e84ab]