* Add wei_strides to grouped conv3d wei to keep consistency
* Fix strides in client examples
* Unify backward weight api with forward
* Fix for example
* Fixes for examples
---------
Co-authored-by: zjing14 <zhangjing14@gmail.com>
[ROCm/composable_kernel commit: 22443f7aae]