* add example
* fix example
* add instance for gemm permute
* add to client example
* change configs
* change instance file name
* format
* change client example file name and remove example
[ROCm/composable_kernel commit: 55236709e2]
We can use this template to eliminate duplicated iterator-computation
logic. By passing the return type to ck::accumulate_n() explicitly, we
can avoid type conversions at the call site.
[ROCm/composable_kernel commit: 730204eed0]
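
A minimal sketch of the idea in the message above, not the code from the commit. It assumes a helper shaped like std::accumulate over the first `count` elements, with the result type as the first template parameter. The `sketch` namespace, the `accumulate_n` stand-in, and `leading_extent_product` are hypothetical names for illustration; the real ck::accumulate_n() signature may differ.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <iterator>
#include <numeric>

namespace sketch {

// Hypothetical stand-in for ck::accumulate_n(); the real signature may differ.
// T comes first so the caller can name the result type explicitly, while the
// iterator, count, and operation types are deduced from the arguments.
template <typename T, typename ForwardIterator, typename Size, typename BinaryOperation>
T accumulate_n(ForwardIterator first, Size count, T init, BinaryOperation op)
{
    // The "first + count" end-iterator computation lives here once,
    // instead of being repeated at every call site.
    return std::accumulate(first, std::next(first, count), init, op);
}

} // namespace sketch

// Product of the first `count` extents, accumulated directly in std::size_t.
// Naming the result type means the literal 1 is converted to std::size_t for
// the initial value, so no cast is needed on the arguments or on the result.
inline std::size_t leading_extent_product(const std::array<int, 4>& lengths, int count)
{
    return sketch::accumulate_n<std::size_t>(
        lengths.begin(), count, 1, std::multiplies<>{});
}
```

Without such a helper, each call site would spell out std::accumulate(first, std::next(first, count), ...) itself and cast either the initial value or the result to get the type it wants.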