[CK][CK Tile] Improve access for merged groups and remove
modulo from xor (#5334)
## Motivation
[CK][CK Tile] Improve access for merged groups and remove modulo from
xor
## Technical Details
- add template parameter to xor if modulo is needed. We don't need
modulo for merged groups
- use access by m for merged groups for a tensor
-
## Test Plan
test_grouped_convnd_fwd_tile
## Test Result
passed locally
## Submission Checklist
- [x] Look over the contributing guidelines at
https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.