mirror of
https://github.com/ROCm/composable_kernel.git
synced 2026-05-14 10:09:41 +00:00
* refactor reduce kernel
- Rename Reduce kernel as per convention
- Move kept_dim and reduce_dims from runtime to compile-time parameters
- Update Reduce2dProblem template to include KeptDim, ReduceDims, and
Rank
- Remove the IsSupportedArgument validation function, which is no longer
needed: the tensor view or descriptor is now created without
GuaranteedLastDimensionVectorStride, which lifts the bounds enforced
earlier. The vector size is still calculated and used.
- Update reduce example to demonstrate NCHW->NHW reduction with
non-contiguous support
- Update tests
The kernel now handles both contiguous and non-contiguous memory layouts.
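
The changes listed above can be illustrated with a small host-side sketch. It is not the composable_kernel implementation: `DimSeq`, `ReduceProblemSketch`, and `reduce_sum` are hypothetical names made up for this example. It only assumes that the kept and reduced dimensions are template parameters, as the commit describes, and that the input is addressed through explicit strides so that a non-contiguous NCHW tensor can be reduced to NHW.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a compile-time index list (not the CK type).
template <std::size_t... Is>
struct DimSeq {
    static constexpr std::size_t size = sizeof...(Is);
    static constexpr std::array<std::size_t, sizeof...(Is)> value{Is...};
};

// Hypothetical problem description: kept dims, reduced dims, and rank are all
// compile-time parameters, so an inconsistent split is rejected by the compiler.
template <typename KeptDim_, typename ReduceDims_, std::size_t Rank_>
struct ReduceProblemSketch {
    using KeptDim    = KeptDim_;
    using ReduceDims = ReduceDims_;
    static constexpr std::size_t Rank = Rank_;
    static_assert(KeptDim::size + ReduceDims::size == Rank,
                  "kept and reduced dimensions must cover the whole rank");
};

// Reference sum-reduction on the host. Offsets are built from the strides,
// so the input does not have to be densely packed.
template <typename Problem>
std::vector<float> reduce_sum(const std::vector<float>& in,
                              const std::array<std::size_t, Problem::Rank>& lengths,
                              const std::array<std::size_t, Problem::Rank>& strides)
{
    constexpr auto kept    = Problem::KeptDim::value;
    constexpr auto reduced = Problem::ReduceDims::value;

    std::size_t kept_count = 1;
    for (std::size_t d : kept) kept_count *= lengths[d];
    std::size_t reduce_count = 1;
    for (std::size_t d : reduced) reduce_count *= lengths[d];

    std::vector<float> out(kept_count, 0.0f);
    for (std::size_t k = 0; k < kept_count; ++k) {
        // Decompose the flat output index into kept-dimension indices
        // (last kept dimension varies fastest) and accumulate the base offset.
        std::size_t base = 0;
        std::size_t rem  = k;
        for (std::size_t i = kept.size(); i-- > 0;) {
            base += (rem % lengths[kept[i]]) * strides[kept[i]];
            rem /= lengths[kept[i]];
        }
        for (std::size_t r = 0; r < reduce_count; ++r) {
            // Same decomposition for the reduced dimensions.
            std::size_t offset = base;
            std::size_t rr     = r;
            for (std::size_t i = reduced.size(); i-- > 0;) {
                offset += (rr % lengths[reduced[i]]) * strides[reduced[i]];
                rr /= lengths[reduced[i]];
            }
            assert(offset < in.size()); // runtime sanity check on the strides
            out[k] += in[offset];
        }
    }
    return out;
}
```

For an NCHW->NHW reduction the problem type in this sketch would be `ReduceProblemSketch<DimSeq<0, 2, 3>, DimSeq<1>, 4>`. Calling `reduce_sum` with lengths `{2, 3, 2, 2}` and strides `{12, 4, 2, 1}` covers the contiguous case, while strides such as `{24, 8, 4, 1}` describe a layout with padded rows and exercise the non-contiguous path.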
* fix compile errors
[ROCm/composable_kernel commit: ea10a78203]
14 KiB