mirror of
https://github.com/ROCm/composable_kernel.git
synced 2026-03-17 21:57:38 +00:00
* WIP: refactoring * Swap operation/data nested loops order * Improve memory coalescing * Add comments * Enforce same identity element for the reduce operations * Re-add compile time constant * Comment + re-add __builtin_amdgcn_readfirstlane(0) to the loop init --------- Co-authored-by: Damien Lejeune <damien.lejeune@amd.com>