damien-lejeune
24d3cbc30d
[CK Tile] multi reduce improvements (#3607)
* WIP: refactoring
* Swap operation/data nested loops order
* Improve memory coalescing
* Add comments
* Enforce same identity element for the reduce operations
* Re-add compile time constant
* Comment + re-add __builtin_amdgcn_readfirstlane(0) to the loop init
---------
Co-authored-by: Damien Lejeune <damien.lejeune@amd.com>
[ROCm/composable_kernel commit: 91e32f305f]
2026-01-27 12:56:09 -08:00