* Add split-K autodeduction for DeviceGroupedConvBwdWeight_Xdl_CShuffle and DeviceGroupedConvBwdWeight_Xdl_CShuffleV3.
* Add split-K autodeduction for DeviceGroupedConvBwdWeightTwoStage_Xdl_CShuffle.
* Use a simple best-occupancy model to calculate the split-K value.
* Handle split-K autodeduction in the explicit GEMM convolution.
* Add unit tests for split-K autodeduction.
* Remove oversubscription.
* Small fixes.
* Add split-K autodeduction for DeviceGroupedConvBwdWeightMultipleD_Xdl_CShuffle.
* Run clang formatting.
* Fix error handling in the conv profiler.
* Add missing documentation for the auto-deduced split-K values.
* Add split-K autodeduction to DeviceGroupedConvBwdWeight_Explicit_Xdl solver.
* Fix clang formatting and split-K profiler documentation.
* Rename max_occupancy value variable.
* Calculate grid size for split-K autodeduction directly from input array shapes and template params.
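
For readers unfamiliar with the idea, an occupancy-based split-K choice can be sketched as below. This is only an illustration under stated assumptions, not the code in this change: `deduce_split_k` and all of its parameters are hypothetical names, and the rule shown (pick the largest split that fills the device without oversubscribing it) is one plausible reading of the bullets above.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper, NOT the library's implementation.
//   base_grid_size: workgroups launched when split-K == 1 (assumed to be
//                   derived from the input shapes and tile sizes)
//   num_cus:        compute units available on the device
//   max_occupancy:  workgroups of this kernel that can be resident per CU
//   max_k_batch:    upper bound on the split-K value
inline std::int64_t deduce_split_k(std::int64_t base_grid_size,
                                   std::int64_t num_cus,
                                   std::int64_t max_occupancy,
                                   std::int64_t max_k_batch)
{
    // Fall back to "no split" on nonsensical inputs.
    if(base_grid_size <= 0 || num_cus <= 0 || max_occupancy <= 0 || max_k_batch <= 0)
        return 1;

    // Total number of workgroups the device can keep resident at once.
    const std::int64_t capacity = num_cus * max_occupancy;

    // Floor division gives the largest k with base_grid_size * k <= capacity,
    // so the launch never oversubscribes the device.
    const std::int64_t k = capacity / base_grid_size;

    // Always use at least one K batch and never exceed the caller's limit.
    return std::clamp<std::int64_t>(k, 1, max_k_batch);
}
```

With these assumptions, a base grid that already exceeds the device capacity yields a split-K of 1, while a very small grid is capped at `max_k_batch`.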
---------
Co-authored-by: Ville Pietilä <>