mirror of https://github.com/ROCm/composable_kernel.git
synced 2026-04-20 06:49:15 +00:00
Adding remaining conv, dynamic_op, and scaleadd_scaleadd_relu flavors for grouped conv fwd (#3529)
* Adding remaining flavors for grouped conv fwd

  As titled. Following variants are added:
  - grouped_conv2d_fwd_dynamic_op
  - grouped_conv3d_fwd_dynamic_op
  - grouped_conv3d_fwd_bilinear
  - grouped_conv3d_fwd_convscale
  - grouped_conv3d_fwd_convinvscale
  - grouped_conv3d_fwd_convscale_add
  - grouped_conv3d_fwd_convscale_relu
  - grouped_conv3d_fwd_scale
  - grouped_conv3d_fwd_combconvscale
  - grouped_conv3d_fwd_scaleadd_scaleadd_relu

* Fix incomplete parsing of types from source names in the add_instance_library() CMakeLists function so we don't build f8 on RDNA3.

* Do not build f8 / bf8 only flavor tests on RDNA3

* Make sure we have proper generic instances for all instance lists related to the post-ces extra flavors, with scalarPerVector = 1. Then disable all but one generic instance per instance list to reduce compile time.

* Post rebase fix: Template parameters for Grouped Conv Fwd Device Impl got tweaked upstream.

* adding int8 and fp16 overloads to the elementwise operations

* fixed copilot nits

* Addressing review comments:
  - removed unnecessary examples for dynamic op
  - removed unnecessary conv specializations for all the flavors
  - removed spurious bilinear and scale source files

* clang-format

* reduced number of tests

---------

Co-authored-by: Wojciech Laskowski <wojciech.laskowski@streamhpc.com>
committed by GitHub
parent 6a6177a246
commit 2377a62837
@@ -293,6 +293,7 @@ struct ThreadwiseTensorSliceTransfer_v7r3
 // convolution forward. For some reason for that specific type there is an ambiguity
 // in the type resolution for the ternary expression. I added an explicit cast to
 // disambiguate and only use it for f8 just in case it affects performance.
 // TODO: Add same exception for ck::f8_fnuz_t?
 if constexpr(is_same_v<scalar_t, ck::f8_ocp_t>)
 {
     elm_vectors(i).template AsType<elm_vector_t>()(I0) =
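The comment in the hunk above describes a ternary expression whose result type is ambiguous for one specific scalar type, and an explicit cast applied only to that type. The assignment itself is cut off in this view, so the following is only a minimal sketch of one way such an ambiguity can arise in C++ and how a type-gated cast sidesteps it. `fake_f8` and `select_or_fallback` are hypothetical names made up for illustration. They are not part of composable_kernel, and the real cause of the ambiguity for `ck::f8_ocp_t` may differ.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for an 8-bit float type; NOT ck::f8_ocp_t.
// It converts implicitly both to and from float, which is what makes
// a mixed-type ternary ambiguous in this sketch.
struct fake_f8
{
    float value;
    fake_f8(float v) : value(v) {}           // implicit float -> fake_f8
    operator float() const { return value; } // implicit fake_f8 -> float
};

// Returns `x` when `keep` is true, otherwise `fallback` converted to T.
template <typename T>
T select_or_fallback(bool keep, T x, float fallback)
{
    if constexpr(std::is_same_v<T, fake_f8>)
    {
        // `keep ? x : fallback` would be ill-formed here: each operand can be
        // implicitly converted to the other's type, so the compiler cannot
        // pick a result type. Casting one operand makes both types agree.
        return keep ? x : static_cast<T>(fallback);
    }
    else
    {
        // For other types the plain ternary is kept, so their code path
        // is unchanged by the workaround.
        return keep ? x : fallback;
    }
}
```

Calling `select_or_fallback(false, fake_f8{1.5f}, 2.0f)` takes the cast branch, while `select_or_fallback(false, 1.5f, 2.0f)` takes the plain one. Gating the cast with `if constexpr` mirrors the intent stated in the comment: only the affected type sees the extra conversion.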