[CK_TILE] Pooling FWD (Lwpck 3683) (#2956)

* Pooling 2D/3D with reference

* Tests & cleanup

- added test for pooling
- cleanup
- removed 2d example

* Comment resolution

- README added
- example target name rectified
- appropriate arg description and comments added

* clang-format

* appropriate blocksize calc

* modifications for future indexing addition

- instead of transforming views we now transform the descriptors, so
that the same descriptor can be re-used for index tensor in the future

* some basic fixes

* comment resolutions

* comment resolutions

---------

Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>
Yashvardhan Agarwal
2025-10-09 17:13:26 +03:00
committed by GitHub
parent 9d4bfe3932
commit 7b6451b68e
14 changed files with 1317 additions and 0 deletions


@@ -0,0 +1,33 @@
// SPDX-License-Identifier: MIT
// Copyright (c) 2025, Advanced Micro Devices, Inc. All rights reserved.

#pragma once

#include "ck_tile/core.hpp"

namespace ck_tile {

template <typename InDataType_,
          typename OutDataType_,
          typename ComputeDataType_,
          typename IndexDataType_,
          typename ReduceOp_,
          bool OutputIndex_,
          bool PropagateNan_,
          typename BlockShape_>
struct PoolProblem
{
    using InDataType      = remove_cvref_t<InDataType_>;
    using OutDataType     = remove_cvref_t<OutDataType_>;
    using ComputeDataType = remove_cvref_t<ComputeDataType_>;
    using IndexDataType   = remove_cvref_t<IndexDataType_>;
    using BlockShape      = remove_cvref_t<BlockShape_>;
    using ReduceOp        = ReduceOp_;
    using OutputIndex     = bool_constant<OutputIndex_>;
    using PropagateNan    = bool_constant<PropagateNan_>;

    static constexpr bool kNeedCrossLaneSync = BlockShape::ThreadPerWarp_N > 1;
    static constexpr bool kNeedCrossWarpSync = BlockShape::WarpPerBlock_N > 1;
};

} // namespace ck_tile
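
A minimal sketch of how the PoolProblem trait above derives its two synchronization flags from the block shape. It is not part of the commit and does not use ck_tile: the `demo` namespace, `DemoBlockShape`, `DemoMax`, and both aliases are hypothetical stand-ins, and `std::remove_cv_t`, `std::remove_reference_t`, and `std::bool_constant` are assumed to behave like ck_tile's `remove_cvref_t` and `bool_constant` for this purpose.

```cpp
#include <type_traits>

namespace demo {

// Hypothetical block shape: exposes only the two members PoolProblem reads.
template <int ThreadPerWarpN, int WarpPerBlockN>
struct DemoBlockShape
{
    static constexpr int ThreadPerWarp_N = ThreadPerWarpN;
    static constexpr int WarpPerBlock_N  = WarpPerBlockN;
};

// Hypothetical reduce-op tag; a real operator would carry the reduction logic.
struct DemoMax
{
};

// Stand-in for ck_tile::remove_cvref_t (assumption: same semantics).
template <typename T>
using remove_cvref_t = std::remove_cv_t<std::remove_reference_t<T>>;

// Same layout as the PoolProblem in the diff, built on the std:: stand-ins.
template <typename InDataType_,
          typename OutDataType_,
          typename ComputeDataType_,
          typename IndexDataType_,
          typename ReduceOp_,
          bool OutputIndex_,
          bool PropagateNan_,
          typename BlockShape_>
struct PoolProblem
{
    using InDataType      = remove_cvref_t<InDataType_>;
    using OutDataType     = remove_cvref_t<OutDataType_>;
    using ComputeDataType = remove_cvref_t<ComputeDataType_>;
    using IndexDataType   = remove_cvref_t<IndexDataType_>;
    using BlockShape      = remove_cvref_t<BlockShape_>;
    using ReduceOp        = ReduceOp_;
    using OutputIndex     = std::bool_constant<OutputIndex_>;
    using PropagateNan    = std::bool_constant<PropagateNan_>;

    static constexpr bool kNeedCrossLaneSync = BlockShape::ThreadPerWarp_N > 1;
    static constexpr bool kNeedCrossWarpSync = BlockShape::WarpPerBlock_N > 1;
};

// One thread per warp and one warp per block along N: both flags are false.
using SerialProblem =
    PoolProblem<float, float, float, int, DemoMax, false, false, DemoBlockShape<1, 1>>;

// Eight threads per warp and two warps per block along N: both flags are true.
// The cv/ref qualifiers on the first argument are stripped by remove_cvref_t.
using SharedProblem =
    PoolProblem<const float&, float, float, int, DemoMax, true, false, DemoBlockShape<8, 2>>;

static_assert(!SerialProblem::kNeedCrossLaneSync, "single thread along N");
static_assert(!SerialProblem::kNeedCrossWarpSync, "single warp along N");
static_assert(SharedProblem::kNeedCrossLaneSync, "several threads along N");
static_assert(SharedProblem::kNeedCrossWarpSync, "several warps along N");
static_assert(std::is_same<SharedProblem::InDataType, float>::value, "qualifiers stripped");

} // namespace demo
```

Because both flags are `static constexpr`, a kernel could presumably branch on them with `if constexpr` and compile out whichever synchronization step a given block shape does not need. Whether the pooling kernel in this commit does so is not shown in this file.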