diff --git a/example/ck_tile/36_pooling/README.md b/example/ck_tile/36_pooling/README.md
index ab49b57095..4417e03734 100644
--- a/example/ck_tile/36_pooling/README.md
+++ b/example/ck_tile/36_pooling/README.md
@@ -2,6 +2,116 @@
This folder contains example for the pooling operator using ck_tile tile-programming implementation. Currently the pooling kernel only supports 2D and 3D pooling.
+## Tensor Descriptor Transformations
+
+The pooling kernel uses tensor descriptor transformations to view the input tensor as a 2D matrix suitable for reduction. This section explains the transformation pipeline for both 2D and 3D pooling operations.
+
+### 3D Pooling Transformations
+
+For 3D pooling, the input tensor has shape `(N, D, H, W, C)` where:
+- `N`: batch size
+- `D`: depth dimension
+- `H`: height dimension
+- `W`: width dimension
+- `C`: channel dimension
+
+The transformations convert this 5D tensor into a 2D tensor where rows represent output positions (M) and columns represent pooling window elements (K).
+
+```mermaid
+graph TD
+ %% Input Tensor: (N, D, H, W, C)
+ Input["Input Tensor<br/>(N, D, H, W, C)"]
+ style Input fill:#e1f5fe
+
+ %% Pass-through N dimension
+ PassN["Pass-through N<br/>(batch size)"]
+ style PassN fill:#f3e5f5
+ Input --> PassN
+
+ %% Pad spatial dimensions
+ PadD["Pad D<br/>(depth with left/right padding)"]
+ style PadD fill:#fff9c4
+ Input --> PadD
+
+ PadH["Pad H<br/>(height with left/right padding)"]
+ style PadH fill:#fff9c4
+ Input --> PadH
+
+ PadW["Pad W<br/>(width with left/right padding)"]
+ style PadW fill:#fff9c4
+ Input --> PadW
+
+ %% Pass-through C dimension
+ PassC["Pass-through C<br/>(channels)"]
+ style PassC fill:#f3e5f5
+ Input --> PassC
+
+ %% Embed sliding windows
+ EmbedD["Embed D<br/>window(Z) × output_positions(Dₒ)"]
+ style EmbedD fill:#fff3e0
+ PadD --> EmbedD
+
+ EmbedH["Embed H<br/>window(Y) × output_positions(Hₒ)"]
+ style EmbedH fill:#fff3e0
+ PadH --> EmbedH
+
+ EmbedW["Embed W<br/>window(X) × output_positions(Wₒ)"]
+ style EmbedW fill:#fff3e0
+ PadW --> EmbedW
+
+ %% Merge into 2D matrix
+ MergeM["Merge M<br/>(N, Dₒ, Hₒ, Wₒ, C)<br/>→ output positions"]
+ style MergeM fill:#e8f5e9
+ PassN --> MergeM
+ EmbedD --> MergeM
+ EmbedH --> MergeM
+ EmbedW --> MergeM
+ PassC --> MergeM
+
+ MergeK["Merge K<br/>(Z, Y, X)<br/>→ window elements"]
+ style MergeK fill:#e8f5e9
+ EmbedD --> MergeK
+ EmbedH --> MergeK
+ EmbedW --> MergeK
+
+ %% Final padding for block alignment
+ PadM["Right-pad M<br/>(for block alignment)"]
+ style PadM fill:#fff9c4
+ MergeM --> PadM
+
+ PadK["Right-pad K<br/>(for block alignment)"]
+ style PadK fill:#fff9c4
+ MergeK --> PadK
+
+ %% Result
+ Result["2D Matrix<br/>(M × K)"]
+ style Result fill:#c8e6c9
+ PadM --> Result
+ PadK --> Result
+```
+
+**Transformation Steps:**
+1. **Padding**: Apply left and right padding to spatial dimensions (D, H, W) to handle boundary conditions
+2. **Sliding Windows**: Use embed transforms to create sliding windows across each spatial dimension, expanding each dimension into (window_size, output_positions)
+3. **Reshaping**: Merge all dimensions into a 2D matrix where:
+ - M dimension = N × Dₒ × Hₒ × Wₒ × C (total output positions)
+ - K dimension = Z × Y × X (elements per pooling window)
+4. **Block Alignment**: Apply right padding so that the M and K dimensions are aligned to the block size
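+
+As a sizing illustration for the steps above, the block below assumes a window dilation of 1 and the standard pooling output-size formula. The stride `s`, the paddings `p_l` and `p_r`, and all concrete numbers are assumptions made for this example rather than values taken from the kernel:
+
+```math
+\begin{aligned}
+D_o &= \left\lfloor \frac{D + p_l + p_r - Z}{s} \right\rfloor + 1 \quad \text{(and likewise for } H_o,\ W_o\text{)} \\
+&N = 2,\quad D = H = W = 8,\quad C = 32,\quad Z = Y = X = 2,\quad s = 2,\quad p_l = p_r = 0 \\
+D_o = H_o = W_o &= \left\lfloor \frac{8 + 0 + 0 - 2}{2} \right\rfloor + 1 = 4 \\
+M &= N \cdot D_o \cdot H_o \cdot W_o \cdot C = 2 \cdot 4 \cdot 4 \cdot 4 \cdot 32 = 4096 \\
+K &= Z \cdot Y \cdot X = 2 \cdot 2 \cdot 2 = 8
+\end{aligned}
+```
+
+With a hypothetical block tile of 128 × 16, M = 4096 is already a multiple of 128, while K = 8 would be right-padded to 16.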
+
+### 2D Pooling Transformations
+
+2D pooling follows the same transformation pipeline but operates on 4D tensors with shape `(N, H, W, C)`. The process is identical except:
+- Only H and W dimensions are padded and embedded
+- K dimension merges only (Y, X) window elements
+- M dimension merges (N, Hₒ, Wₒ, C)
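+
+The (M, K) view used for 2D pooling can be sketched on the host in plain C++. This is a minimal reference sketch, not the ck_tile kernel: `Pool2dParams`, `out_len` and `max_pool_2d_mk` are hypothetical names, a dilation of 1 and a single stride and padding value for both H and W are assumed, padded elements are simply skipped (equivalent to padding with negative infinity for max pooling), and C is assumed to vary fastest within M.
+
+```cpp
+#include <cassert>
+#include <cstddef>
+#include <limits>
+#include <vector>
+
+// Hypothetical parameter bundle for this sketch; not a ck_tile type.
+struct Pool2dParams {
+    int N, H, W, C;  // input shape (N, H, W, C)
+    int Y, X;        // pooling window size
+    int stride;      // assumed identical for H and W
+    int pad;         // assumed identical left/right padding for H and W
+};
+
+// Output extent along one spatial dimension (dilation assumed to be 1).
+inline int out_len(int in_len, int window, int stride, int pad) {
+    return (in_len + 2 * pad - window) / stride + 1;
+}
+
+// Max pooling written as a row-wise reduction over an implicit M x K view.
+std::vector<float> max_pool_2d_mk(const std::vector<float>& in, const Pool2dParams& p) {
+    assert(in.size() == static_cast<std::size_t>(p.N) * p.H * p.W * p.C);
+    const int Ho = out_len(p.H, p.Y, p.stride, p.pad);
+    const int Wo = out_len(p.W, p.X, p.stride, p.pad);
+    const int M = p.N * Ho * Wo * p.C;  // output positions
+    const int K = p.Y * p.X;            // window elements
+    std::vector<float> out(M, -std::numeric_limits<float>::infinity());
+    for (int m = 0; m < M; ++m) {
+        // Unmerge m -> (n, ho, wo, c), with c varying fastest.
+        const int c  = m % p.C;
+        const int wo = (m / p.C) % Wo;
+        const int ho = (m / (p.C * Wo)) % Ho;
+        const int n  = m / (p.C * Wo * Ho);
+        for (int k = 0; k < K; ++k) {
+            // Unmerge k -> (y, x), with x varying fastest.
+            const int y = k / p.X;
+            const int x = k % p.X;
+            const int h = ho * p.stride + y - p.pad;
+            const int w = wo * p.stride + x - p.pad;
+            if (h < 0 || h >= p.H || w < 0 || w >= p.W) continue;  // padded element
+            const float v = in[((n * p.H + h) * p.W + w) * p.C + c];
+            if (v > out[m]) out[m] = v;
+        }
+    }
+    return out;
+}
+```
+
+For a 1 × 4 × 4 × 1 input holding the values 0 to 15 in row-major order, with a 2 × 2 window, stride 2 and no padding, this gives M = 4 and K = 4, and the four row maxima are 5, 7, 13 and 15.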
+
+### Output Tensor Transformations
+
+The output tensor transformations are simpler:
+- Merge all output dimensions, (N, Dₒ, Hₒ, Wₒ, C) for 3D or (N, Hₒ, Wₒ, C) for 2D, into a single M dimension
+- Apply right padding for block alignment
+- The result is a 1D tensor that maps directly to the M dimension of the computation matrix
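+
+The merged index for the 3D case can be written out explicitly. The formula below assumes the merge is row-major in the listed order, so that C varies fastest; that ordering and the sample coordinates are assumptions made for illustration:
+
+```math
+\begin{aligned}
+m &= \bigl(\bigl((n \cdot D_o + d_o) \cdot H_o + h_o\bigr) \cdot W_o + w_o\bigr) \cdot C + c \\
+&D_o = H_o = W_o = 4,\quad C = 32,\quad (n, d_o, h_o, w_o, c) = (1, 2, 3, 0, 5) \\
+m &= \bigl(\bigl((1 \cdot 4 + 2) \cdot 4 + 3\bigr) \cdot 4 + 0\bigr) \cdot 32 + 5 = 108 \cdot 32 + 5 = 3461
+\end{aligned}
+```
+
+For 2D pooling the same expression applies with the `d_o` term dropped.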
+
## build
```
# in the root of ck_tile