mirror of
https://github.com/ROCm/composable_kernel.git
synced 2026-05-03 05:01:25 +00:00
# Instructions for ```example_pool2d_fwd``` Examples

## Run ```example_pool2d_fwd_fp16```
```bash
#arg1: verification (0=no, 1=yes)
#arg2: initialization (0=no init, 1=single integer value, 2=scope integer value, 3=decimal value)
#arg3: time kernel (0=no, 1=yes)
#arg4 to 15: N, C, Y, X, Hi, Wi, Sy, Sx, LeftPy, LeftPx, RightPy, RightPx
./bin/example_pool2d_fwd_fp16 1 1 1
```

Result
```
in_n_c_hi_wi: dim 4, lengths {128, 192, 71, 71}, strides {967872, 1, 13632, 192}
out_n_c_ho_wo: dim 4, lengths {128, 192, 36, 36}, strides {248832, 1, 6912, 192}
launch_and_time_kernel: grid_dim {124416, 1, 1}, block_dim {64, 1, 1}
Warm up 1 time
Start running 10 times...
Perf: 0.397436 ms, 1.44252 TFlops, 783.713 GB/s
```

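The printed shapes can be cross-checked with a few lines of arithmetic. The sketch below is a minimal illustration and not part of the example itself: `Pool2dArgs`, `out_lengths`, `cli`, and `nhwc_strides` are hypothetical helper names, and the 3x3 window, stride 2, and padding 1 are assumed values, chosen because they reproduce the reduction from 71 to 36; this README does not state the defaults. The stride lists are consistent with data stored in NHWC order and reported in N, C, H, W order.

```python
# Sketch only: the window, stride, and padding values used below are assumptions
# chosen to match the printed shapes; they are not stated in this README.
from dataclasses import astuple, dataclass


@dataclass
class Pool2dArgs:
    """Named view of positional arg4 to arg15 (field order matches the CLI)."""
    N: int
    C: int
    Y: int
    X: int
    Hi: int
    Wi: int
    Sy: int
    Sx: int
    LeftPy: int
    LeftPx: int
    RightPy: int
    RightPx: int

    def out_lengths(self):
        # Usual pooling output size: floor((in + pads - window) / stride) + 1
        ho = (self.Hi + self.LeftPy + self.RightPy - self.Y) // self.Sy + 1
        wo = (self.Wi + self.LeftPx + self.RightPx - self.X) // self.Sx + 1
        return ho, wo

    def cli(self, binary="./bin/example_pool2d_fwd_fp16", flags=(1, 1, 1)):
        # arg1 to arg3 come first, then the twelve shape arguments in order
        values = list(flags) + list(astuple(self))
        return " ".join([binary] + [str(v) for v in values])


def nhwc_strides(n, c, h, w):
    """Strides listed in (N, C, H, W) order for a tensor stored as NHWC."""
    return [h * w * c, 1, w * c, c]


# Assumed parameters (hypothetical, see the note at the top of this block)
args = Pool2dArgs(N=128, C=192, Y=3, X=3, Hi=71, Wi=71,
                  Sy=2, Sx=2, LeftPy=1, LeftPx=1, RightPy=1, RightPx=1)
ho, wo = args.out_lengths()
print(ho, wo)                             # output height and width
print(nhwc_strides(128, 192, 71, 71))     # compare with in_n_c_hi_wi strides
print(nhwc_strides(128, 192, ho, wo))     # compare with out_n_c_ho_wo strides
print(args.cli())                         # all fifteen arguments spelled out
```

If the assumed values match the defaults, the last printed line is an equivalent invocation with every argument given explicitly.
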
## Run ```example_pool2d_fwd_fp32```
```bash
#arg1: verification (0=no, 1=yes)
#arg2: initialization (0=no init, 1=single integer value, 2=scope integer value, 3=decimal value)
#arg3: time kernel (0=no, 1=yes)
#arg4 to 15: N, C, Y, X, Hi, Wi, Sy, Sx, LeftPy, LeftPx, RightPy, RightPx
./bin/example_pool2d_fwd_fp32 1 1 1
```

Result
```
in_n_c_hi_wi: dim 4, lengths {128, 192, 71, 71}, strides {967872, 1, 13632, 192}
out_n_c_ho_wo: dim 4, lengths {128, 192, 36, 36}, strides {248832, 1, 6912, 192}
launch_and_time_kernel: grid_dim {124416, 1, 1}, block_dim {64, 1, 1}
Warm up 1 time
Start running 10 times...
Perf: 1.01823 ms, 0.563045 TFlops, 611.8 GB/s
```
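
The `Perf` lines from both runs can be reproduced to within rounding from the tensor shapes and the measured time. This is a sketch under assumptions: it counts two operations per window element per output value, one read of the input plus one write of the output, and a 3x3 window. None of these rules is stated in this README; they are used here only because they match the printed numbers. `perf` is a hypothetical helper name.

```python
# Sketch only: Y = X = 3 and the operation/byte counting rules are assumptions
# that reproduce the printed figures; they are not documented in this README.
N, C, Hi, Wi, Ho, Wo = 128, 192, 71, 71, 36, 36   # from the printed lengths
Y = X = 3                                          # assumed window size


def perf(time_ms, bytes_per_elem):
    """Return (TFlops, GB/s) for one timed run."""
    flops = 2 * N * C * Ho * Wo * Y * X
    moved = bytes_per_elem * (N * C * Hi * Wi + N * C * Ho * Wo)
    seconds = time_ms * 1e-3
    return flops / seconds / 1e12, moved / seconds / 1e9


fp16 = perf(0.397436, 2)   # reported fp16 time, 2 bytes per element
fp32 = perf(1.01823, 4)    # reported fp32 time, 4 bytes per element
print(f"fp16: {fp16[0]:.5f} TFlops, {fp16[1]:.2f} GB/s")
print(f"fp32: {fp32[0]:.6f} TFlops, {fp32[1]:.1f} GB/s")
```

Under this counting both runs perform the same number of operations; the fp32 run moves twice as many bytes and, in the timings shown, takes about 2.6 times as long.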