Aviral Goel
d85f065b15
chore(copyright): update copyright header for example directory ( #3273 )
...
* chore(copyright): update copyright header for codegen directory
* chore(copyright): update copyright header for example directory
2025-11-24 18:02:41 -08:00
linqunAMD
9fcc1ee9fd
Support Wave32 in CK_TILE - Part 1 ( #2594 )
...
* Support wave32/wave64 in CK_TILE - Part 1
* remove blocksize in kernel launch
* fix build error
* fix clang format
* fix clang format 2
* fix clang format 3
* fix fmha build error
* fix fmha build 2
* fix fmha build 3
* fix build error 4
* address review comment
* update change log
* replace KernelBlockSize with kBlockSize
* fix CI fail
* fix clang format
* address review comment and rebase code.
* fix universal test fail
---------
Co-authored-by: Lin, Qun <Quentin.Lin+amdeng@amd.com>
Co-authored-by: Thomas Ning <Thomas.Ning@amd.com>
2025-08-18 10:08:31 -07:00
Bartłomiej Kocot
5741edf761
Fix clang format ( #2567 )
...
* clean
* clang format fix
2025-07-25 09:54:34 -07:00
rahjain-amd
78082855d8
Fixing 0ms and inf GB/s issue in img2col ( #2565 )
...
issue :
====
``` sh
$ bin/tile_example_img2col
Perf: 0 ms, inf GB/s
```
solution :
======
The problem occurred because config.time_kernel is false by default.
When it is false, there is no need to calculate perf; just print a proper message:
`image_to_coloumn: pass, No Perf generated due to config.time_kernel=0`
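A minimal sketch of the guard described above, not the actual patch: `format_perf_report` and its parameters are hypothetical names introduced for illustration, and only `config.time_kernel` comes from the commit message. When timing is disabled the average time is 0, so dividing by it is what yields `inf GB/s`; skipping the calculation avoids that.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper (not part of CK_TILE): builds the perf line for the example.
// time_kernel stands in for config.time_kernel; ave_time_ms is the measured
// average kernel time, which stays 0 when timing is disabled.
std::string format_perf_report(bool time_kernel, float ave_time_ms, std::size_t num_bytes)
{
    std::ostringstream os;
    if(!time_kernel)
    {
        // No timing was collected, so bandwidth would be num_bytes / 0 = inf.
        os << "No Perf generated due to config.time_kernel=0";
        return os.str();
    }
    // bytes / 1e6 / milliseconds == gigabytes per second
    const float gb_per_sec = static_cast<float>(num_bytes) / 1.0e6f / ave_time_ms;
    os << "Perf: " << ave_time_ms << " ms, " << gb_per_sec << " GB/s";
    return os.str();
}
```

With timing enabled, for example `format_perf_report(true, 2.f, 4000000)`, the helper reports a finite bandwidth; with timing disabled it returns only the explanatory message.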
2025-07-25 21:15:50 +05:30
Bartłomiej Kocot
de3e3b6424
[CK_TILE] Image to Column kernel ( #1532 )
...
* [CK_TILE] Image to Column kernel
* Fixes
* Vector loads and stores
* Fixes
* Fixes
* change test dir name
2024-09-27 22:57:38 +02:00