Ville Pietilä
48d22d2b9b
Remove the obsolete template parameters.
2025-10-03 14:36:48 +00:00
Ville Pietilä
c3f0c1a866
Add additional check for non-supported c > 1 case.
2025-09-30 07:46:24 +00:00
Ville Pietilä
db835e065c
Make MPerGroup and NPerGroup template parameters.
2025-09-30 07:14:28 +00:00
Ville Pietilä
1a6f602c65
Remove debug code.
2025-09-30 05:53:28 +00:00
Ville Pietilä
193907fd85
Fix case k > 1 and c=1.
2025-09-29 16:02:00 +00:00
Ville Pietilä
1764c77fb2
Enable running multiple GEMM batches of merged conv groups.
2025-09-26 07:51:29 +00:00
Ville Pietilä
0ea3268d5d
Remove debug and other dead code.
2025-09-25 09:41:33 +00:00
Ville Pietilä
cc7433efc6
Add more comments, disable debug code.
2025-09-25 09:37:15 +00:00
Ville Pietilä
97f842f2c6
Fully functional LDS to global mem transfer using tensor descriptor and tile distribution encoding.
2025-09-25 09:30:50 +00:00
Ville Pietilä
625a78b17b
WIP: LDS to global mem transfer using CK tile tensor descriptor and tile distribution encoding.
2025-09-24 15:08:01 +00:00
Ville Pietilä
8048d6ff73
Fix build.
2025-09-23 11:17:08 +00:00
Ville Pietilä
e6f6c4a6a3
Working baseline for depthwise convolution with merged conv groups.
2025-09-23 11:14:10 +00:00
Ville Pietilä
d7da3d5089
Offset fixes.
2025-09-22 15:37:46 +00:00
Ville Pietilä
7f52f84167
Fix tile window size for c block.
2025-09-19 08:08:19 +00:00
Ville Pietilä
6bcdb0947e
LDS to global memory copy.
2025-09-18 14:59:32 +00:00
Ville Pietilä
4ec81cb95c
Add more logging.
2025-09-17 12:27:51 +00:00
Ville Pietilä
6d318ab481
Enable running multiple conv groups per batch.
2025-09-12 14:03:04 +00:00
Ville Pietilä
0d5c1b9638
WIP: Merged conv groups epilogue.
2025-09-11 15:24:36 +00:00
Ville Pietilä
970b40aa6c
WIP: Merged conv groups offset calculation.
2025-09-09 11:33:31 +00:00
Ville Pietilä
61b3c96273
Add number of groups to merge to ck tile grouped gemm example.
2025-09-04 14:24:23 +00:00
Ville Pietilä
2b1908a375
Fix compilation of the grouped conv examples.
2025-09-04 12:01:49 +00:00
linqunAMD
4a49dac7c6
[Regression] Fix CK_TILE build error in grouped_convolution, copy_basic and fused_moegemm_kernel (#2728)
* fix copy basic build error
* fix other ck tile test build error
2025-08-28 20:30:30 +08:00
Bartłomiej Kocot
4212bbc170
[CK Tile] Grouped convolution backward data (#2652)
* base working version for single grouped conv bwd data
* Fix 2d descriptor
* fix groups
* Add 3d support
* fixes
* fixes
* fixes
---------
Co-authored-by: Jakub Piasecki <jakpia21@gmail.com>
2025-08-20 05:29:57 -07:00
linqunAMD
9fcc1ee9fd
Support Wave32 in CK_TILE - Part 1 (#2594)
* Support wave32/wave64 in CK_TILE - Part 1
* remove blocksize in kernel launch
* fix build error
* fix clang format
* fix clang format 2
* fix clang format 3
* fix fmha build error
* fix fmha build 2
* fix fmha build 3
* fix build error 4
* address review comment
* update change log
* replace KernelBlockSize with kBlockSize
* fix CI fail
* fix clang format
* address review comment and rebase code.
* fix universal test fail
---------
Co-authored-by: Lin, Qun <Quentin.Lin+amdeng@amd.com>
Co-authored-by: Thomas Ning <Thomas.Ning@amd.com>
2025-08-18 10:08:31 -07:00
Illia Silin
504b101da3
upgrade from clang-format-12 to clang-format-18 (#2568)
* upgrade to clang-format-18
* update to clang-format-18 in pre-commit-config
2025-07-28 11:34:07 -07:00
jakpiase
6681593864
[CK_TILE] Grouped Convolution Backward Weight Kernel (#2357)
* [CK TILE] Grouped Convolution Forward Kernel
* custom vector size
* fixes
* refactor
* resolved conflicts
* rebase fixes
* fixes
* tmp
* add working support for splitk
* minor fix
* fixes
* fixes
* minor fix
* small fix
* Split K and preprocessing fixes
---------
Co-authored-by: Bartlomiej Kocot <barkocot@amd.com>
2025-07-24 10:41:35 +02:00
Bartłomiej Kocot
cebdee4d9e
[CK TILE] Grouped Convolution Forward Kernel (#2188)
* [CK TILE] Grouped Convolution Forward Kernel
* custom vector size
* fixes
* refactor
* rebase fixes
* fixes
* fixes
2025-06-20 15:44:36 -07:00