Bartłomiej Kocot
700b2ec9c0
Update AMD buffer coherency (#3403)
...
* Update AMD buffer coherency [AICK-421]
* fixes
* fix
* fixes
* fixes
* Add backward compatibility
* fix
* fixes
* fix
* fix
* fix
* Update grouped_convolution_backward_weight_kernel.hpp
2025-12-18 10:16:22 +01:00
Aviral Goel
de6466481f
chore(copyright): update copyright header for include directory (#3293)
2025-11-26 11:00:05 -07:00
Michal Kulikowski
cd8af997e6
[CK] s_prefetch unit test fixes.
...
Signed-off-by: Michal Kulikowski <Michal.Kulikowski@amd.com>
2025-11-19 21:54:50 +01:00
Michal Kulikowski
f3ef7acca0
[CK] Added s_prefetch unit test.
...
-added s_buffer_load_b32/64 assembly
-added amd_s_buffer_load_impl
Signed-off-by: Michal Kulikowski <Michal.Kulikowski@amd.com>
2025-11-19 21:54:50 +01:00
Illia Silin
2d8a804152
Fix direct lds load for gfx950 and clang20 (#2346)
...
* fix direct lds load for gfx950 and clang20
* Update include/ck/utility/amd_buffer_addressing_builtins.hpp
* Fix format
---------
Co-authored-by: Aviral Goel <aviral.goel@amd.com>
Co-authored-by: Andriy Roshchenko <andriy.roshchenko@amd.com>
2025-06-15 15:22:34 -07:00
Andriy Roshchenko
00247e3c29
Optimized GEMMs for MX FP4/8 (#2294)
...
Adds V3 GEMM pipeline for MX FP4 and MX FP8
Adds V3 GEMM pipeline for MX FP4 with preshuffling
Adds MXFP4 GEMM tests (#2275)
Adds MXFP4 GEMM examples
Adds MXFP4 GEMMs to ckProfiler
Co-authored-by: Andriy Roshchenko <107577548+andriy-ca@users.noreply.github.com>
Co-authored-by: Andriy Roshchenko <andriy.roshchenko@amd.com>
Co-authored-by: aska-0096 <haocwang@amd.com>
Co-authored-by: lalala-sh <Jiaxing.Wen@amd.com>
Co-authored-by: OscarXu <huaiguxu@amd.com>
Co-authored-by: mtgu0705 <mtgu@amd.com>
Co-authored-by: Ding, Yi <yi.ding@amd.com>
Co-authored-by: feifei14119 <feiw@amd.com>
Co-authored-by: Lin, Qun <qlin@amd.com>
Co-authored-by: joye <joye@amd.com>
Co-authored-by: Rostyslav Geyyer <46627076+geyyer@users.noreply.github.com>
2025-06-05 13:54:15 -06:00
Illia Silin
8146e471f1
fix the buffer intrinsic names for clang >=20 (#2228)
2025-05-23 14:58:25 -07:00
Illia Silin
1b846143c6
Revert "Update the buffer load/store intrinsic names for clang>=20. (#2192)" (#2227)
...
This reverts commit 58f9e9ffbc.
2025-05-22 15:41:17 -07:00
Illia Silin
58f9e9ffbc
Update the buffer load/store intrinsic names for clang>=20. (#2192)
...
* fix the buffer load/store intrinsic names
* fix clang format
2025-05-13 10:18:14 -07:00
Illia Silin
a88bf76ecc
Replace buffer load/store intrinsics with builtins (#1876)
...
* replace buffer load/store intrinsics with builtins
* fix clang format
* replace buffer load/store intrinsics with built-ins in ck_tile
* fix clang format
* add switch between buffer intrinsics and built-ins
* change the builtins threshold to clang20
* fix clang format
* fix some compilation errors
* revert changes in ck_tile
* revert changes in ck_tile
* delete all root files and folders when CI completes
* try changing the username in CI
* fix groovy syntax
* add user and group id info to ci dockers
* change ownership of all files in CI to jenkins at the end
* update changelog
2025-03-05 14:33:28 -08:00