Aviral Goel
bb41ea37e1
chore(copyright) update library wide CMakeLists.txt copyright header template (#3313)
...
* chore(copyright) update library wide CMakeLists.txt files copyright header template
* Fix build
---------
Co-authored-by: Sami Remes <samremes@amd.com>
[ROCm/composable_kernel commit: 004784ef98]
2025-11-28 13:49:54 -08:00
Aviral Goel
d171245c4b
chore(copyright): update copyright header for test directory (#3265)
...
[ROCm/composable_kernel commit: f6c999bddb]
2025-11-22 19:38:27 -05:00
linqunAMD
20333fd850
[CK] Add command options instance_index and param_mask to run partial CK tests (#2889)
...
* [CK] Add command options instance_index and param_mask to run partial CK tests
Many CK tests are instance tests: they loop over every instance in the instance library. This often causes timeouts when the tests run on a simulator/emulator.
This PR adds the options instance_index and param_mask to reduce the workload of instance tests.
instance_index: run the test on only the one available instance with the specified index.
param_mask: filter the embedded parameters with the specified mask.
* fix CI error
* fix clang format
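The filtering described for instance_index and param_mask can be sketched as follows. This is a minimal illustration, not the CK implementation: the function names are hypothetical, and the commit message does not specify the mask semantics, so the sketch assumes that bit i of param_mask enables the i-th embedded parameter set and that a negative instance_index means "run every instance".

```python
def select_instances(instances, instance_index=-1):
    """Keep every instance, or only the one at instance_index (assumed: < 0 means all)."""
    if instance_index < 0:
        return list(instances)
    return [inst for i, inst in enumerate(instances) if i == instance_index]


def select_params(params, param_mask=None):
    """Keep parameter set i only if bit i of param_mask is set (assumed: None means all)."""
    if param_mask is None:
        return list(params)
    return [p for i, p in enumerate(params) if (param_mask >> i) & 1]


def plan_runs(instances, params, instance_index=-1, param_mask=None):
    """List the (instance, parameter set) pairs a reduced test run would execute."""
    chosen_instances = select_instances(instances, instance_index)
    return [(inst, p)
            for p in select_params(params, param_mask)
            for inst in chosen_instances]


if __name__ == "__main__":
    # Hypothetical instance and parameter names, for illustration only.
    runs = plan_runs(["inst0", "inst1", "inst2"], ["p0", "p1", "p2"],
                     instance_index=1, param_mask=0b101)
    print(runs)
```

With three instances and three parameter sets, the full run covers nine pairs, while the reduced run above covers two, which is the kind of workload reduction the PR targets for simulator/emulator runs.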
---------
Co-authored-by: illsilin_amdeng <Illia.Silin@amd.com>
[ROCm/composable_kernel commit: e78a897ec0]
2025-09-30 08:24:40 -07:00
linqunAMD
a9e6cb0ec0
Extend XDL kernel to Support RDNA3/4 - Part 5 (#2725)
...
* Enable xdl in gfx11 & gfx12
* update cmake file
* fix all instance build (cmake)
* fix batched_gemm_gemm(cmake)
* rebase cmake files
* fix cmake build error
* remove CK_ENABLE_DYNAMIC_WARP_SIZE
* update cmake build error2
* fix gfx11 build
CK_USE_XDL is enabled on gfx11 and gfx12
* fix gfx10 build
* fix gfx11 error
---------
Co-authored-by: Lin, Qun <Quentin.Lin+amdeng@amd.com>
[ROCm/composable_kernel commit: f22740df82]
2025-09-15 10:59:25 -07:00
Bartłomiej Kocot
b12eb7db4f
Grouped Convolution Forward Infer Bias Bnorm Activ (#2621)
...
* Grouped Convolution Forward Infer Bias Bnorm Activ
* 3d
[ROCm/composable_kernel commit: 5328b232b2]
2025-08-07 08:36:47 +02:00
Bartłomiej Kocot
f25da17c36
Enable multiple D for grouped conv fwd large tensors (#2572)
...
[ROCm/composable_kernel commit: 5b244105d9]
2025-07-28 22:39:07 +02:00
Bartłomiej Kocot
b7f7728e82
Grouped conv bias clamp fp32/fp16 support (#2366)
...
[ROCm/composable_kernel commit: 663992e99b]
2025-06-20 11:41:04 +02:00
Bartłomiej Kocot
b5b0797513
Grouped convolution forward with clamp (#2334)
...
* Grouped convolution forward with clamp
* Optimize clamp
* unary fixes
* test gk bias
* Revert "test gk bias"
This reverts commit 8e42e29d7b.
* Revert "Revert "test gk bias""
This reverts commit e73c0550ce.
* workaround comment
[ROCm/composable_kernel commit: f6c2ff9dce]
2025-06-16 15:36:53 +02:00