Commit Graph

8 Commits

Author SHA1 Message Date
Aviral Goel
bb41ea37e1 chore(copyright) update library wide CMakeLists.txt copyright header template (#3313)
* chore(copyright) update library wide CMakeLists.txt files copyright header template

* Fix build

---------

Co-authored-by: Sami Remes <samremes@amd.com>

[ROCm/composable_kernel commit: 004784ef98]
2025-11-28 13:49:54 -08:00
Aviral Goel
d171245c4b chore(copyright): update copyright header for test directory (#3265)
[ROCm/composable_kernel commit: f6c999bddb]
2025-11-22 19:38:27 -05:00
linqunAMD
20333fd850 [CK] Add command option instance_index and param_mask to run partial ck test (#2889)
* [CK] Add command option instance_index and param_mask to run partial ck test

Many CK tests are instance tests: each one loops over every instance in the instance library. This often causes tests to time out when they run on a simulator/emulator.
This PR adds the options instance_index and param_mask to reduce the workload of instance tests.

instance_index: run the test only on the single available instance with the specified index.
param_mask: filter the embedded parameters with the specified mask.

* fix CI error

* fix clang format

---------

Co-authored-by: illsilin_amdeng <Illia.Silin@amd.com>

[ROCm/composable_kernel commit: e78a897ec0]
2025-09-30 08:24:40 -07:00
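
Note on 20333fd850: the instance_index and param_mask options described in that commit message can be pictured with the minimal sketch below. It is not CK's actual implementation. The helper names (should_run_instance, select_params, count_runs), the convention that a negative instance_index means "run every instance", and the reading of param_mask as a bitmask in which bit i selects parameter set i are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of CK: decide whether the instance at
// position `index` should run. Assumption: a negative `instance_index`
// means "no filter, run every instance".
bool should_run_instance(std::size_t index, int instance_index)
{
    return instance_index < 0 || index == static_cast<std::size_t>(instance_index);
}

// Hypothetical helper, not part of CK: return the indices of the parameter
// sets kept by `param_mask`. Assumption: bit i of the mask selects parameter
// set i; parameter sets beyond bit 63 are never selected.
std::vector<std::size_t> select_params(std::size_t num_params, std::uint64_t param_mask)
{
    std::vector<std::size_t> selected;
    for(std::size_t i = 0; i < num_params && i < 64; ++i)
    {
        if((param_mask >> i) & 1u)
            selected.push_back(i);
    }
    return selected;
}

// Count how many (instance, parameter set) combinations would execute
// under the two filters.
std::size_t count_runs(std::size_t num_instances,
                       std::size_t num_params,
                       int instance_index,
                       std::uint64_t param_mask)
{
    const std::size_t params_per_instance = select_params(num_params, param_mask).size();
    std::size_t runs = 0;
    for(std::size_t i = 0; i < num_instances; ++i)
    {
        if(should_run_instance(i, instance_index))
            runs += params_per_instance;
    }
    return runs;
}
```

Under these assumptions, a test with 10 instances and 4 parameter sets runs 40 combinations unfiltered, but only 2 when called as count_runs(10, 4, 3, 0x5). That reduction is the kind of workload saving the commit targets for simulator/emulator runs.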
linqunAMD
a9e6cb0ec0 Extend XDL kernel to Support RDNA3/4 - Part 5 (#2725)
* Enable xdl in gfx11 & gfx12

* update cmake file

* fix all instance build (cmake)

* fix batched_gemm_gemm(cmake)

* rebase cmake files

* fix cmake build error

* remove CK_ENABLE_DYNAMIC_WARP_SIZE

* update cmake build error2

* fix gfx11 build

CK_USE_XDL is enabled on gfx11 and gfx12

* fix gfx10 build

* fix gfx11 error

---------

Co-authored-by: Lin, Qun <Quentin.Lin+amdeng@amd.com>

[ROCm/composable_kernel commit: f22740df82]
2025-09-15 10:59:25 -07:00
Bartłomiej Kocot
b12eb7db4f Grouped Convolution Forward Infer Bias Bnorm Activ (#2621)
* Grouped Convolution Forward Infer Bias Bnorm Activ

* 3d

[ROCm/composable_kernel commit: 5328b232b2]
2025-08-07 08:36:47 +02:00
Bartłomiej Kocot
f25da17c36 Enable multiple D for grouped conv fwd large tensors (#2572)
[ROCm/composable_kernel commit: 5b244105d9]
2025-07-28 22:39:07 +02:00
Bartłomiej Kocot
b7f7728e82 Grouped conv bias clamp fp32/fp16 support (#2366)
[ROCm/composable_kernel commit: 663992e99b]
2025-06-20 11:41:04 +02:00
Bartłomiej Kocot
b5b0797513 Grouped convolution forward with clamp (#2334)
* Grouped convolution forward with clamp

* Optimize clamp

* unary fixes

* test gk bias

* Revert "test gk bias"

This reverts commit 8e42e29d7b.

* Revert "Revert "test gk bias""

This reverts commit e73c0550ce.

* workaround comment

[ROCm/composable_kernel commit: f6c2ff9dce]
2025-06-16 15:36:53 +02:00