10 Commits

Author SHA1 Message Date
linqunAMD
ba922fdf80 Extend XDL kernel to Support RDNA3/4 - Part 4 (#2724)
* Fix example

* fix build error

* update pk_i4 & moe test case

* fix all instance build (examples)

* fix batched_gemm_gemm (example)

* disable example_gemm_bias_softmax_gemm_permute on gfx11

* remove unnecessary disable gfx11

* update tests

* update tests2

[ROCm/composable_kernel commit: 321627aec5]
2025-09-12 08:17:07 -07:00
Illia Silin
f559597c46 Split the instances by architecture. (#1223)
* parse examples inside the add_example_executable function

* fix the example 64 cmake file

* add xdl flag to the gemm_bias_softmax_gemm_permute example

* add filtering of tests based on architecture type

* enable test_grouped_gemm for gfx9 only

* enable test_transpose only for gfx9

* only link test_transpose if it gets built

* split the gemm instances by architectures

* split gemm_bilinear,grouped_conv_bwd_weight instances by targets

* split instances by architecture

* split grouped_conv instances by architecture

* fix clang format

* fix the if-else logic in group_conv headers

* small fix for grouped convolution instances

* fix the grouped conv bwd weight dl instances

* fix client examples

* only enable client examples 3 and 4 on gfx9

* set the gfx9 macro

* make sure the architecture macros are set by cmake

* use separate set of xdl/wmma flags for host code

* simplify the main cmake file

* add conv_fwd_bf8 instance declaration

[ROCm/composable_kernel commit: ae57e5938e]
2024-04-02 09:42:17 -07:00
Bartłomiej Kocot
fa0b543b5e Add optimized blockwise gemm using ck wrapper (#1157)
* Add optimized blockwise gemm using ck wrapper

* Add basic gemm example

* Update docs

* Add tutorial for gemm using ck wrapper

* Add perf note

* edits

* Fix cmake

* Fixes

---------

Co-authored-by: Lisa Delaney <lisa.delaney@amd.com>

[ROCm/composable_kernel commit: 1e73adbc28]
2024-02-13 17:04:36 +01:00
Bartłomiej Kocot
459e8e2596 Extend gemm traits number for ck wrapper (#1153)
[ROCm/composable_kernel commit: 171ca260b5]
2024-02-02 11:25:54 -08:00
Bartłomiej Kocot
91e5ff9ce7 Add blockwise gemm to ck wrapper (#1139)
* Add blockwise gemm to ck wrapper

* Add blockwise gemm traits

* Disable test_gemm for non xdl devices

* Fixes

* Add c layout descriptions

[ROCm/composable_kernel commit: f3b6c23ac5]
2024-01-31 21:24:40 +01:00
Bartłomiej Kocot
660bfadafd Add optimized copy to ck wrapper (#1126)
* Add optimized copy to ck wrapper

* Example optimizations

* Fixes

* Move img2col test to client example

* Refactor example

* Fix docs

* Fixes

* Fix

* Fixes

* Fixes

* Fixes

* Fixes

* Fixes

---------

Co-authored-by: zjing14 <zhangjing14@gmail.com>

[ROCm/composable_kernel commit: 7e4eb4b800]
2024-01-19 11:29:00 +01:00
Bartłomiej Kocot
5a2a9efca7 Add tensor partition and generic copy for ck wrapper (#1108)
* Add tensor partition and generic copy for ck wrapper

* Update changelog

* Stylistic fixes

* Change shape/strides logic to descriptor transforms

* Fixes

* Fix client example

* Fix comments

[ROCm/composable_kernel commit: 4234b3a691]
2024-01-03 01:10:57 +01:00
Bartłomiej Kocot
6e776f21d9 Fix results verify in test_tensor (#1109)
[ROCm/composable_kernel commit: 20b1ae7ced]
2023-12-23 22:12:49 +01:00
Bartłomiej Kocot
29122919de Add tensor structure to wrapper (#1098)
* Add tensor structure to wrapper

* update changelog

* Fix names

* Comment fixes

[ROCm/composable_kernel commit: 07092d68f0]
2023-12-15 12:45:08 +01:00
Bartłomiej Kocot
6e7ca15cfc Introduce wrapper library (#1071)
* Introduce wrapper library

* Update cmake files

* Revert "Update cmake files"

This reverts commit c27f88b565.

* Fix comments

[ROCm/composable_kernel commit: 836b7e557d]
2023-12-06 11:58:59 +01:00