8 Commits

Author SHA1 Message Date
Illia Silin
1f4d13b2b5 Split the instances by architecture. (#1223)
* parse examples inside the add_example_executable function

* fix the example 64 cmake file

* add xdl flag to the gemm_bias_softmax_gemm_permute example

* add filtering of tests based on architecture type

* enable test_grouped_gemm for gfx9 only

* enable test_transpose only for gfx9

* only link test_transpose if it gets built

* split the gemm instances by architectures

* split gemm_bilinear,grouped_conv_bwd_weight instances by targets

* split instances by architecture

* split grouped_conv instances by architecture

* fix clang format

* fix the if-else logic in group_conv headers

* small fix for grouped convolution instances

* fix the grouped conv bwd weight dl instances

* fix client examples

* only enable client examples 3 and 4 on gfx9

* set the gfx9 macro

* make sure the architecture macros are set by cmake

* use separate set of xdl/wmma flags for host code

* simplify the main cmake file

* add conv_fwd_bf8 instance declaration

[ROCm/composable_kernel commit: ae57e5938e]
2024-04-02 09:42:17 -07:00
amoskvic
4256edcdd2 Style improvement: improving type alias usage consistency in gemm-related client examples. Also copyright year update for all client examples. (#1180)
Co-authored-by: Arseny Moskvichev <amoskvic@amd.com>

[ROCm/composable_kernel commit: a776978cbe]
2024-02-28 16:39:03 -08:00
Illia Silin
6a5e94f475 Split the static library into several files. (#1044)
* split the static library into several

* update lib paths and fix client example

* do not use device_mha_operations for client examples

* use appropriate libs to link to client examples

* remove the gpu/transpose path from the list

* try fixing client examples 3,4,9

* add necessary libs for client examples

* fix the layernorm client example

* fix the client examples 23 and 24

* fix typo

* add interface library and refresh clang format

[ROCm/composable_kernel commit: 7965d66a81]
2023-11-28 11:17:37 -08:00
Illia Silin
b57fbee2f1 update copyright headers (#726)
[ROCm/composable_kernel commit: b94fd0b227]
2023-05-31 18:46:57 -05:00
zjing14
696991c923 add fp64 instances (#658)
Co-authored-by: root <root@ctr-ubbsmc15.amd.com>

[ROCm/composable_kernel commit: fde6d2742b]
2023-03-30 13:30:43 -05:00
ltqin
26767954fd Add client API/examples for 3xGemm+Bias+Add+Permute{0, 2, 3, 1} (#550)
* add example

* fix example

* add instance for gemm permute

* add to client example

* change configs

* change instance file name

* format

* change client example file name and remove example

[ROCm/composable_kernel commit: 55236709e2]
2023-01-18 10:52:52 -06:00
Po Yen Chen
e418b29268 Introduce ck::accumulate_n() (#439)
We can use this template to eliminate duplicated iterator computing
logics. By providing return type to ck::accumulate_n(), we can avoid
type conversion operations.

[ROCm/composable_kernel commit: 730204eed0]
2022-11-14 19:53:39 -06:00
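
The ck::accumulate_n() commit above (#439) describes a helper that folds a fixed number of elements and lets the caller choose the return type. The sketch below shows one way such a helper might look. It is not taken from the repository: the `sketch` namespace, the exact template signature, and the `element_count` helper are assumptions made for illustration, and the real ck::accumulate_n() may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <iterator>
#include <numeric>
#include <vector>

namespace sketch {

// Hypothetical stand-in for ck::accumulate_n(); the real signature may differ.
// Folds the first `count` elements starting at `first` into `init` using `op`.
// The caller names the result type T, so no cast is needed on the result.
template <typename T, typename ForwardIterator, typename Size, typename BinaryOperation>
T accumulate_n(ForwardIterator first, Size count, T init, BinaryOperation op)
{
    // std::next computes the end iterator once, replacing hand-written
    // `begin + count` arithmetic at every call site.
    return std::accumulate(first, std::next(first, count), init, op);
}

} // namespace sketch

// Hypothetical helper: product of the leading `rank` dimension lengths,
// accumulated directly as std::size_t even though the lengths are ints.
inline std::size_t element_count(const std::vector<int>& lengths, std::size_t rank)
{
    assert(rank <= lengths.size());
    return sketch::accumulate_n<std::size_t>(
        lengths.begin(), rank, std::size_t{1}, std::multiplies<>{});
}
```

With lengths {2, 3, 4}, calling `element_count(lengths, 2)` multiplies only the first two entries. Passing the result type as the first template argument is what lets the accumulation run in `std::size_t` without a separate conversion step.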
Chao Liu
7a98e9fa34 N-D Tensor Contraction example, instance, and client example (#270)
* adding contraction

* add contraction example

* update example

* update example

* format

* update readme

* clean header

* clean header

* contraction with multiple D

* rename

* fix naming issue; add instances for contraction+bilinear

* change assumed virtual layout of contraction; add client example

* update example

* update

* contraction+scale

* use type_convert

* rename

[ROCm/composable_kernel commit: 4fe9c393b8]
2022-07-07 14:31:11 -05:00