Commit Graph

4 Commits

Author SHA1 Message Date
Aviral Goel
d85f065b15 chore(copyright): update copyright header for example directory (#3273)
* chore(copyright): update copyright header for codegen directory

* chore(copyright): update copyright header for example directory
2025-11-24 18:02:41 -08:00
Michał Kulikowski
2b684f0a7d [CK][Examples] Extending support for rdna3/4 in following examples: (#2884)
* [CK][Examples] Extending support for rdna3/4 in following examples:
-example_gemm_xdl_splitk_reduce_multi_d_fp16
-example_gemm_xdl_splitk_reduce_multi_d_bf16
-example_gemm_xdl_splitk_reduce_bf16A_i8B
-example_gemm_xdl_splitk_reduce_bfp16
-example_splitk_gemm_bias_e_permute_xdl_fp32
-example_gemm_add_multiply_xdl_fp16
-example_complex_contraction_bilinear_xdl_fp32
-example_grouped_gemm_lower_triangle_scale_softmax_gemm_permute_xdl_fp16
-example_batched_gemm_bias_e_permute_xdl_fp16
-example_gemm_xdl_fp16
-example_gemm_xdl_fp16_av2
-example_gemm_xdl_wavelet_fp16
-example_gemm_add_add_fastgelu_xdl_bf16
-example_gemm_add_add_fastgelu_xdl_fp16
-example_gemm_add_add_fastgelu_xdl_fp32
-example_grouped_gemm_xdl_fp32
-example_grouped_gemm_xdl_fp16
-example_grouped_gemm_xdl_bf16
-example_cgemm_xdl_bf16
-example_cgemm_xdl_fp16

Signed-off-by: Michal Kulikowski <Michal.Kulikowski@amd.com>

* [CK][Examples] Extending support for rdna3/4 in following examples:
-example_gemm_xdl_splitk_reduce_multi_d_fp16
-example_gemm_xdl_splitk_reduce_multi_d_bf16
-example_gemm_xdl_splitk_reduce_bf16A_i8B
-example_gemm_xdl_splitk_reduce_bfp16
-example_splitk_gemm_bias_e_permute_xdl_fp32
-example_gemm_add_multiply_xdl_fp16
-example_complex_contraction_bilinear_xdl_fp32
-example_grouped_gemm_lower_triangle_scale_softmax_gemm_permute_xdl_fp16
-example_batched_gemm_bias_e_permute_xdl_fp16
-example_gemm_xdl_fp16
-example_gemm_xdl_fp16_av2
-example_gemm_xdl_wavelet_fp16
-example_gemm_add_add_fastgelu_xdl_bf16
-example_gemm_add_add_fastgelu_xdl_fp16
-example_gemm_add_add_fastgelu_xdl_fp32
-example_grouped_gemm_xdl_fp32
-example_grouped_gemm_xdl_fp16
-example_grouped_gemm_xdl_bf16
-example_cgemm_xdl_bf16
-example_cgemm_xdl_fp16

Signed-off-by: Michal Kulikowski <Michal.Kulikowski@amd.com>

---------

Signed-off-by: Michal Kulikowski <Michal.Kulikowski@amd.com>
2025-09-29 09:05:04 -07:00
Rostyslav Geyyer
aa932445ea Add a gpu gemm reference kernel (#1528)
* Add a gpu gemm reference kernel

* Switch to gpu reference in gemm examples

* Remove redundant arguments

* Update all related examples

* Update more examples

* Try less threads per block

* Try even less threads per block

* Add support for all matrix layouts

* Increase block size

* Clean up

* Remove hardcoded strides

* Clean up

* Try a column-major case

* Revert back to row-major

* Run both CPU and GPU veriffication

---------

Co-authored-by: Po Yen Chen <PoYen.Chen@amd.com>
2024-10-08 11:05:28 -05:00
Haocong WANG
bb63b9732c [GEMM] Optimization for MI200/300. (#1135)
* Optimize GEMM on MI200/300:
1. Add new blockwise gemm pipeline
2. Add irregular splitk intances

* clang format + typo fix

* Fix a bug
2024-01-19 07:02:22 -06:00