Aviral Goel
91ffc9dd1e
chore(copyright): update copyright header for example directory (#3273)
...
* chore(copyright): update copyright header for codegen directory
* chore(copyright): update copyright header for example directory
[ROCm/composable_kernel commit: d85f065b15]
2025-11-24 18:02:41 -08:00
Michał Kulikowski
ac4ecdacc5
[CK][Examples] Extending support for rdna3/4 in following examples: (#2884)
...
* [CK][Examples] Extending support for rdna3/4 in following examples:
-example_gemm_xdl_splitk_reduce_multi_d_fp16
-example_gemm_xdl_splitk_reduce_multi_d_bf16
-example_gemm_xdl_splitk_reduce_bf16A_i8B
-example_gemm_xdl_splitk_reduce_bfp16
-example_splitk_gemm_bias_e_permute_xdl_fp32
-example_gemm_add_multiply_xdl_fp16
-example_complex_contraction_bilinear_xdl_fp32
-example_grouped_gemm_lower_triangle_scale_softmax_gemm_permute_xdl_fp16
-example_batched_gemm_bias_e_permute_xdl_fp16
-example_gemm_xdl_fp16
-example_gemm_xdl_fp16_av2
-example_gemm_xdl_wavelet_fp16
-example_gemm_add_add_fastgelu_xdl_bf16
-example_gemm_add_add_fastgelu_xdl_fp16
-example_gemm_add_add_fastgelu_xdl_fp32
-example_grouped_gemm_xdl_fp32
-example_grouped_gemm_xdl_fp16
-example_grouped_gemm_xdl_bf16
-example_cgemm_xdl_bf16
-example_cgemm_xdl_fp16
Signed-off-by: Michal Kulikowski <Michal.Kulikowski@amd.com>
---------
Signed-off-by: Michal Kulikowski <Michal.Kulikowski@amd.com>
[ROCm/composable_kernel commit: 2b684f0a7d]
2025-09-29 09:05:04 -07:00
Illia Silin
b57fbee2f1
update copyright headers (#726)
...
[ROCm/composable_kernel commit: b94fd0b227]
2023-05-31 18:46:57 -05:00
Adam Osewski
c747be612f
Refactor device op implementations into impl subdirectory. (#420)
...
* Move kernel implementation files under impl directory.
* Update examples paths.
* Update device kernel impl include paths.
* Update tensor operation instances include paths.
* Update profiler and tests include paths.
* Clang-format
* Update include paths for batched gemm reduce
* Refactor UnitTest ConvNDBwdWeight.
* Refactor fwd and bwd data convND UT.
* Fix used test macro.
* Fix include path.
* Fix include paths.
* Fix include paths in profiler and tests.
* Fix include paths.
Co-authored-by: Adam Osewski <aosewski@amd.com>
[ROCm/composable_kernel commit: 3048028897]
2022-10-13 09:05:08 -05:00
Adam Osewski
6a02665d94
GEMM batched/splitK/cgemm/grouped int4 examples (#383)
...
* Grouped GEMM int4.
* Formatting + fix K dimension for int8.
* Batched Gemm int4 example.
* CGEMM int4 example.
* Include inc files in clang-format.
* SplitK int4 example
* Refactoring of performance measurement.
* Fix #ifdef statements.
Co-authored-by: Adam Osewski <aosewski@amd.com>
[ROCm/composable_kernel commit: 3ab20fd753]
2022-08-25 17:19:15 -05:00
Adam Osewski
4fb078cc12
CGEMM examples bf16, fp32, int8 (#332)
...
* Add int8 specialization for elementwise Add and Subtract.
* CGEMM examples bf16, fp32, int8
* Add convert reference output to CDataType.
* Skip BF16 data type during testing.
* Lower K value to get rid of accumulation error.
* Fix merge artifact.
* Fix changed function name: GetElementSpaceSize()
* Fix merge artifact.
Co-authored-by: Adam Osewski <aosewski@amd.com>
[ROCm/composable_kernel commit: fb0dc35861]
2022-08-02 14:52:27 -05:00