Files
composable_kernel/example/04_gemm_add_add_fastgelu
Michal Kulikowski 573df0d546 [CK][Examples] Extending support for rdna3/4 part 3:
-example_gemm_xdl_int8
-example_gemm_xdl_fp8
-example_gemm_xdl_fp8_bf8
-example_gemm_xdl_fp16_fp8
-example_gemm_add_add_fastgelu_xdl_int8
-example_grouped_gemm_xdl_int8
-example_grouped_conv_bwd_weight_xdl_bf16
-example_cgemm_xdl_fp32
-example_cgemm_xdl_int8

fixing cmdlines for:
-example_22_cgemm
-example_24_batched_gemm
-example_batched_gemm_xdl_fp16int4_b_scale_v3

Signed-off-by: Michal Kulikowski <Michal.Kulikowski@amd.com>


[ROCm/composable_kernel commit: 2444c44895]
2025-10-08 18:14:38 +02:00
..
2023-05-31 18:46:57 -05:00
2024-05-10 09:41:39 -07:00

Instructions for example_gemm_add_add_fastgelu_xdl_fp16

Run example_gemm_add_add_fastgelu_xdl_fp16

#arg1: verification (0=no, 1=yes)
#arg2: initialization (0=no init, 1=integer value, 2=decimal value)
#arg3: time kernel (0=no, 1=yes)
#arg4 to 11: M (256x), N(128x), K(32x), StrideA, StrideB, StrideD0, StrideD1, StrideE"
./bin/example_gemm_add_add_fastgelu_xdl_fp16 1 1 1