Mirror of https://github.com/ROCm/composable_kernel.git (synced 2026-05-04 13:41:24 +00:00)
Cleanups (#2631)
* Remove some duplicate code in fmha_fwd_appendkv_kernel.hpp
* Simplify two templated operator calls by having the templated types deduced automatically
* Simplify two GemmPipeline calls
* Fix GemmPipelineAgBgCrCompV4::GetName
* Refactor use of ArgParser in CK tile GEMM examples
* Update args in README.md to match the implementation in create_args
* Remove some unnecessary include statements
* Rename two variables
* Factor out common code
* Factor out do_verify
* Add and use type aliases for memory operation integral constants
* In gemm_basic.cpp, use kPadM, kPadN, kPadK, and kBlockPerCu from GemmConfig

---------

Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>
@@ -18,7 +18,6 @@ This will result in an executable `build/bin/tile_example_gemm_basic` & `build/b
## example
```
args:
-b batch size (default:1)
-m m dimension (default:1024)
-n n dimension (default:2048)
-k k dimension (default:64)
@@ -29,9 +28,11 @@ args:
-stride_b Tensor B stride (default:0)
-stride_c Tensor C stride (default:0)
-v 0. No validation, 1. Validation on CPU, 2. Validation on GPU (default:2)
-e Absolute error tolerance (default:1e-5)
-prec data type. fp16/bf16/fp8/bf8/int8 (default:fp16)
-warmup number of iterations before benchmark the kernel (default:10)
-repeat number of iterations to benchmark the kernel (default:100)
-timer gpu:gpu timer, cpu:cpu timer (default:gpu)
-split_k splitK value (default:1)
-init 0:random, 1:linear, 2:constant (default:1)
-persistent 0:non-persistent, 1:persistent (default:0)
```
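The `-v` and `-e` options above describe validating the GEMM result against a reference within an absolute error tolerance. The following is a minimal sketch of that idea, not the example's actual `do_verify` code: `reference_gemm` and `all_close_abs` are hypothetical helpers written for illustration, the matrices are assumed to be row-major, and the real example may apply additional or different checks.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference (not part of composable_kernel):
// C = A * B for row-major A (m x k) and row-major B (k x n).
std::vector<float> reference_gemm(const std::vector<float>& a,
                                  const std::vector<float>& b,
                                  std::size_t m,
                                  std::size_t n,
                                  std::size_t k)
{
    assert(a.size() == m * k && b.size() == k * n);
    std::vector<float> c(m * n, 0.0f);
    for(std::size_t i = 0; i < m; ++i)
        for(std::size_t j = 0; j < n; ++j)
            for(std::size_t p = 0; p < k; ++p)
                c[i * n + j] += a[i * k + p] * b[p * n + j];
    return c;
}

// Hypothetical check (not part of composable_kernel): true when every element
// of `result` is within the absolute tolerance `atol` of `reference`.
bool all_close_abs(const std::vector<float>& result,
                   const std::vector<float>& reference,
                   float atol = 1e-5f)
{
    assert(result.size() == reference.size());
    for(std::size_t i = 0; i < result.size(); ++i)
    {
        // Written as !(x <= atol) so that a NaN in either input fails the check.
        if(!(std::fabs(result[i] - reference[i]) <= atol))
            return false;
    }
    return true;
}
```

In this sketch the default `atol` mirrors the documented `-e` default of 1e-5; a caller would compute the reference once and compare the kernel output against it.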