mirror of
https://github.com/ROCm/composable_kernel.git
synced 2026-04-20 14:59:17 +00:00
universal streamk fp8 changes (#1665)
* universal streamk fp8 changes & ckprofiler instances
* revert strides to -1 and verification options
* fp8 exclusion on pre-gfx94 for universal_streamk
* PR review based revisions: permissions reverted, removed hip err checks

---------

Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>
committed by GitHub
parent fb1ccfa9df
commit d6d4c2788b
@@ -154,8 +154,7 @@ Additional cmake flags can be used to significantly speed-up the build:
 other platforms have faster instances, such as `xdl` or `wmma`, available.

 * `CK_USE_FP8_ON_UNSUPPORTED_ARCH` (default is OFF) must be set to ON in order to build instances,
-such as `gemm_universal` and `gemm_multiply_multiply` for fp8 data type for GPU targets which do not
-have native support for fp8 data type, such as gfx908 or gfx90a. These instances are useful on
+such as `gemm_universal`, `gemm_universal_streamk` and `gemm_multiply_multiply` for fp8 data type for GPU targets which do not have native support for fp8 data type, such as gfx908 or gfx90a. These instances are useful on
 architectures like the MI100/MI200 for the functional support only.

 ## Using sccache for building
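For illustration, the flag described in this hunk could be passed at configure time roughly as follows. This is a sketch, not the project's documented command: the `/opt/rocm` install prefix, the `hipcc` compiler path, and the choice of `gfx90a` as the only target are assumptions, and the command is assumed to run from an empty build directory inside a composable_kernel checkout.

```shell
# Assumed: ROCm is installed under /opt/rocm and the current directory
# is a build directory one level below the composable_kernel source root.
# CK_USE_FP8_ON_UNSUPPORTED_ARCH=ON enables the fp8 instances (for example
# gemm_universal_streamk) on a target without native fp8 support.
cmake \
  -D CMAKE_PREFIX_PATH=/opt/rocm \
  -D CMAKE_CXX_COMPILER=/opt/rocm/bin/hipcc \
  -D CMAKE_BUILD_TYPE=Release \
  -D GPU_TARGETS="gfx90a" \
  -D CK_USE_FP8_ON_UNSUPPORTED_ARCH=ON \
  ..
```

With the flag left at its default (OFF), the fp8 instances are not built for gfx908 or gfx90a. As the README text notes, on those targets they provide functional support only.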