composable_kernel/example/27_layernorm
Illia Silin f46a6ffad8 Fix the fp8 gemm for large tensors on MI300. (#1011)
* Fix the fp8 conversion

* Try clipping value before conversion

* Fix return

* Simplify with a const

* reduce the gemm input tensor values to reduce round-off error

* replace if-else with lambda

* fix syntax

---------

Co-authored-by: Rostyslav Geyyer <rosty.geyyer@amd.com>
2023-10-27 21:10:47 -07:00
..
2023-05-31 18:46:57 -05:00