4 Commits

Author SHA1 Message Date
Haocong WANG
a0e0f3cdcc [GEMM] F8 GEMM, performance optimized. (#1384)
* add ab_scale init support

* enabled interwave

* add scale type; update isSupport

* adjust example

* clean

* enable f8 pure gemm rcr ckprofiler

* Add gemm_multiply_multiply instances

* clang format

* Optimize for ScaleBlockMNK=128

* enable abscale f8 gemm ck profiler

* Add pure f8 gemm test suite

* Reverting to the state of project at f60fd77

* update copyright

* clang format

* update copyright

---------

Co-authored-by: root <jizhan@amd.com>

[ROCm/composable_kernel commit: 8c90f25be3]
2024-07-19 22:06:52 +08:00
zjing14
b35d3ce2b5 add gemm_bias_add example (#1361)
* add gemm_bias_add example

* changed strideD

* clang-format

---------

Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>

[ROCm/composable_kernel commit: 13c1e64daa]
2024-07-11 18:08:07 -07:00
zjing14
9227a76f8e Post-merge fix of PR 1300 (#1313)
* add f8 gemm with multiD for both row/col wise

* change compute_type to fp8

* changed tuning parameters in the example

* add rcr example

* post-merge fix

* fix

* reduce init range

[ROCm/composable_kernel commit: 6fb1f4e03f]
2024-05-31 22:46:41 -07:00
zjing14
96356d2daf add f8 gemm multiD with both row/col wise scale (#1300)
* add f8 gemm with multiD for both row/col wise

* change compute_type to fp8

* changed tuning parameters in the example

* add rcr example

[ROCm/composable_kernel commit: 80db62f08d]
2024-05-28 12:04:22 -05:00