Files
composable_kernel/include/ck/utility
Haocong WANG 3049b5467c [GEMM] gemm_universal related optimization (#1453)
* replace buffer_atomic with global_atomic

* fixed global_atomic_add

* added bf16 atomic_add

* format

* clang-format-12

* clean

* clean

* add guards

* Update gtest.cmake

* enabled splitk_gemm_multi_d

* format

* add ckProfiler

* format

* fixed naming

* format

* clean

* clean

* add guards

* fix clang format

* format

* add kbatch printout

* clean

* Add rocm6.2 related gemm optimization

* Limit bf16 atomic usage

* remove redundant RCR gemm_universal instance

* Add RRR fp8 gemm universal instance

* Bug fix

* Add GPU_TARGET guard to FP8/BF8 target

* bug fix

* update cmake

* remove all fp8/bf8 example if arch not support

* Enable fp8 RRR support in ckProfiler

* limit greedy-reverse flag to gemm_universal in ckProfiler

---------

Co-authored-by: Jing Zhang <jizhan@fb.com>
Co-authored-by: Jing Zhang <jizhan@meta.com>
Co-authored-by: zjing14 <zhangjing14@gmail.com>
Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>
Co-authored-by: illsilin <Illia.Silin@amd.com>
2024-08-14 10:42:30 +08:00
..
2024-03-08 17:11:51 -08:00
2024-05-10 09:41:39 -07:00
2024-06-25 16:37:35 -05:00
2023-05-31 18:46:57 -05:00
2023-05-31 18:46:57 -05:00
2023-05-31 18:46:57 -05:00
2023-05-31 18:46:57 -05:00
2023-05-31 18:46:57 -05:00
2023-05-31 18:46:57 -05:00
2023-09-26 21:16:23 -05:00