mirror of
https://github.com/ROCm/composable_kernel.git
synced 2026-05-14 10:09:41 +00:00
[ROCm/composable_kernel commit: 4d2f8c111e]
ck_tile/core
ck_tile/core contains all the basic functions and structures needed to create a GPU kernel using ck_tile. Users should include only the single header ck_tile/core.hpp to get all of this functionality. Everything lives under the ck_tile namespace. The coding style in this folder should be similar to std: snake_case for structures and functions, CamelCase for template types, and so on.
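The naming convention described above can be sketched in plain C++ as follows. This is only an illustration of the style: the namespace style_demo and the names static_buffer and reduce_sum are hypothetical and are not taken from ck_tile.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical namespace for illustration only; NOT part of ck_tile.
namespace style_demo {

// Structure name in snake_case; template parameters in CamelCase.
template <typename DataType, std::size_t NumElements>
struct static_buffer
{
    DataType data[NumElements];

    // Member functions in snake_case, as in the standard library.
    constexpr std::size_t size() const { return NumElements; }

    const DataType& at(std::size_t i) const
    {
        assert(i < NumElements); // bounds check in debug builds
        return data[i];
    }
};

// Free function in snake_case; template types in CamelCase.
template <typename DataType, std::size_t NumElements>
DataType reduce_sum(const static_buffer<DataType, NumElements>& buf)
{
    DataType acc = DataType{0};
    for(std::size_t i = 0; i < NumElements; ++i)
        acc += buf.at(i);
    return acc;
}

} // namespace style_demo
```

The point is only the split between snake_case for anything a user calls or instantiates and CamelCase for template parameters; the real ck_tile types may differ in every other respect.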
algorithm/
coordinate transforms and other reusable algorithms
arch/
basic device building blocks such as mma, buffer addressing, etc.
container/
basic container data structures: array, sequence, tuple, ...
numeric/
data types and data-type-related math
tensor/
tensor descriptors and tile-level APIs
utility/
other utility functions for both host and device
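
To make the terms "coordinate transform" and "tensor descriptor" from the list above more concrete, here is a minimal host-side sketch of the general idea: a descriptor stores lengths and strides, and a coordinate transform maps a multi-dimensional index to a linear memory offset. All names here (layout_demo, strided_descriptor, make_packed_descriptor) are hypothetical and written for this example; they are not the ck_tile API, which is considerably more general.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical illustration only; these names are NOT ck_tile APIs.
namespace layout_demo {

// A tiny "descriptor": lengths and strides of a NumDims-dimensional tensor.
template <std::size_t NumDims>
struct strided_descriptor
{
    std::array<std::size_t, NumDims> lengths;
    std::array<std::size_t, NumDims> strides;

    // Coordinate transform: multi-dimensional index -> linear memory offset.
    std::size_t calculate_offset(const std::array<std::size_t, NumDims>& idx) const
    {
        std::size_t offset = 0;
        for(std::size_t d = 0; d < NumDims; ++d)
        {
            assert(idx[d] < lengths[d]); // index must be in range
            offset += idx[d] * strides[d];
        }
        return offset;
    }
};

// Build a packed row-major descriptor (last dimension is contiguous).
template <std::size_t NumDims>
strided_descriptor<NumDims>
make_packed_descriptor(const std::array<std::size_t, NumDims>& lengths)
{
    strided_descriptor<NumDims> desc{};
    desc.lengths = lengths;

    std::size_t stride = 1;
    for(std::size_t d = NumDims; d-- > 0;)
    {
        desc.strides[d] = stride;
        stride *= lengths[d];
    }
    return desc;
}

} // namespace layout_demo
```

For a packed 4 x 8 tensor the strides come out as (8, 1), so the index (2, 3) maps to offset 2 * 8 + 3 = 19.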