Commit Graph

17 Commits

Author SHA1 Message Date
Aleksander Dudek
93c115574f CK Tile Batched Gemm 2024-10-29 20:54:29 +00:00
Aleksander Dudek
3c171550f6 Batched gemm - messy validation check 2024-10-29 15:58:02 +00:00
Aleksander Dudek
71eea17c5f Batched gemm - counting strides 2024-10-29 09:47:50 +00:00
Aleksander Dudek
6cb12c614c Add batched gemm kernel - initial stride parameters 2024-10-22 11:22:22 +00:00
Adam Osewski
4085e3d074 Merge branch 'develop' into aosewski/ck_tile_universal_gemm_p1 2024-10-10 13:33:09 +00:00
Thomas Ning
6f27bc9872 Ck tile gemm cshuffle & CK Tile GEMM restructure (#1535)
* Make the cshuffle compilable

* modify the reference on gpu and cpu. Correct access of cshuffle

* fix the cpu reference code

* Complete the in tile shuffle logic

* restructure the kernel template input

* change the naming pattern of ck_tile gemm pipeline

* Re-format files using remod.py

* Solve the fmha conflict with gemm

* Comment Addressed from Carlus

---------

Co-authored-by: Po Yen, Chen <PoYen.Chen@amd.com>
2024-10-10 18:02:22 +08:00
Adam Osewski
60bdc10c04 Formatting. 2024-10-10 09:49:04 +00:00
Adam Osewski
41fc6a2433 Few small changes & formatting. 2024-10-09 10:37:01 +00:00
Adam Osewski
4cf45f1b9c Add comment to load_tile_raw and change variable naming style. 2024-10-07 10:25:33 +00:00
Adam Osewski
4f18c2ded3 Get hot loop and TailNum information before launching kernel. 2024-10-01 14:40:03 +00:00
Adam Osewski
8bd49370f1 Refactoring & Move Layout info to pipeline problem. 2024-10-01 14:38:19 +00:00
Adam Osewski
d3689b0686 Do not use ck_tile:: within ck_tile namespace. 2024-09-30 13:43:46 +00:00
Adam Osewski
0884043b5e Memory bound gemm pipeline. 2024-09-25 13:33:42 +00:00
Thomas Ning
694c300145 Ck tile gemm padding dim (#1516)
* Support the N dimension padding

* Finished the padding feature for different dimension of K
2024-09-18 11:32:29 -07:00
Thomas Ning
844f5a1712 Ck tile GPU verification sample develop & Add the CK TILE GEMM to the CI/CD test (#1505)
* Finished the feature of gpu verification

* Add the ck_tile_gemm test in the CI CD

* add the include of tensor_layout in reference_gemm

* Comment Addressed

* split ck_tile fmha and gemm tests into separate stages

* restructure the reference gemm

* restructure a new reference_gemm api that could read the device mem

---------

Co-authored-by: carlushuang <carlus.huang@amd.com>
Co-authored-by: illsilin <Illia.Silin@amd.com>
2024-09-14 21:08:40 +08:00
Dan Yao
d09572e8c2 [CK_TILE] FA bwd repair (#1502)
* fix fa bwd

* revert kernelBlockSize in gemm_kernel.hpp
2024-09-10 10:45:32 -07:00
Thomas Ning
caacd38830 Ck tile gemm example (#1488)
* Checkpoint: Finished with the tile example & kernel verification, working on the different matrix layout

* Finished the Matrix Layout feature set up. Note: Need to modify the inner block to solve the shuffle problem in the future.

* Fix: Clang Format, API fixed from fmha

* fix with better naming convention

* revert back the pipeline code of fmha

* Fixed: Addressed the comments and merge the GEMM shape of GEMM Operator and FMHA Operator to one.

* clang format with the reference_gemm file

* convert the clang format with the remod.py

* Changed the format and variable name of the kernel gemm_shape and partitioner

---------

Co-authored-by: thomasning <thomasning@banff-cyxtera-s70-4.ctr.dcgpu>
2024-09-07 16:23:32 +08:00