Commit Graph

12 Commits

Author SHA1 Message Date
Chao Liu
4a27f120ea Absolute include path (#281)
* ad gelu and fast_gelu

* added GeLU and fast GeLU

* clean up

* add gemm+fastgelu example

* add gemm+gelu instances

* update profiler

* clean up

* clean up

* adding gemm+bias+activation

* clean

* adding bias

* clean

* adding gemm multiple d

* debugging

* add gemm bias add fastgelu

* rename, clean

* refactoring; add readme

* refactor

* refactor

* refactor

* refactor

* refactor

* refactor

* fix

* fix

* update example

* update example

* rename

* update example

* add ckProfiler

* clean

* clean

* clean

* clean

* add client app example

* update readme

* delete obselete files

* remove old client app

* delete old file

* cleaning

* clean

* remove half

* fix header path

* fix header path

* fix header path

* fix header path

* fix header path

* fix header path for all examples

* fix header path

* fix header path

* fix header path

* fix header path

* fix header path

* fix header path

* fix header path

* fix header path

* fix header path

* revert client app example

* clean build

* fix build

* temporary disable client test on Jenkins

* clean

* clean

* clean

[ROCm/composable_kernel commit: d1db6a0c3e]
2022-06-24 20:51:04 -05:00
ltqin
0f407c565e Add FP64 XDL GEMM built-in function (#199)
* add intrin_mfma_f64_16x16x4f64

* add example

* gemm reference add double data type

* chang init data

* fix M N PerXdlops

* fix ifdef

* add comparsion config

* add conv fwd example

* format log out

* change rc matrix egister layout

* reorganize example

* reorganize example 2

* format,because merge develop

* fix call impl adding acc data type

* lost ;

* add compiler warning

* change example tunning parameters

* add test for fp64

* add instance

* add test/gemm/gemm_fp64.cpp

* fix get name issue

* remove some tunning parameter

* fix conflict

* format

* use integer value for GEMM test

* add acc data type

* remove typeid because fp16

* fix streamconfig etc bug from merging develop

* format

* remove test_gemm_xdl_fp64

* add AccDataType

* AccDataType problem

Co-authored-by: qinletao <letaoqin@amd.com>
Co-authored-by: Chao Liu <chao.liu2@amd.com>

[ROCm/composable_kernel commit: 3e6c2610ae]
2022-05-26 14:48:57 -05:00
Jianfeng Yan
81f473ff41 Navi21 gemm (#197)
* start adding navi21 GEMM

* navi_gemm_km_kn_mn_fp32 compiles and passes one test.

* rename variables and functions in gridwise_gemm_dlops_v1r3

* add other 3 layouts; format instance

* adding more tuning parameters

add tuning parameters for other 3 layouts

* add gemm_dlops_f16

* tmp

* add dependence of DeviceGemm::IsSupportedArg() on arch

* minor changes

* minor changes

* minor changes

* minor changes

* minor changes

* minor changes

* minor changes

* push gemm_dlops into profiler

* minor changes

* if using xdl or dlops is moved into profiler_gemm_impl

* minor changes

* minor changes

* remove is_xdl from profile_gemm_impl

* make IsSupportedArg dependent on arch for other device_gemm

* minor changes

* minor changes

* fix a bug in f_generate_tensor_value

* add 64x64x64 for gemm_dlops_int8

* add 64x64x64 for gemm_dlops_int8

* comment out 3 layouts in gemm_dlops_int8; add 32x32x32 for gemm_dlops_int8; init A values to 1

* fix

* start fixing tuning parameters

* monir

* minor changes

* minor changes

* minor changes

* fixing

* adding example

* adding example

* adding example

* add gemm fp32 example

* clean up

* use 128x128x16 as MNK tile in navi21 gemm example

* bug fix

* fix test

* use new block c tile

* clean

* fix build

Co-authored-by: Chao Liu <chao.liu2@amd.com>
Co-authored-by: shaojiewang <wsjmessi@163.com>

[ROCm/composable_kernel commit: 40b59a63cc]
2022-05-24 12:19:27 -05:00
JD
69d5f78b16 Add host API (#220)
* Add host API

* manually rebase on develop

* clean

* manually rebase on develop

* exclude tests from all target

* address review comments

* update client app name

* fix missing lib name

* clang-format update

* refactor

* refactor

* refactor

* refactor

* refactor

* fix test issue

* refactor

* refactor

* refactor

* upate cmake and readme

Co-authored-by: Chao Liu <chao.liu2@amd.com>

[ROCm/composable_kernel commit: cec69bc3bc]
2022-05-12 09:21:01 -05:00
Chao Liu
960b7b7c17 Code refactor (#175)
* format

* improving pipeline

* fix typo

* format

* adding thread group

* adding thread group

* adding thread group

* adding gemm pipeline

* tweak

* refactor

* refactor

* add missing type convert

* refactor

* refactor

* refactor

* clean

* fix build

* refactor

* format

* clean up

* use remove_cvref_t

* clean

* clean up

* clean up

* clean up

[ROCm/composable_kernel commit: ec7c2e912e]
2022-05-09 14:57:59 -05:00
Anthony Chang
52677b8896 profiler: fix fp32 c-shuffle gemm tuning parameter (#194)
[ROCm/composable_kernel commit: 7c0b149811]
2022-04-22 15:48:51 -05:00
Chao Liu
5a09f1e6e8 fix build (#171)
[ROCm/composable_kernel commit: 646878162b]
2022-03-31 20:30:20 -05:00
Anthony Chang
525db6a8da Tune & add conflict-free LDS gemm kernels (#159)
* retune & add conflict-free bf16/fp16 c-shuffle gemm instances

amend wrong K1 value in some fp16/bf16 kernel instances

* make gemm cshuffle's timing behavior consistent with all other functions

* clang-format

* retune & add conflict-free fp32 c-shuffle gemm instances

* retune & add conflict-free int8 c-shuffle gemm instances

* update the underlying gridwise gemm of all c-shuffle gemm kernels

* typo

[ROCm/composable_kernel commit: 7db48f9008]
2022-03-31 12:58:41 -05:00
Chao Liu
68b05b66f5 Compile for gfx908 and gfx90a (#130)
* adding compilation for multiple targets

* fix build

* clean

* update Jekinsfile

* update readme

* update Jenkins

* use ck::half_t instead of ushort for bf16

* rename enum classes

* clean

* rename

* clean

[ROCm/composable_kernel commit: cd167e492a]
2022-03-31 12:33:34 -05:00
Chao Liu
40b3b3c47e clean (#143)
[ROCm/composable_kernel commit: 2206136628]
2022-03-22 21:55:03 -05:00
rocking5566
4a7d8df85a Gemm_c_shuffle (4 layouts) X (fp32 bf16 int8) (#131)
* [What] Separate fixpoint gemm from gemm example
[Why] let example of gemm_int8 be pure gemm.
[What]
1. Add gemm_requant_relu_requant,
2. Let CDataType be int32 in pure gemm, because no one use int8 CDataType. It is also part of gemm_requant_relu_requant

* Fix path

* Revise cmakelist due to merge develop

* Add gemm fp16 test

* Extract PrepareGemmTensor

* Extract TestGemm

* Add test for different layout

* Add 4 layouts of shuffle version of fp32

* Add 4 layouts of shuffle version of int8

* Add 4 layouts of shuffle version of bf16

* replace all DeviceGemmPtr_ with DeviceGemmNoOpPtr to fit naming convension

* Add test for non-shuffle verstion of gemm

* Fix typo

* Print kernel information

* Add rest of the fp32 kernel to the test

* 1. Add rest of the fp16 device iop.
2. Mark the invalid device operation

Co-authored-by: rocking <chunylai@amd.com>

[ROCm/composable_kernel commit: 485ea46a40]
2022-03-21 15:59:51 -05:00
Chao Liu
6203866064 Reorganize files, Part 1 (#119)
* delete obselete files

* move files

* build

* update cmake

* update cmake

* fix build

* reorg examples

* update cmake for example and test

[ROCm/composable_kernel commit: 5d37d7bff4]
2022-03-08 21:46:36 -06:00