Commit Graph

442 Commits

Author SHA1 Message Date
Jun Liu
3fc733cd35 [Composable Kernel] update develop branch code to ck_upstream
Merge pull request #1236 from ROCmSoftwarePlatform/develop

[ROCm/composable_kernel commit: 5890e30076]
2021-10-25 19:49:17 -07:00
Qianfeng
f3d8052ad2 [Enhancements] Several bugfixes and refactoring of dynamic generic reduction (#1156)
* Squashed 'src/composable_kernel/' content from commit a4b211238

git-subtree-dir: src/composable_kernel
git-subtree-split: a4b21123849265d90a6b8fa86905a9a8ab253787

* add solver ConvIgemmFwdV6r1DlopsNchwKcyxNkhw; rename static ck source files

* Squashed 'src/composable_kernel/' changes from a4b211238..5805b5dc4

5805b5dc4 Update develop (#5) (#6)
ede23b251 Merge pull request #4 from ROCmSoftwarePlatform/separate_online_compile
8b079b5c6 refactor
c3d788bfa refactor
fcf913481 rename

git-subtree-dir: src/composable_kernel
git-subtree-split: 5805b5dc442dd8d71295954c4a755a6ef30593bb

* fix

* refactor

* remove online compilation from CK

* refactor

* fix

* add ctest

* tidy

* add tidy

* tidy

* tidy

* tidy

* tidy

* tidy

* tidy

* tidy

* tidy

* tidy

* add c-style pointer cast

* vector/scalar pointer cast use c-style pointer cast instead of reinterpret_cast

* fix clang warning suppression

* tidy

* suppress cppcheck

* fix enum issue

* revert changes to hip build

* fix kernel filename

* update CK build script

* rename

* rename

* make inner product compatible on gfx900

* Update src/include/miopen/solver/ck_utility_common.hpp

Co-authored-by: JD <Jehandad.Khan@amd.com>

* compiler parameter use stream

* use int instead of index_t in kernel wrapper

* DynamicBuffer, StaticBuffer, amd_buffer_load support customized value for invalid element

* refactor

* refactor

* change cmakelist

* change ck common utility

* fix

* Squashed 'src/composable_kernel/' changes from 5805b5dc4..dd3d4444e

dd3d4444e Merge pull request #16 from ROCmSoftwarePlatform/develop
cb6b2dc63 Merge pull request #14 from ROCmSoftwarePlatform/miopen_downstream_init_integration
d9b2fcab4 Merge pull request #8 from ROCmSoftwarePlatform/miopen_downstream_init_integration
57b74196a refactor
431c47bea refactor
9a0d05870 DynamicBuffer, StaticBuffer, amd_buffer_load support customized value for invalid element
bc4146402 use int instead of index_t in kernel wrapper
87a2fc094 compiler parameter use stream
24743c85e make inner product compatible on gfx900
7ad33d8e1 rename
5a3bace8d rename
12405c12a update CK build script
3c2effd43 fix kernel filename
12ff8d1ca fix enum issue
f0f97fd79 tidy
26f311aa9 fix clang warning suppression
c4f47ed09 vector/scalar pointer cast use c-style pointer cast instead of reinterpret_cast
35fd7bf79 add c-style pointer cast
9c31642f0 tidy
1a2efac60 tidy
ddd3b4e94 tidy
7daa0cfbf tidy
cab6e58d3 tidy
d9a8aebd8 tidy
533e356ce tidy
42639836b tidy
efe2836a2 add tidy
d53d7c666 fix
cf4ea1145 remove online compilation from CK
e63b17bdf refactor
5a2e56f78 Merge commit '437cc595c6e206dfebb118985b5171bbc1e29eab' into composable_kernel_init_integration_v3
702078bfd Merge pull request #7 from ROCmSoftwarePlatform/master
9ce85357a Update develop (#5)
10a172710 add solver ConvIgemmFwdV6r1DlopsNchwKcyxNkhw; rename static ck source files

git-subtree-dir: src/composable_kernel
git-subtree-split: dd3d4444e9b9ed07a54f82d91d969770aa8d5074

* Tiny fix in using data type template parameters in blockwise and direct_threadwise kernel

* Fix with regard to implementing GetZeroVal() in both kernel and host

* Avoid converting to compType from dstDataType before writing the output value

* Add half_t support to NumericLimits and make constexpr GetZeroVal() of binary operator

* Add CONSTANT decorator for descriptor read buffer

* Use get_thread_local_1d_id() for thread local Id

* Rename GetZeroVal() to GetReductionZeroVal() in the kernels

* Remove constexpr from initialized zeroVal and tiny fix in reduction_operator.hpp

* Occasional tiny simplification and update in the kernel files

* Update in src/reducetensor.cpp for consistent IDs passing to the kernel

* Update to re-order tensor dimensions on the host, split second_call kernel wrapper files and simplify reduce_all kernel wrappers

* Update to remove OpenCL tidy checking failures

* Small updates in src/reducetensor.cpp

* Update for better readability

* Remove unused codes and not-needed template parameters in the kernel wrappers

Co-authored-by: Chao Liu <chao.liu2@amd.com>
Co-authored-by: JD <Jehandad.Khan@amd.com>

[ROCm/composable_kernel commit: dfb80c4e39]
2021-09-29 08:12:11 -07:00
Jun Liu
bd1af9250d Merge pull request #1165 from ROCmSoftwarePlatform/develop
Merge develop into CK_upstream (Please don't squash when merging)

[ROCm/composable_kernel commit: 8557901d02]
2021-09-21 15:52:12 -07:00
Qianfeng
b315c39b11 [SWDEV-281541][MSRCHA-100] Implementation of Dynamic Generic Reduction (#1108)
* add solver ConvIgemmFwdV6r1DlopsNchwKcyxNkhw; rename static ck source files

* make inner product compatible on gfx900

* Update src/include/miopen/solver/ck_utility_common.hpp

* compiler parameter use stream

* use int instead of index_t in kernel wrapper

* DynamicBuffer, StaticBuffer, amd_buffer_load support customized value for invalid element

* Add dynamic generic reduction kernel layer (kernel wrappers, kernel implementations and utilities)

* Some updates to dynamic composable kernel facility for the need of dynamic generic reduction

* Update to generic reduction C++ host interface layer to support dynamic generic reduction

* Update to remove tidy complaints in host interface layer

* Change the unary operator form from void op(T &x) to T op(T x)

* Update to pass single workspace pointer for all kernels (fix for OpenCL backend)

* Use cppcheck-suppress to prevent some strange warnings

* Re-use operator [] and () for DynamicBuffer and update to depending codes

* Remove useless codes in first call threadwise/warpwise/blockwise kernel wrappers

* [performance] Remove un-needed local buffer initialization

Co-authored-by: Chao Liu <chao.liu2@amd.com>
Co-authored-by: JD <Jehandad.Khan@amd.com>

[ROCm/composable_kernel commit: 9e80cdceb7]
2021-08-26 18:04:55 -07:00
Chao Liu
ee428d2d6f Composable kernel init integration v3 (#1097)
* Squashed 'src/composable_kernel/' content from commit f6edda611

git-subtree-dir: src/composable_kernel
git-subtree-split: f6edda6119

* add solver ConvIgemmFwdV6r1DlopsNchwKcyxNkhw; rename static ck source files

* Squashed 'src/composable_kernel/' changes from f6edda611..5781adf5c

5781adf5c Update develop (#5) (#6)
97e6d514f Merge pull request #4 from ROCmSoftwarePlatform/separate_online_compile
7b1ec41e5 refactor
49c33aaea refactor
54b3e73d1 rename

git-subtree-dir: src/composable_kernel
git-subtree-split: 5781adf5cf

* fix

* refactor

* remove online compilation from CK

* refactor

* fix

* add ctest

* add c-style pointer cast

* vector/scalar pointer cast use c-style pointer cast instead of reinterpret_cast

* fix clang warning suppression

* tidy

* suppress cppcheck

* fix enum issue

* revert changes to hip build

* fix kernel filename

* update CK build script

* rename

* rename

* make inner product compatible on gfx900

* Update src/include/miopen/solver/ck_utility_common.hpp

Co-authored-by: JD <Jehandad.Khan@amd.com>

* compiler parameter use stream

* use int instead of index_t in kernel wrapper

* DynamicBuffer, StaticBuffer, amd_buffer_load support customized value for invalid element

* refactor

* refactor

* change cmakelist

* change ck common utility

* fix

Co-authored-by: JD <Jehandad.Khan@amd.com>

[ROCm/composable_kernel commit: 6fe3627a9e]
2021-08-19 10:55:03 -05:00
Chao Liu
dd3e30e224 Merge pull request #14 from ROCmSoftwarePlatform/miopen_downstream_init_integration
MIOpen Downstream: Initial integration 2nd PR

[ROCm/composable_kernel commit: b62bf8c3f8]
2021-08-16 16:39:40 -05:00
Chao Liu
a364868bcf Merge pull request #8 from ROCmSoftwarePlatform/miopen_downstream_init_integration
[ROCm/composable_kernel commit: ccc4a1d365]
2021-08-16 16:28:53 -05:00
Chao Liu
bf335705ef refactor
[ROCm/composable_kernel commit: 67ad47e7c1]
2021-08-16 21:01:33 +00:00
Chao Liu
04d90a65c5 refactor
[ROCm/composable_kernel commit: 16effa767c]
2021-08-16 20:36:47 +00:00
Chao Liu
a222af4530 DynamicBuffer, StaticBuffer, amd_buffer_load support customized value for invalid element
[ROCm/composable_kernel commit: a91b68dfcb]
2021-08-13 23:40:19 +00:00
Chao Liu
e145aa0e29 use int instead of index_t in kernel wrapper
[ROCm/composable_kernel commit: 2cbabbba54]
2021-08-13 20:55:39 +00:00
Chao Liu
d49b0e5239 compiler parameter use stream
[ROCm/composable_kernel commit: 0834bc7635]
2021-08-13 01:05:14 +00:00
Chao Liu
c5ea28dc43 make inner product compatible on gfx900
[ROCm/composable_kernel commit: f2ac7832c6]
2021-08-11 09:42:53 -05:00
Chao Liu
0bbcd05e4e rename
[ROCm/composable_kernel commit: 4e57b30a6a]
2021-08-11 00:08:42 +00:00
Chao Liu
c5a4edb9e8 rename
[ROCm/composable_kernel commit: c03045ce2d]
2021-08-10 23:45:36 +00:00
Chao Liu
400efa2b88 update CK build script
[ROCm/composable_kernel commit: b2589957f3]
2021-08-10 22:19:13 +00:00
Chao Liu
a0df0eb029 fix kernel filename
[ROCm/composable_kernel commit: 2c48039d0e]
2021-08-10 22:15:23 +00:00
Chao Liu
7d53faec4e fix enum issue
[ROCm/composable_kernel commit: d626dccc95]
2021-08-10 20:55:13 +00:00
Chao Liu
2e7587d6db tidy
[ROCm/composable_kernel commit: 643ebd4f3e]
2021-08-10 07:07:11 +00:00
Chao Liu
a15f7c025f fix clang warning suppression
[ROCm/composable_kernel commit: ddd49ec9e7]
2021-08-10 06:20:24 +00:00
Chao Liu
ab465fca4c vector/scalar pointer cast use c-style pointer cast instead of reinterpret_cast
[ROCm/composable_kernel commit: 4f566c6221]
2021-08-10 05:55:20 +00:00
Chao Liu
17b084ce2c add c-style pointer cast
[ROCm/composable_kernel commit: 172036d728]
2021-08-10 00:01:52 -05:00
Chao Liu
72c6d11a3b tidy
[ROCm/composable_kernel commit: 76f3131939]
2021-08-09 18:49:59 -05:00
Chao Liu
7b306ba01c tidy
[ROCm/composable_kernel commit: d18428901e]
2021-08-09 18:20:02 -05:00
Chao Liu
eb778cb91b tidy
[ROCm/composable_kernel commit: f885c131d8]
2021-08-09 22:13:47 +00:00
Chao Liu
922b72948d tidy
[ROCm/composable_kernel commit: 80120f0a0c]
2021-08-09 21:10:09 +00:00
Chao Liu
a913d48b07 tidy
[ROCm/composable_kernel commit: c3efeb5e20]
2021-08-09 19:32:07 +00:00
Chao Liu
590dde14c8 tidy
[ROCm/composable_kernel commit: 56fc0842b3]
2021-08-09 19:27:49 +00:00
Chao Liu
7885261dc6 tidy
[ROCm/composable_kernel commit: 54fba515b3]
2021-08-09 17:33:32 +00:00
Chao Liu
193f7cbbcf tidy
[ROCm/composable_kernel commit: e62bae7a4a]
2021-08-09 15:11:35 +00:00
Chao Liu
71a28fc501 add tidy
[ROCm/composable_kernel commit: 24c8728942]
2021-08-08 17:41:54 +00:00
Chao Liu
c2922caeb1 fix
[ROCm/composable_kernel commit: 61487e0a00]
2021-08-07 02:31:19 +00:00
Chao Liu
b0ab2054a9 remove online compilation from CK
[ROCm/composable_kernel commit: ae98b52ad8]
2021-08-07 00:51:05 +00:00
Chao Liu
23239fa618 refactor
[ROCm/composable_kernel commit: cb95421311]
2021-08-06 22:17:51 +00:00
Chao Liu
3bbe4dae99 Merge commit '437cc595c6e206dfebb118985b5171bbc1e29eab' into composable_kernel_init_integration_v3
[ROCm/composable_kernel commit: 73ca970155]
2021-08-06 21:32:27 +00:00
Chao Liu
51d5010798 Merge pull request #7 from ROCmSoftwarePlatform/master
Master

[ROCm/composable_kernel commit: 3b86646115]
2021-08-06 16:29:58 -05:00
Chao Liu
98d5803d26 Update develop (#5)
* refactor

[ROCm/composable_kernel commit: d09ea4f4e5]
2021-08-06 16:11:15 -05:00
Chao Liu
37a9932cfc refactor
[ROCm/composable_kernel commit: 7b1ec41e5b]
2021-08-06 20:50:01 +00:00
Chao Liu
101d8ecf21 refactor
[ROCm/composable_kernel commit: 49c33aaea7]
2021-08-06 19:59:53 +00:00
Chao Liu
53e5d36013 rename
[ROCm/composable_kernel commit: 54b3e73d17]
2021-08-06 18:07:15 +00:00
Chao Liu
55cdfe9695 add solver ConvIgemmFwdV6r1DlopsNchwKcyxNkhw; rename static ck source files
[ROCm/composable_kernel commit: 3d32ae9404]
2021-07-30 17:50:17 -05:00
Chao Liu
67d45b2ee6 update to clang-format-10
[ROCm/composable_kernel commit: 82fae390fb]
2021-07-30 16:37:00 -05:00
Chao Liu
ce7d015e51 Merge pull request #2 from asroy/master
Update readme

[ROCm/composable_kernel commit: bd27ed6c38]
2021-07-28 09:43:56 -05:00
Chao Liu
1b8b55fc61 Update README.md
[ROCm/composable_kernel commit: 85a1429301]
2021-07-28 09:41:38 -05:00
Chao Liu
18bec192f5 Update README.md
[ROCm/composable_kernel commit: 56f93c6f33]
2021-07-28 09:40:44 -05:00
Chao Liu
3bcdb7879d fix building
[ROCm/composable_kernel commit: 6a1bc5939c]
2021-07-27 13:12:43 -05:00
Chao Liu
e02d6a0f21 [MIOpen Downstream] Initial MIOpen integration (#52)
* update online kernel wrapper bundle all descriptors in a tuple

* change __CONSTANT__ to CONSTANT

* rename

* adding tuning

* added IsValidCompileParameter

* reorganize

* adding tunable for fp16 and int8

* fix kernel compile warning and bug fixes

* suppress warning about cast CONSTANT (address space 4) pointer

* fix building issue

[ROCm/composable_kernel commit: f63a23acb1]
2021-07-27 00:02:27 -05:00
Chao Liu
f94e566273 reorganize files to prepare for MIOpen integration (#51)
* change olc cmake

* adding online compile to fwd-v4r5r2

* update scripts

* rename fwd-v4r5r2 to fwd-v6r1

* clean up

[ROCm/composable_kernel commit: 1264925422]
2021-07-18 00:43:05 -05:00
zjing14
440e2126bb Add xdlops v4r4r4 into online compilation (#48)
* init for v4r4 xdlops olc

* refactor wrap

* init impl of v4r4 nchw xdlops olc

* tuning

* test perf

* fixed v4r4 nhwc

* tuned v4r4 nhwc

* use gridwise_gemm_xdlops_v2r3

* swap a/b

* add pointer support into offline v2r3

* debugging v4r4r4 transform for olc

* change timer of olc

* refactor v4r4 xdlops nchw olc

* remove transform fun in v4r4 xdlops nhwc olc

Co-authored-by: Chao Liu <chao.liu2@amd.com>

[ROCm/composable_kernel commit: fbdf4332c7]
2021-07-16 23:27:08 -05:00
Chao Liu
4c61ba83c6 Change initialization method of tensor for iGEMM (#49)
* change init method

[ROCm/composable_kernel commit: 0a72e4df94]
2021-07-16 22:55:01 -05:00