Po-Yen, Chen
60221b89f8
Add constraint to array<> ctor
2024-03-13 03:32:05 -04:00
Po-Yen, Chen
5c433432fd
Fix format
2024-03-13 03:21:30 -04:00
Po-Yen, Chen
958218e9d0
Rename enum
...
Rename 'cood_transform_enum' to 'coord_transform_enum'
2024-03-13 03:15:04 -04:00
carlushuang
d962a0044b
fix compile issue in transpose
2024-03-13 15:02:45 +00:00
carlushuang
a59e655eb2
remove wrong code in store_raw()
2024-03-13 14:30:55 +00:00
Po-Yen, Chen
8103048b99
Merge branch 'ck_tile/refactor' of github.com:ROCm/composable_kernel-internal into ck_tile/refactor
2024-03-13 01:53:43 -04:00
Po-Yen, Chen
2b4e54305b
Merge function templates
2024-03-13 01:52:49 -04:00
carlushuang
9f34bcb431
re-structure tuple/array to avoid spill
2024-03-11 15:32:21 +00:00
Po-Yen, Chen
0bd76de8a6
Update executable name in test scripts
2024-03-11 01:54:48 -04:00
Po-Yen, Chen
858e52e156
Update the required Python version to 3.9
2024-03-11 01:17:52 -04:00
carlushuang
26a25eb4cd
unify as tuple_array
2024-03-06 18:36:45 +00:00
carlushuang
7df3947819
fix macro for exp2; fix warpgemm a/b in transposedC
2024-03-06 15:59:21 +00:00
carlushuang
0e7df1999f
wip fix
2024-03-06 14:31:36 +00:00
carlushuang
f549bb5d39
minor fix
2024-03-04 21:11:53 +00:00
carlushuang
a83c181bb2
naming
2024-03-04 20:49:02 +00:00
carlushuang
a67473fff8
now can build
2024-03-04 20:45:51 +00:00
carlushuang
112d521b09
fix xx
2024-03-03 23:48:31 +00:00
carlushuang
fbd25cea35
fix build wip
2024-02-29 22:27:31 +00:00
carlushuang
f69356b1d7
add code
2024-02-28 22:57:19 +00:00
Illia Silin
e60c5aea4e
Merge pull request #36 from ROCm/lwpck-1299
...
Initial MI350 enablement.
2024-02-15 09:20:20 -08:00
illsilin
63df00cdd7
disable examples 31 and 41 int8 on gfx950
2024-02-14 17:25:10 -08:00
illsilin
e60bf36c9e
fix clang format
2024-02-14 16:16:38 -08:00
illsilin
d66da6bee9
initial enablement of gfx950
2024-02-14 15:33:50 -08:00
Illia Silin
29dcb956db
Merge pull request #33 from ROCm/lwpck-1292
...
Merge from the public repo.
2024-02-08 12:32:07 -08:00
illsilin
cbcc844e93
merge from public repo
2024-02-08 12:24:02 -08:00
Lakhinder Walia
1f306024d0
fast_gelu: minor code reorg to enhance ref & gpu performance ( #1162 )
2024-02-07 19:24:51 -08:00
Illia Silin
1b0fbaebbb
Split-up instances to improve build times. ( #1159 )
...
* split up splitk-gemm instances
* clean up some unused variables
* split the mk_kn_mn interwave splitk-gemm instances
* split up f16_f16_f16 mk_nk_mn splitk gemm instances
* fix clang format
* fix function names
* fix typo
* split up the 2 largest fp16*fp8 splitk gemm instances
* get rid of unused variables
* split up the largest splitk-gemm fp8*fp16 instance file
* split up the instances for xdl fp8 gemms
* split the headers for f16 and i8 for wmma convolution instances
2024-02-07 12:47:12 -08:00
jakpiase
ba86eadce5
Add support for mixed-precision f16bf16_int8 gemm ( #1127 )
2024-02-07 15:54:13 +01:00
dependabot[bot]
753cef783f
Bump rocm-docs-core from 0.33.1 to 0.33.2 in /docs/sphinx ( #1160 )
...
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.1 to 0.33.2.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.1...v0.33.2)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-02-06 21:24:32 -08:00
Bartlomiej Wroblewski
6951858221
Implement direct loads split-K GEMM kernel ( #1137 )
...
* WIP: Implement direct loads split-K GEMM kernel
* Clean the review
---------
Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>
Co-authored-by: Bartłomiej Kocot <barkocot@amd.com>
2024-02-07 01:08:34 +01:00
dependabot[bot]
6299621107
Bump rocm-docs-core from 0.33.0 to 0.33.1 in /docs/sphinx ( #1158 )
...
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.0 to 0.33.1.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.0...v0.33.1)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-02-06 09:24:34 -08:00
Illia Silin
f0dd1da088
Delete any dangling images after building a new one. ( #1155 )
...
* delete dangling docker images
* fix groovy syntax
* fix groovy syntax again
* try a different way to delete dangling images
2024-02-05 10:34:47 -08:00
Illia Silin
180f16f9ac
Add support for more Navi2x and Navi3x models. ( #1152 )
...
* add support for navi2x and navi3x models
* fix syntax
* use common macro for different mi300 architectures
2024-02-02 11:35:26 -08:00
Bartłomiej Kocot
171ca260b5
Extend gemm traits number for ck wrapper ( #1153 )
2024-02-02 11:25:54 -08:00
Illia Silin
112b691bb7
add new performance tests for mixed fp16/fp8 gemms ( #1151 )
2024-01-31 13:27:17 -08:00
Bartłomiej Kocot
f3b6c23ac5
Add blockwise gemm to ck wrapper ( #1139 )
...
* Add blockwise gemm to ck wrapper
* Add blockwise gemm traits
* Disable test_gemm for non xdl devices
* Fixes
* Add c layout descriptions
2024-01-31 21:24:40 +01:00
Illia Silin
6651a124cc
update the name of the compiler staging branch ( #1148 )
2024-01-30 13:55:31 -08:00
Illia Silin
e7495e6bb7
turn off performance tests in CI by default until the infrastructure is fixed ( #1147 )
2024-01-30 13:14:58 -08:00
dependabot[bot]
84832fc42d
Bump rocm-docs-core from 0.31.0 to 0.33.0 in /docs/sphinx ( #1144 )
...
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-01-29 09:02:52 -08:00
Illia Silin
4a8297c28a
fix CK path for hipTensor ( #1143 )
2024-01-25 17:05:43 -08:00
rocking
28f68a5a99
layernorm & groupnorm bwd gamma beta ( #1133 )
...
* Add layernorm bwd gamma beta external api
* Add groupnorm external api
* Add layernorm bwd gamma beta profiler
* Add groupnorm bwd gamma beta ckProfiler
* Add layernorm & groupnorm bwd gamma beta test
* Fix groupnorm bwd gamma beta profiler bug
* Layernorm bwd weight client example
* Groupnorm bwd weight client example
* clang format
* Remove useless header
* Let inv_std be positive
* Rename to num_bytes and move this calculation outside the loop
2024-01-25 19:53:15 +08:00
Illia Silin
180e572076
Fixing most of the cppcheck errors. ( #1142 )
...
* fix cppcheck errors, first pass
* fix format
* fix returned value in examples
* add macro definitions for cppcheck
* fix the profile_gemm logic
* update the gemm profiler logic
* add more definitions to cppcheck, fix a couple more errors
* replace runtime error with message in device function
* fix a couple of int4 issues
* no return for fill function
* fix errors in data_types.hpp
* fix format
* fix few remaining errors
* fix errors in data_types.hpp
* fix last couple of errors in data_types.hpp
2024-01-24 13:47:48 -08:00
Bartłomiej Kocot
6169fbbdb3
Fix possible linting errors in changelog ( #1141 )
...
* Fix possible linting errors in changelog
* Update CHANGELOG.md
* Update CHANGELOG.md
* Update CHANGELOG.md
2024-01-24 17:19:02 +01:00
zjing14
1be4706366
fixed return ( #1138 )
2024-01-22 08:42:26 -08:00
Haocong WANG
bb63b9732c
[GEMM] Optimization for MI200/300. ( #1135 )
...
* Optimize GEMM on MI200/300:
1. Add new blockwise gemm pipeline
2. Add irregular splitk instances
* clang format + typo fix
* Fix a bug
2024-01-19 07:02:22 -06:00
Bartłomiej Kocot
7e4eb4b800
Add optimized copy to ck wrapper ( #1126 )
...
* Add optimized copy to ck wrapper
* Example optimizations
* Fixes
* Move img2col test to client example
* Refactor example
* Fix docs
* Fixes
* Fix
* Fixes
* Fixes
* Fixes
* Fixes
* Fixes
---------
Co-authored-by: zjing14 <zhangjing14@gmail.com>
2024-01-19 11:29:00 +01:00
Illia Silin
38882d8ab5
add Adam to code owners ( #1136 )
2024-01-18 19:20:40 -06:00
randyh62
402a930a4a
Randyh docfix ( #1130 )
...
* Update LICENSE
update to 2024
* Update index.rst
change license.md to license.html
* fix syntax
---------
Co-authored-by: illsilin <Illia.Silin@amd.com>
2024-01-16 09:00:37 -08:00
Illia Silin
c1b5b58192
add code owners ( #1132 )
2024-01-16 07:55:18 -08:00
Illia Silin
e6d099c830
Add cppcheck to CK CI. ( #1125 )
...
* add cppcheck to the CK CI
* fix the path to CK source for cppcheck
* fix the path to CK source for cppcheck one more time
* fix the path to CK source for cppcheck third time
* change the path to ck_cppcheck.log
* install latest cppcheck from source
* fix bug in ck.hpp and use 20 threads for cppcheck
* create a switch to turn cppcheck on and off in CI
2024-01-15 09:11:45 -08:00