1224 Commits

Author SHA1 Message Date
carlushuang
200d2b22d4 fix scratch in fp8 kernel 2024-03-25 19:45:38 +00:00
Po-Yen, Chen
1cacb713c5 Default use CK_TILE_FLOAT_TO_FP8_STOCHASTIC rounding mode 2024-03-23 22:51:18 -04:00
carlushuang
bb1f6e48eb fix fp8 duplicated move/shift/and/or problem 2024-03-19 23:29:57 +00:00
carlushuang
886d040a81 fix compile error, fp8 not ready now 2024-03-18 07:58:00 +00:00
carlushuang
f55c7629bc not using custom data type by default, now we can have ISA-level same code as opt_padding 2024-03-17 23:23:32 +00:00
carlushuang
ee397d0ab2 temp fix buffer_store spill 2024-03-15 22:56:41 +00:00
carlushuang
04762d212b make sure thread_buffer can be tuple/array 2024-03-13 22:03:42 +00:00
carlushuang
616932068d let more integral_constant->constant, and formatting 2024-03-13 18:33:10 +00:00
Po-Yen, Chen
b1dbf64c91 Some minor changes 2024-03-13 03:55:07 -04:00
Po-Yen, Chen
8d1631adc9 Re-use function 2024-03-13 03:38:12 -04:00
Po-Yen, Chen
60221b89f8 Add constraint to array<> ctor 2024-03-13 03:32:05 -04:00
Po-Yen, Chen
5c433432fd Fix format 2024-03-13 03:21:30 -04:00
Po-Yen, Chen
958218e9d0 Rename enum
Rename 'cood_transform_enum' to 'coord_transform_enum'
2024-03-13 03:15:04 -04:00
carlushuang
d962a0044b fix compile issue in transpose 2024-03-13 15:02:45 +00:00
carlushuang
a59e655eb2 remove wrong code in store_raw() 2024-03-13 14:30:55 +00:00
Po-Yen, Chen
8103048b99 Merge branch 'ck_tile/refactor' of github.com:ROCm/composable_kernel-internal into ck_tile/refactor 2024-03-13 01:53:43 -04:00
Po-Yen, Chen
2b4e54305b Merge function templates 2024-03-13 01:52:49 -04:00
carlushuang
9f34bcb431 re-structure tuple/array to avoid spill 2024-03-11 15:32:21 +00:00
Po-Yen, Chen
0bd76de8a6 Update executable name in test scripts 2024-03-11 01:54:48 -04:00
Po-Yen, Chen
858e52e156 Update the required Python version to 3.9 2024-03-11 01:17:52 -04:00
carlushuang
26a25eb4cd unify as tuple_array 2024-03-06 18:36:45 +00:00
carlushuang
7df3947819 fix macro for exp2; fix warpgemm a/b in transposedC 2024-03-06 15:59:21 +00:00
carlushuang
0e7df1999f wip fix 2024-03-06 14:31:36 +00:00
carlushuang
f549bb5d39 minor fix 2024-03-04 21:11:53 +00:00
carlushuang
a83c181bb2 naming 2024-03-04 20:49:02 +00:00
carlushuang
a67473fff8 now can build 2024-03-04 20:45:51 +00:00
carlushuang
112d521b09 fix xx 2024-03-03 23:48:31 +00:00
carlushuang
fbd25cea35 fix build wip 2024-02-29 22:27:31 +00:00
carlushuang
f69356b1d7 add code 2024-02-28 22:57:19 +00:00
Illia Silin
e60c5aea4e Merge pull request #36 from ROCm/lwpck-1299
Initial MI350 enablement.
2024-02-15 09:20:20 -08:00
illsilin
63df00cdd7 disable examples 31 and 41 int8 on gfx950 2024-02-14 17:25:10 -08:00
illsilin
e60bf36c9e fix clang format 2024-02-14 16:16:38 -08:00
illsilin
d66da6bee9 initial enablement of gfx950 2024-02-14 15:33:50 -08:00
Illia Silin
29dcb956db Merge pull request #33 from ROCm/lwpck-1292
Merge from the public repo.
2024-02-08 12:32:07 -08:00
illsilin
cbcc844e93 merge from public repo 2024-02-08 12:24:02 -08:00
Lakhinder Walia
1f306024d0 fast_gelu: minor code reorg to enhance ref & gpu performance (#1162) 2024-02-07 19:24:51 -08:00
Illia Silin
1b0fbaebbb Split-up instances to improve build times. (#1159)
* split up splitk-gemm instances

* clean up some unused variables

* split the mk_kn_mn interwave splitk-gemm instances

* split up f16_f16_f16 mk_nk_mn splitk gemm instances

* fix clang format

* fix function names

* fix typo

* split up the 2 largest fp16*fp8 splitk gemm instances

* get rid of unused variables

* split up the largest splitk-gemm fp8*fp16 instance file

* split up the instances for xdl fp8 gemms

* split the headers for f16 and i8 for wmma convolution instances
2024-02-07 12:47:12 -08:00
jakpiase
ba86eadce5 Add support for mixed-precision f16bf16_int8 gemm (#1127) 2024-02-07 15:54:13 +01:00
dependabot[bot]
753cef783f Bump rocm-docs-core from 0.33.1 to 0.33.2 in /docs/sphinx (#1160)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.1 to 0.33.2.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.1...v0.33.2)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-02-06 21:24:32 -08:00
Bartlomiej Wroblewski
6951858221 Implement direct loads split-K GEMM kernel (#1137)
* WIP: Implement direct loads split-K GEMM kernel

* Clean the review

---------

Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>
Co-authored-by: Bartłomiej Kocot <barkocot@amd.com>
2024-02-07 01:08:34 +01:00
dependabot[bot]
6299621107 Bump rocm-docs-core from 0.33.0 to 0.33.1 in /docs/sphinx (#1158)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.0 to 0.33.1.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.0...v0.33.1)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-02-06 09:24:34 -08:00
Illia Silin
f0dd1da088 Delete any dangling images after building a new one. (#1155)
* delete dangling docker images

* fix groovy syntax

* fix groovy syntax again

* try a different way to delete dangling images
2024-02-05 10:34:47 -08:00
Illia Silin
180f16f9ac Add support for more Navi2x and Navi3x models. (#1152)
* add support for navi2x and navi3x models

* fix syntax

* use common macro for different mi300 architectures
2024-02-02 11:35:26 -08:00
Bartłomiej Kocot
171ca260b5 Extend gemm traits number for ck wrapper (#1153) 2024-02-02 11:25:54 -08:00
Illia Silin
112b691bb7 add new performance tests for mixed fp16/fp8 gemms (#1151) 2024-01-31 13:27:17 -08:00
Bartłomiej Kocot
f3b6c23ac5 Add blockwise gemm to ck wrapper (#1139)
* Add blockwise gemm to ck wrapper

* Add blockwise gemm traits

* Disable test_gemm for non xdl devices

* Fixes

* Add c layout descriptions
2024-01-31 21:24:40 +01:00
Illia Silin
6651a124cc update the name of the compiler staging branch (#1148) 2024-01-30 13:55:31 -08:00
Illia Silin
e7495e6bb7 turn off performance tests in CI by default until the infrastructure is fixed (#1147) 2024-01-30 13:14:58 -08:00
dependabot[bot]
84832fc42d Bump rocm-docs-core from 0.31.0 to 0.33.0 in /docs/sphinx (#1144)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-01-29 09:02:52 -08:00
Illia Silin
4a8297c28a fix CK path for hipTensor (#1143) 2024-01-25 17:05:43 -08:00