Commit Graph

1325 Commits

Author SHA1 Message Date
PoYen, Chen
0fd7f85504 Use shorter template parameter name 2024-06-11 14:20:03 +00:00
PoYen, Chen
138b75bf12 Remove unused include directive 2024-06-11 14:18:24 +00:00
PoYen, Chen
16cc9eeef4 Fix unstable clang-format comment 2024-06-11 14:15:52 +00:00
PoYen, Chen
c9bbb7b142 Clean up generate.py 2024-06-11 14:15:07 +00:00
PoYen, Chen
bb6804e315 Add constness to local variables 2024-06-11 14:10:35 +00:00
PoYen, Chen
31505a2a04 Remove more debug statements 2024-06-11 14:08:39 +00:00
PoYen, Chen
5efb80347e Remove debug statements in example 2024-06-11 14:02:53 +00:00
PoYen, Chen
912a6cb2ea Remove inconsistent comment 2024-06-11 13:56:44 +00:00
PoYen, Chen
95be5c2b9d Remove no-longer used field 2024-06-11 13:46:13 +00:00
PoYen, Chen
893841d745 Undo vector size changes 2024-06-11 13:46:13 +00:00
PoYen, Chen
40c885f007 Fix wrong loop counter step logic 2024-06-11 13:46:13 +00:00
PoYen, Chen
c36cad2e6c Fix wrong LDS indexing logics 2024-06-11 13:46:13 +00:00
PoYen, Chen
d74a1d6ed1 Fix split-kv combine kernel name 2024-06-11 13:46:13 +00:00
PoYen, Chen
f3e213c0c5 Reduce # of combine kernels 2024-06-11 13:46:13 +00:00
PoYen, Chen
180b726f97 Fix wrong kBlockSize used in policy 2024-06-11 13:46:13 +00:00
PoYen, Chen
238fde80a6 Fix o_acc memory error 2024-06-11 13:46:13 +00:00
PoYen, Chen
ffd2768000 Format codes 2024-06-11 13:46:13 +00:00
PoYen, Chen
18a7223b96 Fix wrong layout of LSE/LSEacc/Oacc 2024-06-11 13:46:13 +00:00
PoYen, Chen
064afc69d9 Replace sentinel value before storing 2024-06-11 13:46:13 +00:00
PoYen, Chen
5a6b8d8606 Clean-up code 2024-06-11 13:46:13 +00:00
Po-Yen, Chen
eac0f3cc47 Fix mismatched return type 2024-06-11 13:46:13 +00:00
PoYen, Chen
9ac2654b55 Add SplitKV combine kernel codegen logics 2024-06-11 13:46:13 +00:00
PoYen, Chen
cacce74f2c Add SplitKV kernel codegen logics 2024-06-11 13:46:13 +00:00
PoYen, Chen
78b64d11c4 Generate fmha_fwd_splitkv() 2024-06-11 13:46:13 +00:00
PoYen, Chen
c928fefaae Add num_splits option and dummy split-kv api method 2024-06-11 13:46:13 +00:00
Po Yen Chen
abc7e7ed30 Merge branch 'develop' into ck_tile/fa_train 2024-06-04 16:03:01 +08:00
dependabot[bot]
76827d82ca Bump rocm-docs-core from 1.2.0 to 1.2.1 in /docs/sphinx (#1322)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 1.2.0 to 1.2.1.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v1.2.0...v1.2.1)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-03 22:41:56 -07:00
danyao12
327074c3f8 fix error in WarpGemm 2024-06-04 11:42:33 +08:00
danyao12
bdd4a87199 format 2024-06-04 08:26:53 +08:00
Illia Silin
3fa7e2a6c4 disable the hipTensor test by default, only run once daily (#1321) 2024-06-03 14:07:30 -07:00
rocking
9ceff3a5c8 Generate the instance for FA required 2024-06-03 20:03:16 +00:00
zjing14
6fb1f4e03f Post-merge fix of PR 1300 (#1313)
* add f8 gemm with multiD for both row/col wise

* change compute_type to fp8

* changed tuning parameters in the example

* add rcr example

* post-merge fix

* fix

* reduce init range
2024-05-31 22:46:41 -07:00
root
c70662a92e format 2024-06-01 01:42:45 +00:00
Jing Zhang
09e9f10f97 format 2024-05-31 13:59:47 +00:00
root
60b328d597 Merge branch 'ck_tile/fa_train' of github.com:ROCm/composable_kernel into ck_tile/fa_train 2024-05-31 13:51:37 +00:00
Jing Zhang
0d7f71779b format 2024-05-31 13:51:28 +00:00
Po Yen Chen
ff31c6a70c Merge branch 'develop' into ck_tile/fa_train 2024-05-31 15:52:47 +08:00
danyao12
87f73f30e8 Transpose -> transpose 2024-05-29 16:54:26 +08:00
danyao12
58f61716b5 CK_TILE_HOST_DEVICE in philox 2024-05-29 16:20:34 +08:00
Illia Silin
34f3dfdd61 Build CK library for all supported targets. (#1312)
* test library build for all supported targets

* increase the number of threads to build lib in CI to 64
2024-05-28 12:36:06 -07:00
dependabot[bot]
66de8a02ba Bump rocm-docs-core from 1.1.3 to 1.2.0 in /docs/sphinx (#1311)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 1.1.3 to 1.2.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v1.1.3...v1.2.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-28 11:36:09 -07:00
zjing14
80db62f08d add f8 gemm multiD with both row/col wise scale (#1300)
* add f8 gemm with multiD for both row/col wise

* change compute_type to fp8

* changed tuning parameters in the example

* add rcr example
2024-05-28 12:04:22 -05:00
danyao12
1c511b3e7d update bwd kernel launch 2024-05-28 23:14:18 +08:00
danyao12
ba6437868b Merge branch 'develop' into ck_tile/fa_train 2024-05-28 11:42:38 +08:00
carlushuang
5055b3bdcb [CK_TILE] support group from cmdline (#1295)
* support cmdline seqlen decode

* silent print

* update readme

* update kernel launch 3d

* update tile partitioner

* fix spill for bf16

* modify based on comment

* modify payload_t

* fix bug for alibi mode

* fix alibi test err

* refactor kernel launch, support select timer

* add missing file

* remove useless code

* add some comments
2024-05-28 11:13:21 +08:00
Joseph Macaranas
02fa2c298b Enable external CI pipeline triggers (#1310) 2024-05-23 18:21:34 -04:00
Illia Silin
ec2bae27ff Split the gemm_multi_abd instances. (#1306)
* split the gemm_multi_abd instances

* update the dates
2024-05-23 09:17:02 -07:00
dependabot[bot]
06a9b72caf Bump rocm-docs-core from 1.1.2 to 1.1.3 in /docs/sphinx (#1308)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 1.1.2 to 1.1.3.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v1.1.2...v1.1.3)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-23 07:45:53 -07:00
danyao12
7ed2ca79ac update generated filenames 2024-05-23 17:20:10 +08:00
danyao12
ff6f33d4f7 add bwd validation stream_config 2024-05-23 15:18:43 +08:00