1289 Commits

Author SHA1 Message Date
Bartłomiej Kocot
933951ed48 Fix continuous dim selection in contraction (#1336)
* Fix continuous dim selection in contraction

* Fixes
2024-06-18 10:26:49 +02:00
carlushuang
17ed368f58 [CK_TILE][FA] using pk f16_f32 (#1343)
* [CK_TILE][FA] using pk f16_f32

* correct an error
2024-06-17 17:16:46 +08:00
zjing14
e02103168a disabled lds direct load inline asm (#1331) 2024-06-16 20:33:47 -05:00
Bartłomiej Kocot
dc1e9c5df9 Support large tensors in grouped conv fwd (#1332)
* Support large tensors in grouped conv fwd

* Multi ABD fixes

* Fix calculate element space size
2024-06-14 09:53:03 -05:00
Qianfeng
37a347e380 Fix to the using of static_for in amd_buffer_addressing.hpp (#1337)
* Add insert_dummy_dep_per_dword over-loading for length 64

* Fix insert_dummy_dep_per_dword and remove over-loading for length 64

* Remove blank lines

---------

Co-authored-by: Po Yen Chen <PoYen.Chen@amd.com>
2024-06-13 16:12:20 +08:00
Rostyslav Geyyer
acda4c5a3c Add instances for grouped conv fwd 3d with ConvScale for fp8@bf8->fp8 (#1325)
* Add fp8 bf8 conv example

* Add instances

* Add client example

* Add random scale values

* Format
2024-06-12 14:41:56 -05:00
Bartłomiej Kocot
5fc1bee4c5 Fix nhwgc f16 wmma instances (#1328) 2024-06-11 09:52:38 +02:00
Rostyslav Geyyer
ce66277a76 Add a convinvscale op, related instances and examples (#1307)
* Update the element op

* Add an example

* Add instances

* Add a client example

* make sure new instances only build on gfx9

* Update element op and its handling

* Format

* Update instances to take element op as an argument

* Update examples to use random scale values

* Format

* Update client example with random scales

* Format

---------

Co-authored-by: illsilin <Illia.Silin@amd.com>
2024-06-10 14:48:49 -05:00
dependabot[bot]
8f5690c4bb Bump rocm-docs-core from 1.3.0 to 1.4.0 in /docs/sphinx (#1327)
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.3.0 to 1.4.0.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.3.0...v1.4.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-06 22:38:26 -07:00
Bartłomiej Kocot
ac58cc5d1d Integrate universal gemm with conv forward (#1320)
* Integrate universal gemm with conv fwd

* Fix conv fwd wmma test

* Fix instances

* Remove direct load check
2024-06-05 13:01:29 -05:00
dependabot[bot]
ba82beb9bf Bump rocm-docs-core from 1.2.1 to 1.3.0 in /docs/sphinx (#1324)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 1.2.1 to 1.3.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v1.2.1...v1.3.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-05 07:36:39 -07:00
Rostyslav Geyyer
cb0645bedc Add a scale op, related instances and examples (#1242)
* Add a scale op

* Update the element op

* Add instances

* Add an example

* Add a client example

* Add a flag check

* Revert flag check addition

* Fix flag check

* Update d strides in example

* Update d strides in client example

* Apply suggestions from code review

Update copyright header

Co-authored-by: Bartłomiej Kocot <barkocot@amd.com>

* Move the example

* Move the client example

* Update element op

* Update example with the new element op

* Add scalar layout

* Update example

* Update kernel for scalar Ds

* Revert kernel changes

* Update element op

* Update example to use scales' pointers

* Format

* Update instances

* Update client example

* Move element op to unary elements

* Update element op to work with values instead of pointers

* Update instances to take element op as an argument

* Update examples to use random scale values

---------

Co-authored-by: Bartłomiej Kocot <barkocot@amd.com>
2024-06-04 19:28:15 -05:00
Dan Yao
2cab8d39e3 CK Tile FA Training kernels (#1286)
* FA fwd dropout

* FA bwd

* epilogue reuse

* CMakeLists update

* [CK_TILE] support alibi (#1269)

* add alibi support

* fix code

* update code based on comment

* Support more hdim

* fix fp8 bias

* support seqlen_k=0 case

* remove unused printf

* fix format

---------

Co-authored-by: rocking <ChunYu.Lai@amd.com>

* now fwd/bwd can build

* bwd alibi

* add bwd validation stream_config

* update generated filenames

* update bwd kernel launch

* CK_TILE_HOST_DEVICE in philox

* Transpose -> transpose

* format

* format

* format

* Generate the instance for FA required

* format

* fix error in WarpGemm

---------

Co-authored-by: danyao12 <danyao12>
Co-authored-by: carlushuang <carlus.huang@amd.com>
Co-authored-by: rocking <ChunYu.Lai@amd.com>
Co-authored-by: Po Yen Chen <PoYen.Chen@amd.com>
Co-authored-by: Jing Zhang <jizhan@amd.com>
2024-06-04 13:12:45 -05:00
dependabot[bot]
76827d82ca Bump rocm-docs-core from 1.2.0 to 1.2.1 in /docs/sphinx (#1322)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 1.2.0 to 1.2.1.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v1.2.0...v1.2.1)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-03 22:41:56 -07:00
Illia Silin
3fa7e2a6c4 disable the hipTensor test by default, only run once daily (#1321) 2024-06-03 14:07:30 -07:00
zjing14
6fb1f4e03f Post-merge fix of PR 1300 (#1313)
* add f8 gemm with multiD for both row/col wise

* change compute_type to fp8

* changed tuning parameters in the example

* add rcr example

* post-merge fix

* fix

* reduce init range
2024-05-31 22:46:41 -07:00
Illia Silin
34f3dfdd61 Build CK library for all supported targets. (#1312)
* test library build for all supported targets

* increase the number of threads to build lib in CI to 64
2024-05-28 12:36:06 -07:00
dependabot[bot]
66de8a02ba Bump rocm-docs-core from 1.1.3 to 1.2.0 in /docs/sphinx (#1311)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 1.1.3 to 1.2.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v1.1.3...v1.2.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-28 11:36:09 -07:00
zjing14
80db62f08d add f8 gemm multiD with both row/col wise scale (#1300)
* add f8 gemm with multiD for both row/col wise

* change compute_type to fp8

* changed tuning parameters in the example

* add rcr example
2024-05-28 12:04:22 -05:00
carlushuang
5055b3bdcb [CK_TILE] support group from cmdline (#1295)
* support cmdline seqlen decode

* silent print

* update readme

* update kernel launch 3d

* update tile partitioner

* fix spill for bf16

* modify based on comment

* modify payload_t

* fix bug for alibi mode

* fix alibi test err

* refactor kernel launch, support select timer

* add missing file

* remove useless code

* add some comments
2024-05-28 11:13:21 +08:00
Joseph Macaranas
02fa2c298b Enable external CI pipeline triggers (#1310) 2024-05-23 18:21:34 -04:00
Illia Silin
ec2bae27ff Split the gemm_multi_abd instances. (#1306)
* split the gemm_multi_abd instances

* update the dates
2024-05-23 09:17:02 -07:00
dependabot[bot]
06a9b72caf Bump rocm-docs-core from 1.1.2 to 1.1.3 in /docs/sphinx (#1308)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 1.1.2 to 1.1.3.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v1.1.2...v1.1.3)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-23 07:45:53 -07:00
Max Podkorytov
29e58d5b28 Make the library which generates CK instances for pytorch2 inductor's CK backend usage
Also bundle the CK library and include files with the pip package.

The package is pip-installable with
`pip install git+https://github.com/tenpercent/composable_kernel@enable-pip`

(substitute the repo path and branch if necessary)

Testing:

`myenv/bin/python3 -m ck4inductor.universal_gemm.gen_instances`

(prints a list of instances)

`tree myenv/lib/python3.12/site-packages/ck4inductor`

(observe the list of sources along the installed package)
2024-05-22 13:44:22 -07:00
Bartłomiej Kocot
fd72380aeb Optimize grouped conv bwd weight for small M and N (#1303)
* Optimize grouped conv bwd weight for small M and N

* Fixes
2024-05-22 21:01:01 +02:00
Illia Silin
7b027d5643 Select appropriate GPU targets for instances, tests, and examples. (#1304)
* set individual gpu targets for instances, examples, tests

* fix path to hip compiler

* fix path to hip compiler once more

* aggregate device macros in ck_tile config header

* fix the cmake logic for instances

* fix clang format

* add gfx900 and gfx906 to default set of targets
2024-05-22 11:45:27 -07:00
Rostyslav Geyyer
204da9c522 Move grouped conv fwd client examples (#1299)
* Move grouped conv fwd client examples

* Update existing examples

* Format
2024-05-21 09:52:41 -05:00
Illia Silin
06b891c5c2 aggregate device macros in ck_tile config header (#1297) 2024-05-20 08:34:45 -07:00
Illia Silin
1274861a9d replace the ENV macro with CK_ENV (#1296) 2024-05-17 10:42:51 -07:00
dependabot[bot]
6637a810d0 Bump rocm-docs-core from 1.1.1 to 1.1.2 in /docs/sphinx (#1293)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 1.1.1 to 1.1.2.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v1.1.1...v1.1.2)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-17 07:44:48 -07:00
rocking
aaa8dfdae9 Fix compile error (#1292)
error: no viable conversion from returned value of type '__half' to function return type 'fp16_hip_t' (aka '_Float16')

Co-authored-by: carlushuang <carlus.huang@amd.com>
2024-05-17 17:19:17 +08:00
Illia Silin
c44137838e remove wrong use of nonexistent class members (#1290) 2024-05-15 08:08:17 -07:00
carlushuang
dd0dd13d4e remove operator-deref (#1291) 2024-05-15 08:06:50 -07:00
jakpiase
3e3471d5d2 Add unit tests for grouped gemm two stage (#1256)
* add unit tests for grouped gemm two stage

* add reviewers suggestions

---------

Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>
2024-05-15 10:03:39 +02:00
Illia Silin
7843a8a7fb re-enable convnd_fwd_xdl_fp64 testing (#1289) 2024-05-10 22:48:28 -07:00
Illia Silin
566b6480a2 Code clean-up (#1285)
* code clean-up

* remove the profiling output samples
2024-05-10 09:41:39 -07:00
carlushuang
fcba889ef4 [CK_TILE] fix some rand number init (#1287)
* add random norm

* normalized default to 0/3

* change squant->auto
2024-05-10 09:03:39 -07:00
Bartłomiej Kocot
8346af9c68 Change output gemm type to AccDataType in two stage conv bwd wei (#1283) 2024-05-10 10:57:42 +02:00
Adam Osewski
a0ae1c6133 Fix MakeArgument (#1284) 2024-05-09 09:42:41 -07:00
Adam Osewski
3c043cd10b Add vector instruction coherency bits for gfx94 targets. (#1268) 2024-05-09 07:30:17 -07:00
Illia Silin
fdbf8ccbd7 fix the output formatting (#1282) 2024-05-08 16:11:54 -07:00
Bartłomiej Kocot
0b6b5d1785 Add two stage grouped conv bwd weight kernel (#1280) 2024-05-08 09:53:24 +02:00
Illia Silin
bf42097646 Enable logging in CK with environment variable. (#1278)
* enable logging using environment variable

* update ck.hpp header

* fix typo

* fix clang format

* Update include/ck/utility/env.hpp

Co-authored-by: Bartłomiej Kocot <barkocot@amd.com>

---------

Co-authored-by: Bartłomiej Kocot <barkocot@amd.com>
2024-05-07 16:26:43 -07:00
carlushuang
851c3ed157 [CK_TILE] support alibi (#1269)
* add alibi support

* fix code

* update code based on comment

* Support more hdim

* fix fp8 bias

* support seqlen_k=0 case

* remove unused printf

* fix format

---------

Co-authored-by: rocking <ChunYu.Lai@amd.com>
2024-05-07 22:32:54 +08:00
Sam Wu
6d073d31bb Add ROCm Doc team as codeowners for RTD yaml (#1277)
Also add component owners as codeowners for header directory
2024-05-06 10:07:39 -06:00
Illia Silin
08d51d9bc4 add missing vector header (#1275) 2024-05-02 11:27:59 -07:00
Illia Silin
7797f7c7a1 Downgrade minimum required python version to 3.6 (#1274) 2024-05-01 15:34:56 -07:00
Illia Silin
f0bf1e3125 [CI] Focus CI stages on MI200 nodes for resource optimization (#1273) 2024-05-01 10:07:14 -07:00
Rostyslav Geyyer
a2d0bdd5a9 Add an ignore (#1270) 2024-04-30 20:45:22 -07:00
Sam Wu
43579900a9 Update documentation requirements and configurations (#1272)
* Update documentation requirements

Set rocm-docs-core to v1.1.1

* Update RTD config

Set Python 3.10 for rocm-docs-core >= v1.0.0
2024-04-30 20:44:59 -07:00