Bartłomiej Kocot
933951ed48
Fix continuous dim selection in contraction ( #1336 )
...
* Fix continuous dim selection in contraction
* Fixes
2024-06-18 10:26:49 +02:00
carlushuang
17ed368f58
[CK_TILE][FA] using pk f16_f32 ( #1343 )
...
* [CK_TILE][FA] using pk f16_f32
* correct an error
2024-06-17 17:16:46 +08:00
zjing14
e02103168a
disabled lds direct load inline asm ( #1331 )
2024-06-16 20:33:47 -05:00
Bartłomiej Kocot
dc1e9c5df9
Support large tensors in grouped conv fwd ( #1332 )
...
* Support large tensors in grouped conv fwd
* Multi ABD fixes
* Fix calculate element space size
2024-06-14 09:53:03 -05:00
Qianfeng
37a347e380
Fix the use of static_for in amd_buffer_addressing.hpp ( #1337 )
...
* Add insert_dummy_dep_per_dword over-loading for length 64
* Fix insert_dummy_dep_per_dword and remove over-loading for length 64
* Remove blank lines
---------
Co-authored-by: Po Yen Chen <PoYen.Chen@amd.com>
2024-06-13 16:12:20 +08:00
Rostyslav Geyyer
acda4c5a3c
Add instances for grouped conv fwd 3d with ConvScale for fp8@bf8->fp8 ( #1325 )
...
* Add fp8 bf8 conv example
* Add instances
* Add client example
* Add random scale values
* Format
2024-06-12 14:41:56 -05:00
Bartłomiej Kocot
5fc1bee4c5
Fix nhwgc f16 wmma instances ( #1328 )
2024-06-11 09:52:38 +02:00
Rostyslav Geyyer
ce66277a76
Add a convinvscale op, related instances and examples ( #1307 )
...
* Update the element op
* Add an example
* Add instances
* Add a client example
* make sure new instances only build on gfx9
* Update element op and its handling
* Format
* Update instances to take element op as an argument
* Update examples to use random scale values
* Format
* Update client example with random scales
* Format
---------
Co-authored-by: illsilin <Illia.Silin@amd.com>
2024-06-10 14:48:49 -05:00
dependabot[bot]
8f5690c4bb
Bump rocm-docs-core from 1.3.0 to 1.4.0 in /docs/sphinx ( #1327 )
...
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.3.0 to 1.4.0.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.3.0...v1.4.0)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-06 22:38:26 -07:00
Bartłomiej Kocot
ac58cc5d1d
Integrate universal gemm with conv forward ( #1320 )
...
* Integrate universal gemm with conv fwd
* Fix conv fwd wmma test
* Fix instances
* Remove direct load check
2024-06-05 13:01:29 -05:00
dependabot[bot]
ba82beb9bf
Bump rocm-docs-core from 1.2.1 to 1.3.0 in /docs/sphinx ( #1324 )
...
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 1.2.1 to 1.3.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v1.2.1...v1.3.0)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-05 07:36:39 -07:00
Rostyslav Geyyer
cb0645bedc
Add a scale op, related instances and examples ( #1242 )
...
* Add a scale op
* Update the element op
* Add instances
* Add an example
* Add a client example
* Add a flag check
* Revert flag check addition
* Fix flag check
* Update d strides in example
* Update d strides in client example
* Apply suggestions from code review
Update copyright header
Co-authored-by: Bartłomiej Kocot <barkocot@amd.com>
* Move the example
* Move the client example
* Update element op
* Update example with the new element op
* Add scalar layout
* Update example
* Update kernel for scalar Ds
* Revert kernel changes
* Update element op
* Update example to use scales' pointers
* Format
* Update instances
* Update client example
* Move element op to unary elements
* Update element op to work with values instead of pointers
* Update instances to take element op as an argument
* Update examples to use random scale values
---------
Co-authored-by: Bartłomiej Kocot <barkocot@amd.com>
2024-06-04 19:28:15 -05:00
Dan Yao
2cab8d39e3
CK Tile FA Training kernels ( #1286 )
...
* FA fwd dropout
* FA bwd
* epilogue reuse
* CMakeLists update
* [CK_TILE] support alibi (#1269)
* add alibi support
* fix code
* update code based on comment
* Support more hdim
* fix fp8 bias
* support seqlen_k=0 case
* remove unused printf
* fix format
---------
Co-authored-by: rocking <ChunYu.Lai@amd.com>
* now fwd/bwd can build
* bwd alibi
* add bwd validation stream_config
* update generated filenames
* update bwd kernel launch
* CK_TILE_HOST_DEVICE in philox
* Transpose -> transpose
* format
* format
* format
* Generate the instance for FA required
* format
* fix error in WarpGemm
---------
Co-authored-by: danyao12 <danyao12>
Co-authored-by: carlushuang <carlus.huang@amd.com>
Co-authored-by: rocking <ChunYu.Lai@amd.com>
Co-authored-by: Po Yen Chen <PoYen.Chen@amd.com>
Co-authored-by: Jing Zhang <jizhan@amd.com>
2024-06-04 13:12:45 -05:00
dependabot[bot]
76827d82ca
Bump rocm-docs-core from 1.2.0 to 1.2.1 in /docs/sphinx ( #1322 )
...
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 1.2.0 to 1.2.1.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v1.2.0...v1.2.1)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-03 22:41:56 -07:00
Illia Silin
3fa7e2a6c4
disable the hipTensor test by default, only run once daily ( #1321 )
2024-06-03 14:07:30 -07:00
zjing14
6fb1f4e03f
Post-merge fix of PR 1300 ( #1313 )
...
* add f8 gemm with multiD for both row/col wise
* change compute_type to fp8
* changed tuning parameters in the example
* add rcr example
* post-merge fix
* fix
* reduce init range
2024-05-31 22:46:41 -07:00
Illia Silin
34f3dfdd61
Build CK library for all supported targets. ( #1312 )
...
* test library build for all supported targets
* increase the number of threads to build lib in CI to 64
2024-05-28 12:36:06 -07:00
dependabot[bot]
66de8a02ba
Bump rocm-docs-core from 1.1.3 to 1.2.0 in /docs/sphinx ( #1311 )
...
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 1.1.3 to 1.2.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v1.1.3...v1.2.0)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-28 11:36:09 -07:00
zjing14
80db62f08d
add f8 gemm multiD with both row/col wise scale ( #1300 )
...
* add f8 gemm with multiD for both row/col wise
* change compute_type to fp8
* changed tuning parameters in the example
* add rcr example
2024-05-28 12:04:22 -05:00
carlushuang
5055b3bdcb
[CK_TILE] support group from cmdline ( #1295 )
...
* support cmdline seqlen decode
* silent print
* update readme
* update kernel launch 3d
* update tile partitioner
* fix spill for bf16
* modify based on comment
* modify payload_t
* fix bug for alibi mode
* fix alibi test err
* refactor kernel launch, support select timer
* add missing file
* remove useless code
* add some comments
2024-05-28 11:13:21 +08:00
Joseph Macaranas
02fa2c298b
Enable external CI pipeline triggers ( #1310 )
2024-05-23 18:21:34 -04:00
Illia Silin
ec2bae27ff
Split the gemm_multi_abd instances. ( #1306 )
...
* split the gemm_multi_abd instances
* update the dates
2024-05-23 09:17:02 -07:00
dependabot[bot]
06a9b72caf
Bump rocm-docs-core from 1.1.2 to 1.1.3 in /docs/sphinx ( #1308 )
...
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 1.1.2 to 1.1.3.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v1.1.2...v1.1.3)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-23 07:45:53 -07:00
Max Podkorytov
29e58d5b28
Make the library which generates CK instances for pytorch2 inductor's CK backend usage
...
Also bundle the CK library and include files with the pip package.
The package is pip-installable with
`pip install git+https://github.com/tenpercent/composable_kernel@enable-pip`
(substitute the repo path and branch if necessary)
Testing:
`myenv/bin/python3 -m ck4inductor.universal_gemm.gen_instances`
(prints a list of instances)
`tree myenv/lib/python3.12/site-packages/ck4inductor`
(observe the list of sources along the installed package)
2024-05-22 13:44:22 -07:00
Bartłomiej Kocot
fd72380aeb
Optimize grouped conv bwd weight for small M and N ( #1303 )
...
* Optimize grouped conv bwd weight for small M and N
* Fixes
2024-05-22 21:01:01 +02:00
Illia Silin
7b027d5643
Select appropriate GPU targets for instances, tests, and examples. ( #1304 )
...
* set individual gpu targets for instances, examples, tests
* fix path to hip compiler
* fix path to hip compiler once more
* aggregate device macros in ck_tile config header
* fix the cmake logic for instances
* fix clang format
* add gfx900 and gfx906 to default set of targets
2024-05-22 11:45:27 -07:00
Rostyslav Geyyer
204da9c522
Move grouped conv fwd client examples ( #1299 )
...
* Move grouped conv fwd client examples
* Update existing examples
* Format
2024-05-21 09:52:41 -05:00
Illia Silin
06b891c5c2
aggregate device macros in ck_tile config header ( #1297 )
2024-05-20 08:34:45 -07:00
Illia Silin
1274861a9d
replace the ENV macro with CK_ENV ( #1296 )
2024-05-17 10:42:51 -07:00
dependabot[bot]
6637a810d0
Bump rocm-docs-core from 1.1.1 to 1.1.2 in /docs/sphinx ( #1293 )
...
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 1.1.1 to 1.1.2.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v1.1.1...v1.1.2)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-17 07:44:48 -07:00
rocking
aaa8dfdae9
Fix compile error ( #1292 )
...
error: no viable conversion from returned value of type '__half' to function return type 'fp16_hip_t' (aka '_Float16')
Co-authored-by: carlushuang <carlus.huang@amd.com>
2024-05-17 17:19:17 +08:00
Illia Silin
c44137838e
remove wrong use of nonexistent class members ( #1290 )
2024-05-15 08:08:17 -07:00
carlushuang
dd0dd13d4e
remove operator-deref ( #1291 )
2024-05-15 08:06:50 -07:00
jakpiase
3e3471d5d2
Add unit tests for grouped gemm two stage ( #1256 )
...
* add unit tests for grouped gemm two stage
* add reviewers' suggestions
---------
Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>
2024-05-15 10:03:39 +02:00
Illia Silin
7843a8a7fb
re-enable convnd_fwd_xdl_fp64 testing ( #1289 )
2024-05-10 22:48:28 -07:00
Illia Silin
566b6480a2
Code clean-up ( #1285 )
...
* code clean-up
* remove the profiling output samples
2024-05-10 09:41:39 -07:00
carlushuang
fcba889ef4
[CK_TILE] fix some rand number init ( #1287 )
...
* add random norm
* normalized default to 0/3
* change squant->auto
2024-05-10 09:03:39 -07:00
Bartłomiej Kocot
8346af9c68
Change output gemm type to AccDataType in two stage conv bwd wei ( #1283 )
2024-05-10 10:57:42 +02:00
Adam Osewski
a0ae1c6133
Fix MakeArgument ( #1284 )
2024-05-09 09:42:41 -07:00
Adam Osewski
3c043cd10b
Add vector instruction coherency bits for gfx94 targets. ( #1268 )
2024-05-09 07:30:17 -07:00
Illia Silin
fdbf8ccbd7
fix the output formatting ( #1282 )
2024-05-08 16:11:54 -07:00
Bartłomiej Kocot
0b6b5d1785
Add two stage grouped conv bwd weight kernel ( #1280 )
2024-05-08 09:53:24 +02:00
Illia Silin
bf42097646
Enable logging in CK with environment variable. ( #1278 )
...
* enable logging using environment variable
* update ck.hpp header
* fix typo
* fix clang format
* Update include/ck/utility/env.hpp
Co-authored-by: Bartłomiej Kocot <barkocot@amd.com>
---------
Co-authored-by: Bartłomiej Kocot <barkocot@amd.com>
2024-05-07 16:26:43 -07:00
carlushuang
851c3ed157
[CK_TILE] support alibi ( #1269 )
...
* add alibi support
* fix code
* update code based on comment
* Support more hdim
* fix fp8 bias
* support seqlen_k=0 case
* remove unused printf
* fix format
---------
Co-authored-by: rocking <ChunYu.Lai@amd.com>
2024-05-07 22:32:54 +08:00
Sam Wu
6d073d31bb
Add ROCm Doc team as codeowners for RTD yaml ( #1277 )
...
Also add component owners as codeowners for header directory
2024-05-06 10:07:39 -06:00
Illia Silin
08d51d9bc4
add missing vector header ( #1275 )
2024-05-02 11:27:59 -07:00
Illia Silin
7797f7c7a1
Downgrade minimum required python version to 3.6 ( #1274 )
2024-05-01 15:34:56 -07:00
Illia Silin
f0bf1e3125
[CI] Focus CI stages on MI200 nodes for resource optimization ( #1273 )
2024-05-01 10:07:14 -07:00
Rostyslav Geyyer
a2d0bdd5a9
Add an ignore ( #1270 )
2024-04-30 20:45:22 -07:00
Sam Wu
43579900a9
Update documentation requirements and configurations ( #1272 )
...
* Update documentation requirements
Set rocm-docs-core to v1.1.1
* Update RTD config
Set Python 3.10 for rocm-docs-core >= v1.0.0
2024-04-30 20:44:59 -07:00