Mirza Halilcevic
47486abb67
Fix cmake.
2024-10-02 11:47:11 +00:00
Mirza Halilcevic
52426f8498
Separate ck_host lib and gemm_softmax_gemm into different PR.
2024-10-02 11:43:44 +00:00
Mirza Halilcevic
f52c2a4de6
Address PR comments.
2024-10-02 10:43:09 +00:00
Mirza Halilcevic
e3d444c8d7
Merge remote-tracking branch 'upstream/develop' into ck_migraphx_integration
2024-10-02 08:28:49 +00:00
Illia Silin
11b7a4db00
re-enable the FMHA performance monitoring (#1539)
2024-10-01 13:17:55 -07:00
Illia Silin
8e4c3fb1bc
[CK_TILE] add missing vector header (#1537)
...
* add missing vector header
* Re-format header using remod.py
---------
Co-authored-by: Po Yen, Chen <PoYen.Chen@amd.com>
2024-10-01 07:58:20 -07:00
Po Yen Chen
a1c07e8d91
[CK_TILE] Change output accum tensor layout of fmha fwd split-kv & combine kernels (#1527)
...
* Use same layout for o_acc and o tensor
* Use better param names in partitioner
* Remove redundant kargs 'max_seqlen_q'
* Use better param names in splitkv kernel
* Add comment for additional kernel arguments
* Sync empty loop early return logics between pipelines
* Pass more arguments to cmake in scripts
* Align backslashes
* Fix wrong o_acc tensor view strides
* Change o_acc layout if o_perm=0
* Handle whole row masked via attn_bias
* Use vector width = 1 for o_acc
* Use more even split sizes
2024-10-01 22:13:52 +08:00
M.Emin Ozturk
4cd1dc7f06
Complex Contraction CK Bilinear Example (#1061)
...
* complex type contraction
* bug fix
* update
* Tensor Contraction Complex Data Type is working
* 4D Kernel
* some change
* validation check in progress
* validation issue
* fp32 verification error is fixed
* fp32 and fp64 are done
* remove old files
* remove cmake files
* remove cmake files
* Readme
* img verification
* CMakeList
* number changed
---------
Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>
Co-authored-by: Emin Ozturk <emin.ozturk@utah.edu>
2024-09-30 21:05:42 -06:00
Bartłomiej Kocot
de3e3b6424
[CK_TILE] Image to Column kernel (#1532)
...
* [CK_TILE] Image to Column kernel
* Fixes
* Vector loads and stores
* Fixes
* Fixes
* change test dir name
2024-09-27 22:57:38 +02:00
Dan Yao
9d69a099a4
[CK_TILE] Fix compiler related FA bwd issues (#1530)
...
* add barriers
* tail bias barriers
* adjust bf16/hd256 tol
* continue adjust bf16/hd256 tol
2024-09-26 12:18:39 -07:00
Illia Silin
42e6dceacc
Fix compilation errors with Clang20.0. (#1533)
...
* fix clang20 compilation errors for gfx90a
* fix clang20 compilation errors for gfx11 targets
2024-09-25 13:45:38 -07:00
Illia Silin
65f8d1440f
make CK CI use different git credentials (#1529)
2024-09-25 09:05:48 -07:00
dependabot[bot]
1c5a4d1b9f
Bump rocm-docs-core from 1.8.1 to 1.8.2 in /docs/sphinx (#1531)
...
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.8.1 to 1.8.2.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/v1.8.2/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.8.1...v1.8.2)
---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-25 06:59:25 -07:00
Mirza Halilcevic
24608d4348
Merge branch 'ck_mgx_temp' into ck_migraphx_integration
2024-09-25 09:25:42 +00:00
Mirza Halilcevic
a4fe62edd3
Merge remote-tracking branch 'upstream/develop' into ck_migraphx_integration
2024-09-25 09:02:25 +00:00
Mirza Halilcevic
eaeb3dacec
Fix codegen build issues.
2024-09-25 09:00:39 +00:00
Mirza Halilcevic
d43cd4ad32
Introduce gemm_softmax_gemm to codegen.
2024-09-25 08:22:07 +00:00
BrianHarrisonAMD
3528a523ff
Add additional instances to device_mha_instance (#1522)
...
* Add additional instances to device_mha_instance
* Add comment to describe what receipt 3 option filters
---------
Co-authored-by: Po Yen Chen <PoYen.Chen@amd.com>
2024-09-24 10:15:30 -06:00
Illia Silin
f16ebf82d4
Add a daily CI build with legacy dockers. (#1525)
...
* add an option to build CK with legacy dockers
* change the custom docker settings
* add environment variable for custom docker
* use a new variable for legacy docker name
* new way to pass docker names for legacy OS
* add legacy docker check in the Build_CK function
* change groovy syntax
* add a check for legacy docker in getDockerImage
* make sure the legacy docker name is not empty
* remove the dumb-init call
* disable the tests in legacy OS dockers
* disable tests in legacy dockers
* use a different way to disable tests in legacy dockers
* rearrange the CI stages for legacy OS
* use different way to disable tests in legacy dockers
* update LD_LIBRARY_PATH for legacy dockers and add cron job
* update LD_LIBRARY_PATH at docker launch
* change the syntax for setting LD_LIBRARY_PATH
2024-09-23 09:03:55 -07:00
Po Yen Chen
770d2b7725
Early return if seqlen_k=0 on group mode (#1524)
2024-09-22 20:05:58 +08:00
Bartłomiej Kocot
4ba52b35dc
Add support for NGCHW in grouped conv fwd (#1499)
...
* Support NGCHW in grouped conv fwd
* Remove not needed variable
* Fixes
2024-09-20 10:45:46 +02:00
Adam Osewski
0c39954da9
Remove unsupported (fp8) type from Add memory operation. (#1521)
...
The dynamic buffer does not support fp8 in the `Update` operation, so fp8 does not support `InMemoryDataOperation::Add`.
2024-09-20 09:40:45 +02:00
Thomas Ning
694c300145
Ck tile gemm padding dim (#1516)
...
* Support the N dimension padding
* Finished the padding feature for different dimension of K
2024-09-18 11:32:29 -07:00
dependabot[bot]
e84adec3ba
Bump rocm-docs-core from 1.8.0 to 1.8.1 in /docs/sphinx (#1519)
...
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.8.0 to 1.8.1.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/v1.8.1/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.8.0...v1.8.1)
---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-18 07:00:26 -07:00
Dino Musić
08255e1b45
Implement hiprtc for codegen tests
2024-09-18 11:23:20 +00:00
Illia Silin
1658c0dc11
Add rocm6.3_rc1 docker image (#1518)
...
* add image for rocm6.3_rc1
* fix deb package url
2024-09-17 15:59:26 -07:00
aledudek
a793afc961
Extend pool3d fwd avg, max operations by f8_t, int8_t types (#1483)
...
* Extend pool3d fwd avg, max operations by f8_t, int8_t types
* Pack MaxPool3dFwd params together
* Fix MaxPool3dFwd AVG instances
* Decrease verification precision for bf16
* Adjust tests + review changes
* Adjust threshold for F8
* Adjusted compute types for MAX op instances
* Fix ComputeDataType mismatch in tests and profiler for AVG
* Fix naming from max_pool3d_fwd to pool3d_fwd
* Adjust CMakeLists
---------
Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>
2024-09-17 15:57:10 +02:00
dependabot[bot]
8ec15e644e
Bump rocm-docs-core from 1.7.2 to 1.8.0 in /docs/sphinx (#1517)
...
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.7.2 to 1.8.0.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.7.2...v1.8.0)
---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-16 22:37:57 -07:00
Mateusz Ozga
6834e5ee74
This commit contains implementation of max pool2d for f8 type (#1506)
...
Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>
2024-09-16 10:15:06 +02:00
Thomas Ning
844f5a1712
Ck tile GPU verification sample develop & Add the CK TILE GEMM to the CI/CD test (#1505)
...
* Finished the feature of gpu verification
* Add the ck_tile_gemm test in the CI CD
* add the include of tensor_layout in reference_gemm
* Comment Addressed
* split ck_tile fmha and gemm tests into separate stages
* restructure the reference gemm
* restructure a new reference_gemm api that could read the device mem
---------
Co-authored-by: carlushuang <carlus.huang@amd.com>
Co-authored-by: illsilin <Illia.Silin@amd.com>
2024-09-14 21:08:40 +08:00
bibek
49e012dee1
Fix duplicate CMake tidy-target issue (#1513)
2024-09-13 21:15:04 -07:00
jakpiase
8f8a2ce396
Add pool2d int8 and fp8 instances (#1508)
...
* add pool2d fp8 and int8
* minor fixes
* add formatting
* add reviewer suggestions
* add reviewer suggestions
2024-09-13 10:18:21 -07:00
dependabot[bot]
a4982c3b86
Bump sphinxcontrib-bibtex from 2.6.2 to 2.6.3 in /docs/sphinx (#1511)
...
Bumps [sphinxcontrib-bibtex](https://github.com/mcmtroffaes/sphinxcontrib-bibtex) from 2.6.2 to 2.6.3.
- [Changelog](https://github.com/mcmtroffaes/sphinxcontrib-bibtex/blob/develop/CHANGELOG.rst)
- [Commits](https://github.com/mcmtroffaes/sphinxcontrib-bibtex/compare/2.6.2...2.6.3)
---
updated-dependencies:
- dependency-name: sphinxcontrib-bibtex
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-13 08:17:26 -07:00
Jun Liu
81bc1496b2
Customize filesystem in CK for legacy systems (#1509)
...
* Legacy support: customized filesystem
* Update cmakefile for python alternative path
* fix build issues
* CK has no boost dependency
* More fixes to issues found on legacy systems
* fix clang format issue
* Check if blob is correctly generated in cmake
* fix the python issues
* add a compiler flag for codegen when using alternative python
* use target_link_options instead of target_compile_options
---------
Co-authored-by: illsilin <Illia.Silin@amd.com>
2024-09-13 07:51:07 -07:00
Illia Silin
e07f1108c0
make sure to rebuild compilers if they changed (#1504)
2024-09-12 07:49:55 -07:00
Mateusz Ozga
448c0f56d8
Pool2d max/avg kernel in the BWD version (#1494)
...
* Add pool2d instance BWD AVG
* Add pool2d instance BWD MAX
* Fix: avg review
* Fix review: part2
* Fix - enable test when type is compiled
* Fix review part3
2024-09-12 11:47:52 +02:00
jakpiase
e8d2887cb2
Rewrite pool2d fwd (#1462)
...
* added pool2d fwd
* add tests
* add reviewers changes
* Revert "Merge remote-tracking branch 'origin/develop' into jakpiase/pool2d_fwd_new"
This reverts commit 6b2ba7ff89, reversing
changes made to 22c82bea0c.
* Revert "add reviewers changes"
This reverts commit 22c82bea0c.
* added reviewers comments
* revert some old files
* add reviewers requests
---------
Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>
2024-09-11 15:21:00 +02:00
jakpiase
2a261afcdf
Added structural sparsity blockwise gemm (#1435)
...
* Implemented smfmac xdlops
* Added smfmac blockwise xdlops
* fixes
* add reviewers suggestions
---------
Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>
2024-09-11 15:19:42 +02:00
Dan Yao
d09572e8c2
[CK_TILE] FA bwd repair (#1502)
...
* fix fa bwd
* revert kernelBlockSize in gemm_kernel.hpp
2024-09-10 10:45:32 -07:00
Thomas Ning
cf08df6b5e
fix the unsupported scenario of Ali TestGemmUniversal (#1501)
2024-09-09 11:31:27 -07:00
Thomas Ning
caacd38830
Ck tile gemm example (#1488)
...
* Checkpoint: Finished with the tile example & kernel verification, working on the different matrix layout
* Finished the Matrix Layout feature set up. Note: Need to modify the inner block to solve the shuffle problem in the future.
* Fix: Clang Format, API fixed from fmha
* fix with better naming convention
* revert back the pipeline code of fmha
* Fixed: Addressed the comments and merge the GEMM shape of GEMM Operator and FMHA Operator to one.
* clang format with the reference_gemm file
* convert the clang format with the remod.py
* Changed the format and variable name of the kernel gemm_shape and partitioner
---------
Co-authored-by: thomasning <thomasning@banff-cyxtera-s70-4.ctr.dcgpu>
2024-09-07 16:23:32 +08:00
M.Emin Ozturk
8378855361
Modification to fix this issue "threadwise_tensor_slice_transfer_v5r1 issue #1279" (#1492)
...
* issue fix, one line changed for tmp
* clang
---------
Co-authored-by: Emin Ozturk <emin.ozturk@utah.edu>
Co-authored-by: Harisankar Sadasivan <135730918+hsadasiv@users.noreply.github.com>
2024-09-04 21:52:55 -07:00
Haocong WANG
5b10dae6a4
Add gemm universal bf16 instances (#1484)
...
* revert ckprofiler change
* temp save
* Add test and test pass
* test pass
* Fix bug inside rotating buffer when tensor is not packed
* bug fix
* clang format
---------
Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>
2024-09-04 20:58:54 -07:00
Rostyslav Geyyer
52410b49c7
Temporarily disable flaky test for all (#1495)
2024-09-04 07:36:57 -07:00
Illia Silin
8b95d9ad52
copy all fmha headers when building library (#1497)
...
* copy all fmha headers when building library
* fix the rocm_install call for mha headers
2024-09-04 07:36:41 -07:00
Illia Silin
841009c5ee
Add an option to select an alternative python version during build. (#1496)
...
* locate a newer version of python when -DRHEL=ON flag is set
* allow setting python version on cmake command line
2024-09-04 07:36:27 -07:00
Bartłomiej Kocot
73b67f290f
Add support for NGCHW in grouped conv bwd wei (#1491)
...
* Add support for NGCHW in grouped conv bwd wei
* Comments fixes
* navi fixes
* Update function names
2024-09-03 10:52:03 +02:00
Bartłomiej Kocot
a9b170b541
Revert "Revert "Revert Revert Support access per groups and filter2x3 in grouped conv fwd (#1382) (#1406) (#1415)" (#1455)" (#1490)
...
This reverts commit 5ff8eeebf9.
2024-09-02 10:39:49 +02:00
Dan Yao
b8addae293
[CK_TILE] float -> bf16 inline asm rtn (#1482)
...
* asm rtn
* add asm rtn macro
* reorder macro
---------
Co-authored-by: carlushuang <carlus.huang@amd.com>
2024-08-30 15:38:09 +08:00
Po Yen Chen
461ec98d78
Enable scratch memory workaround on ROCm 6.2 (#1486)
...
Co-authored-by: carlushuang <carlus.huang@amd.com>
2024-08-30 10:40:00 +08:00