1455 Commits

Author SHA1 Message Date
Illia Silin
7aef3c6a2c only build tests and examples if user sets GPU_TARGETS (#1565)
[ROCm/composable_kernel commit: f46a9eee9d]
2024-10-10 15:31:56 -07:00
spolifroni-amd
d2d4e1db4a removed API usage header (#1566)
[ROCm/composable_kernel commit: 14c52befda]
2024-10-10 13:57:23 -07:00
Rostyslav Geyyer
d8ea58c184 Fix default stride value (#1559)
[ROCm/composable_kernel commit: d18fc0797f]
2024-10-10 07:37:09 -07:00
Thomas Ning
0d711b3edf Ck tile gemm cshuffle & CK Tile GEMM restructure (#1535)
* Make the cshuffle compilable

* modify the reference on gpu and cpu. Correct access of cshuffle

* fix the cpu reference code

* Complete the in tile shuffle logic

* restructure the kernel template input

* change the naming pattern of ck_tile gemm pipeline

* Re-format files using remod.py

* Solve the fmha conflict with gemm

* Comment Addressed from Carlus

---------

Co-authored-by: Po Yen, Chen <PoYen.Chen@amd.com>

[ROCm/composable_kernel commit: 6f27bc9872]
2024-10-10 18:02:22 +08:00
Illia Silin
1d89041e71 fix the target selection logic (#1561)
[ROCm/composable_kernel commit: 2e1165c1a7]
2024-10-09 15:21:57 -07:00
Illia Silin
985156e572 remove gfx12 targets from daily builds with rocm6.2 (#1560)
[ROCm/composable_kernel commit: cfac9497e2]
2024-10-09 10:18:05 -07:00
Christopher Millette
d6eae63f60 Fixes small memory leak from missing hipEventDestroy (#1554)
[ROCm/composable_kernel commit: ceaed8e097]
2024-10-09 09:41:35 +02:00
Rostyslav Geyyer
6dfbf61cf7 Add a gpu gemm reference kernel (#1528)
* Add a gpu gemm reference kernel

* Switch to gpu reference in gemm examples

* Remove redundant arguments

* Update all related examples

* Update more examples

* Try less threads per block

* Try even less threads per block

* Add support for all matrix layouts

* Increase block size

* Clean up

* Remove hardcoded strides

* Clean up

* Try a column-major case

* Revert back to row-major

* Run both CPU and GPU verification

---------

Co-authored-by: Po Yen Chen <PoYen.Chen@amd.com>

[ROCm/composable_kernel commit: aa932445ea]
2024-10-08 11:05:28 -05:00
Po Yen Chen
50f0f55fbc [CK_TILE] Update example README files & fix script compatibility issue (#1548)
* Fix text alignment of ArgParser::print()

* Update example README files

* Clarify make-ck-dev.sh <arch> usage

* Only keep some of the arguments from '-?' output

* Undo command line output changes in README

* Only keep existing argument on doc and update description

* Fix text alignment

* Make cmake-ck-*.sh compatible with 'sh' command

[ROCm/composable_kernel commit: 0c094daa7e]
2024-10-08 10:45:12 +08:00
Qianfeng
1ca2b3d76c [CK_TILE] Simplify the codes in splitkv_combine pipeline (#1549)
* Simplify the codes in splitkv_combine pipeline

* Always set kPadSeqLenK=true for fmha splitkv kernels

* Change in Oacc Alignment and TileDistribution to be more adaptable to tile sizes

---------

Co-authored-by: Po Yen Chen <PoYen.Chen@amd.com>

[ROCm/composable_kernel commit: 74d68e3b99]
2024-10-08 10:44:34 +08:00
Illia Silin
58f4d92899 add a CK_USE_CODEGEN build argument to enable codegen (#1552)
* add a CK_USE_CODEGEN build argument to enable codegen

* fix cmake codegen logic

[ROCm/composable_kernel commit: 7733ae167b]
2024-10-07 15:45:19 -07:00
Illia Silin
881bc2c930 Fix build logic using GPU_ARCHS. (#1536)
* update build logic with GPU_ARCHS

* fix the GPU_ARCHS build for codegen

* unset GPU_TARGETS when GPU_ARCHS are set

[ROCm/composable_kernel commit: 7d8ea5f08b]
2024-10-07 08:18:23 -07:00
Bartłomiej Kocot
4aaf6ad633 [CK_TILE] Fix conv param multiple definition (#1550)
Co-authored-by: Po Yen Chen <PoYen.Chen@amd.com>

[ROCm/composable_kernel commit: cc8f466a7e]
2024-10-07 15:21:21 +02:00
rocking
36b2a932b0 [Ck tile] Support layernorm one pass (#1512)
* Fix compile error

* Add one pass pipeline

* Extract creating tile_window to operator()

* clang format

* reduce duplicated code

* do not hardcode

* Support padding in layernorm

---------

Co-authored-by: Po Yen Chen <PoYen.Chen@amd.com>

[ROCm/composable_kernel commit: 0023f01ab0]
2024-10-07 14:25:53 +08:00
kylasa
6f048f54dc Adding seed and offset pointer support to the philox random number generator. (#1523)
* Adding seed and offset pointer support to the philox random number generator.

* Separating seed and offset pointer checks with different condition statements.

* Changes include, adding support for device seed and offset pointers, union is used to store seed/offset values and device pointers to minimize device SGPRs.

* Correcting a typo in the readme file

* Re-format files using remod.py

* Use STL type for API parameters

* Use simpler struct design for drop_seed & drop_offset

* Undo unnecessary changes

* Sync kargs style for fmha_fwd.hpp/.cpp

* Use templated union to reduce code

* Use structured binding to make code more readable

---------

Co-authored-by: Sudhir Kylasa <sukylasa@amd.com>
Co-authored-by: Po Yen Chen <PoYen.Chen@amd.com>

[ROCm/composable_kernel commit: c24fae2346]
2024-10-05 02:48:47 +08:00
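The message for 6f048f54dc above mentions storing either the seed/offset values or device pointers to them in a union. A minimal sketch of that idea follows; all names here (`ValueOrPointer`, `DropoutArgs`, `resolve`) are hypothetical and are not taken from composable_kernel, and the pointers are plain host pointers so the sketch runs without a GPU.

```cpp
#include <cstdint>
#include <utility>

// Hypothetical helper: holds either a value or a pointer to one in the same
// storage, so the argument struct does not grow when pointer support is added.
template <typename T>
union ValueOrPointer
{
    T val;
    const T* ptr;
};

// Hypothetical argument struct; `is_ptr` records which union member is active.
struct DropoutArgs
{
    ValueOrPointer<std::uint64_t> seed;
    ValueOrPointer<std::uint64_t> offset;
    bool is_ptr;
};

// Reads only the active member selected by `is_ptr` and returns {seed, offset}.
inline std::pair<std::uint64_t, std::uint64_t> resolve(const DropoutArgs& args)
{
    if(args.is_ptr)
    {
        return {*args.seed.ptr, *args.offset.ptr};
    }
    return {args.seed.val, args.offset.val};
}
```

A caller could unpack the result with a structured binding, for example `auto [seed, offset] = resolve(args);`.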
arai713
d550673b54 Codegen build (#1526)
* updating codegen build for MIOpen access: adding .cmake for codegen component

(cherry picked from commit 652a7c0463)

* updating CMake

(cherry picked from commit a685822e36)

[ROCm/composable_kernel commit: b545de175a]
2024-10-04 10:51:50 -07:00
Bartłomiej Kocot
47a2eb1cce Fix grouped gemm check to avoid overflow (#1545)
[ROCm/composable_kernel commit: 6b54d2faf8]
2024-10-04 17:32:43 +02:00
macurtis-amd
164963bf83 Fix compilation errors generated by forthcoming Clang changes (#1544)
Without this change, the following diagnostic is generated:
  a template argument list is expected after a name prefixed by the template
  keyword [-Wmissing-template-arg-list-after-template-kw]

See C++17 spec [temp.names] p5.

[ROCm/composable_kernel commit: aeb7c91f48]
2024-10-02 13:56:22 -07:00
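The diagnostic quoted in 164963bf83 above concerns a name prefixed by the `template` keyword with no template argument list after it. The sketch below illustrates the pattern and two ways to avoid it; `Doubler`, `apply_explicit`, and `apply_deduced` are hypothetical names for illustration and do not show the actual change made in that commit.

```cpp
#include <cassert>

// Hypothetical type for illustration; not taken from composable_kernel.
struct Doubler
{
    // Member function template whose argument type T is deduced at the call site.
    template <typename T>
    T twice(T v) const
    {
        return v + v;
    }
};

// `op` has a dependent type, so the `template` disambiguator is permitted here.
template <typename Op>
int apply_explicit(const Op& op, int v)
{
    // The flagged spelling would be `op.template twice(v)`: the keyword is
    // present but no template argument list follows the name.
    return op.template twice<int>(v); // explicit argument list after the keyword
}

template <typename Op>
int apply_deduced(const Op& op, int v)
{
    // Alternative: drop the keyword and let T be deduced from the argument.
    return op.twice(v);
}
```

Both helpers return the same result for the same input, for example `apply_explicit(Doubler{}, 21)` and `apply_deduced(Doubler{}, 21)`.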
BrianHarrisonAMD
51801c7b86 Add generating mha static library for gfx90a (#1540)
* Add generating mha static library for gfx90a

* Update comment to reflect changes

[ROCm/composable_kernel commit: 294cb82314]
2024-10-02 09:26:11 -07:00
Illia Silin
7394671138 re-enable the FMHA performance monitoring (#1539)
[ROCm/composable_kernel commit: 11b7a4db00]
2024-10-01 13:17:55 -07:00
Illia Silin
ef193e048a [CK_TILE] add missing vector header (#1537)
* add missing vector header

* Re-format header using remod.py

---------

Co-authored-by: Po Yen, Chen <PoYen.Chen@amd.com>

[ROCm/composable_kernel commit: 8e4c3fb1bc]
2024-10-01 07:58:20 -07:00
Po Yen Chen
1c4c07c669 [CK_TILE] Change output accum tensor layout of fmha fwd split-kv & combine kernels (#1527)
* Use same layout for o_acc and o tensor

* Use better param names in partitioner

* Remove redundant kargs 'max_seqlen_q'

* Use better param names in splitkv kernel

* Add comment for additional kernel arguments

* Sync empty loop early return logics between pipelines

* Pass more arguments to cmake in scripts

* Align backslashes

* Fix wrong o_acc tensor view strides

* Change o_acc layout if o_perm=0

* Handle whole row masked via attn_bias

* Use vector width = 1 for o_acc

* Use more even split sizes

[ROCm/composable_kernel commit: a1c07e8d91]
2024-10-01 22:13:52 +08:00
M.Emin Ozturk
628aebb4ce Complex Contraction CK Bilinear Example (#1061)
* complex type contraction

* bug fix

* update

* Tensor Contraction Complex Data Type is working

* 4D Kernel

* some change

* validation check in progress

* validation issue

* fp32 verification error is fixed

* fp32 and fp64 are done

* remove old files

* remove cmake files

* remove cmake files

* Readme

* img verification

* CMakeList

* number changed

---------

Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>
Co-authored-by: Emin Ozturk <emin.ozturk@utah.edu>

[ROCm/composable_kernel commit: 4cd1dc7f06]
2024-09-30 21:05:42 -06:00
Bartłomiej Kocot
da3172955b [CK_TILE] Image to Column kernel (#1532)
* [CK_TILE] Image to Column kernel

* Fixes

* Vector loads and stores

* Fixes

* Fixes

* change test dir name

[ROCm/composable_kernel commit: de3e3b6424]
2024-09-27 22:57:38 +02:00
Dan Yao
7460d19460 [CK_TILE] Fix compiler related FA bwd issues (#1530)
* add barriers

* tail bias barriers

* adjust bf16/hd256 tol

* continue adjust bf16/hd256 tol

[ROCm/composable_kernel commit: 9d69a099a4]
2024-09-26 12:18:39 -07:00
Illia Silin
04c756ea93 Fix compilation errors with Clang20.0. (#1533)
* fix clang20 compilation errors for gfx90a

* fix clang20 compilation errors for gfx11 targets

[ROCm/composable_kernel commit: 42e6dceacc]
2024-09-25 13:45:38 -07:00
Illia Silin
299db8fc86 make CK CI use different git credentials (#1529)
[ROCm/composable_kernel commit: 65f8d1440f]
2024-09-25 09:05:48 -07:00
dependabot[bot]
b939d0f014 Bump rocm-docs-core from 1.8.1 to 1.8.2 in /docs/sphinx (#1531)
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.8.1 to 1.8.2.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/v1.8.2/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.8.1...v1.8.2)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

[ROCm/composable_kernel commit: 1c5a4d1b9f]
2024-09-25 06:59:25 -07:00
BrianHarrisonAMD
6be3ee0e77 Add additional instances to device_mha_instance (#1522)
* Add additional instances to device_mha_instance

* Add comment to describe what receipt 3 option filters

---------

Co-authored-by: Po Yen Chen <PoYen.Chen@amd.com>

[ROCm/composable_kernel commit: 3528a523ff]
2024-09-24 10:15:30 -06:00
Illia Silin
f969220365 Add a daily CI build with legacy dockers. (#1525)
* add an option to build CK with legacy dockers

* change the custom docker settings

* add environment variable for custom docker

* use a new variable for legacy docker name

* new way to pass docker names for legacy OS

* add legacy docker check in the Build_CK function

* change groovy syntax

* add a check for legacy docker in getDockerImage

* make sure the legacy docker name is not empty

* remove the dumb-init call

* disable the tests in legacy OS dockers

* disable tests in legacy dockers

* use a different way to disable tests in legacy dockers

* rearrange the CI stages for legacy OS

* use different way to disable tests in legacy dockers

* update LD_LIBRARY_PATH for legacy dockers and add cron job

* update LD_LIBRARY_PATH at docker launch

* change the syntax for setting LD_LIBRARY_PATH

[ROCm/composable_kernel commit: f16ebf82d4]
2024-09-23 09:03:55 -07:00
Po Yen Chen
53b581e122 Early return if seqlen_k=0 on group mode (#1524)
[ROCm/composable_kernel commit: 770d2b7725]
2024-09-22 20:05:58 +08:00
Bartłomiej Kocot
e4f4e04add Add support for NGCHW in grouped conv fwd (#1499)
* Support NGCHW in grouped conv fwd

* Remove not needed variable

* Fixes

[ROCm/composable_kernel commit: 4ba52b35dc]
2024-09-20 10:45:46 +02:00
Adam Osewski
f6c6c375db Remove unsupported (fp8) type from Add memory operation. (#1521)
The dynamic buffer doesn't have support for fp8 in `Update` operation thus fp8 is not supporting `InMemoryDataOperation::Add`

[ROCm/composable_kernel commit: 0c39954da9]
2024-09-20 09:40:45 +02:00
Thomas Ning
2ded318de8 Ck tile gemm padding dim (#1516)
* Support the N dimension padding

* Finished the padding feature for different dimension of K

[ROCm/composable_kernel commit: 694c300145]
2024-09-18 11:32:29 -07:00
dependabot[bot]
3cdccec0d9 Bump rocm-docs-core from 1.8.0 to 1.8.1 in /docs/sphinx (#1519)
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.8.0 to 1.8.1.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/v1.8.1/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.8.0...v1.8.1)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

[ROCm/composable_kernel commit: e84adec3ba]
2024-09-18 07:00:26 -07:00
Illia Silin
8815f1f974 Add rocm6.3_rc1 docker image (#1518)
* add image for rocm6.3_rc1

* fix deb package url

[ROCm/composable_kernel commit: 1658c0dc11]
2024-09-17 15:59:26 -07:00
aledudek
bacec21478 Extend pool3d fwd avg, max operations by f8_t, int8_t types (#1483)
* Extend pool3d fwd avg, max operations by f8_t, int8_t types

* Pack MaxPool3dFwd params together

* Fix MaxPool3dFwd AVG instances

* Decrease verification precision for bf16

* Adjust tests + review changes

* Adjust threshold for F8

* Adjusted compute types for MAX op instances

* Fix ComputeDataType mismatch in tests and profiler for AVG

* Fix naming from max_pool3d_fwd to pool3d_fwd

* Adjust CMakeLists

---------

Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>

[ROCm/composable_kernel commit: a793afc961]
2024-09-17 15:57:10 +02:00
dependabot[bot]
771d7912ff Bump rocm-docs-core from 1.7.2 to 1.8.0 in /docs/sphinx (#1517)
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.7.2 to 1.8.0.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.7.2...v1.8.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

[ROCm/composable_kernel commit: 8ec15e644e]
2024-09-16 22:37:57 -07:00
Mateusz Ozga
1e907323e5 This commit contains implementation of max pool2d for f8 type (#1506)
Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>

[ROCm/composable_kernel commit: 6834e5ee74]
2024-09-16 10:15:06 +02:00
Thomas Ning
84f3413bb2 Ck tile GPU verification sample develop & Add the CK TILE GEMM to the CI/CD test (#1505)
* Finished the feature of gpu verification

* Add the ck_tile_gemm test in the CI CD

* add the include of tensor_layout in reference_gemm

* Comment Addressed

* split ck_tile fmha and gemm tests into separate stages

* restructure the reference gemm

* restructure a new reference_gemm api that could read the device mem

---------

Co-authored-by: carlushuang <carlus.huang@amd.com>
Co-authored-by: illsilin <Illia.Silin@amd.com>

[ROCm/composable_kernel commit: 844f5a1712]
2024-09-14 21:08:40 +08:00
bibek
d40f059978 Fix duplicate CMake tidy-target issue (#1513)
[ROCm/composable_kernel commit: 49e012dee1]
2024-09-13 21:15:04 -07:00
jakpiase
8a7171c39f Add pool2d int8 and fp8 instances (#1508)
* add pool2d fp8 and int8

* minor fixes

* add formatting

* add reviewer suggestions

* add reviewer suggestions

[ROCm/composable_kernel commit: 8f8a2ce396]
2024-09-13 10:18:21 -07:00
dependabot[bot]
efaa399e34 Bump sphinxcontrib-bibtex from 2.6.2 to 2.6.3 in /docs/sphinx (#1511)
Bumps [sphinxcontrib-bibtex](https://github.com/mcmtroffaes/sphinxcontrib-bibtex) from 2.6.2 to 2.6.3.
- [Changelog](https://github.com/mcmtroffaes/sphinxcontrib-bibtex/blob/develop/CHANGELOG.rst)
- [Commits](https://github.com/mcmtroffaes/sphinxcontrib-bibtex/compare/2.6.2...2.6.3)

---
updated-dependencies:
- dependency-name: sphinxcontrib-bibtex
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

[ROCm/composable_kernel commit: a4982c3b86]
2024-09-13 08:17:26 -07:00
Jun Liu
04a8584b87 Customize filesystem in CK for legacy systems (#1509)
* Legacy support: customized filesystem

* Update cmakefile for python alternative path

* fix build issues

* CK has no boost dependency

* More fixes to issues found on legacy systems

* fix clang format issue

* Check if blob is correctly generated in cmake

* fix the python issues

* add a compiler flag for codegen when using alternative python

* use target_link_options instead of target_compile_options

---------

Co-authored-by: illsilin <Illia.Silin@amd.com>

[ROCm/composable_kernel commit: 81bc1496b2]
2024-09-13 07:51:07 -07:00
Illia Silin
30073ffa43 make sure to rebuild compilers if they changed (#1504)
[ROCm/composable_kernel commit: e07f1108c0]
2024-09-12 07:49:55 -07:00
Mateusz Ozga
9c0316d853 Pool2d max/avg kernel in the BWD version (#1494)
* Add pool2d instance BWD AVG

* Add pool2d instance BWD MAX

* Fix: avg review

* Fix review: part2

* Fix - enable test when type is compiled

* Fix review part3

[ROCm/composable_kernel commit: 448c0f56d8]
2024-09-12 11:47:52 +02:00
jakpiase
8aeb2afbe2 Rewrite pool2d fwd (#1462)
* added pool2d fwd

* add tests

* add reviewers changes

* Revert "Merge remote-tracking branch 'origin/develop' into jakpiase/pool2d_fwd_new"

This reverts commit 6b2ba7ff89, reversing
changes made to 22c82bea0c.

* Revert "add reviewers changes"

This reverts commit 22c82bea0c.

* added reviewers comments

* revert some old files

* add reviewers requests

---------

Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>

[ROCm/composable_kernel commit: e8d2887cb2]
2024-09-11 15:21:00 +02:00
jakpiase
681d36db5f Added structural sparsity blockwise gemm (#1435)
* Implemented smfmac xdlops

* Added smfmac blockwise xdlops

* fixes

* add reviewers suggestions

---------

Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>

[ROCm/composable_kernel commit: 2a261afcdf]
2024-09-11 15:19:42 +02:00
Dan Yao
df8769d3c8 [CK_TILE] FA bwd repair (#1502)
* fix fa bwd

* revert kernelBlockSize in gemm_kernel.hpp

[ROCm/composable_kernel commit: d09572e8c2]
2024-09-10 10:45:32 -07:00
Thomas Ning
a0dfee4f3c fix the unsupported scenario of Ali TestGemmUniversal (#1501)
[ROCm/composable_kernel commit: cf08df6b5e]
2024-09-09 11:31:27 -07:00