Commit Graph

1302 Commits

Author SHA1 Message Date
rocking
b64d3f6eec prevent warning in filter mode 2024-04-08 21:43:35 +00:00
rocking
525b89e538 1. codegen the f8 api and kernel
2. f8 host code
2024-04-08 21:36:23 +00:00
rocking
5860f3134a Merge branch 'ck_tile/refactor' into ck_tile/elementwise 2024-04-09 02:37:42 +08:00
Po Yen Chen
87f3cd1ddd Use CK_TILE_FLOAT_TO_FP8_STANDARD as default fp8 rounding mode 2024-04-08 12:39:58 +00:00
Po Yen Chen
e49498f616 Set fp8 rounding error for check_err() 2024-04-08 12:39:37 +00:00
Po Yen Chen
641ae96215 Check fp8 rounding error in check_err() 2024-04-08 12:39:27 +00:00
Po Yen Chen
92d45d1681 Fix wrong fp8 QK/KV block gemm setting 2024-04-08 12:39:17 +00:00
rocking
4e005f2457 Avoid warning 2024-04-08 10:11:51 +00:00
rocking
29a0670744 Remove remove_cvref_t 2024-04-08 10:03:48 +00:00
rocking
5c3fdeb0b8 Remove f8 pipeline, we should share the same pipeline even in f8 2024-04-08 09:56:23 +00:00
rocking
f7d81364f3 To prevent compiler issue, remove the elementwise function we have not used. 2024-04-08 09:44:21 +00:00
carlushuang
42ebffe822 1).support receipe in generate.py 2).use simplified mask type 3).change left/right to pass into karg 2024-04-07 23:30:34 +00:00
carlushuang
8050921512 Merge branch 'develop' into ck_tile/refactor 2024-04-05 20:49:13 +08:00
Illia Silin
7e5c81fed2 fix the latest errors with staging compiler (#1229) 2024-04-04 11:33:29 -07:00
jakpiase
c701071666 Add Grouped Gemm Multiple D SplitK TwoStage (#1212)
* Support A/B/C elementwise ops.

* First part of GGEMM multiD splitk two stage.

* WIP - changes for debugging.

* tmp save

* working version

* added bf16@int8 version

* fixes

* add reviewers' suggestions

* pre-committed missing files

* switched to ifs from elseifs

---------

Co-authored-by: Adam Osewski <Adam.Osewski@amd.com>
2024-04-04 11:01:33 +02:00
rocking
68153dea0b Let generate.py can generate different elementwise function 2024-04-04 03:59:38 +00:00
rocking
d6cb104d0f Add some elementwise op, prepare to quantization 2024-04-04 03:18:39 +00:00
rocking
d9323ea261 Fix bug of elementwise op, our elementwise op is not inout 2024-04-04 03:17:36 +00:00
Rostyslav Geyyer
a61e73bc56 Add instances for conv_scale with fp8@bf8->fp8 (#1220)
* Update device op api to support BComputeType

* Add example

* Add instances

* Add profiler mode

* Add client example

* Update copyright year

* Add BComputeType check

* Fix compute types
2024-04-03 09:08:08 -05:00
carlushuang
06f1cabd78 Merge branch 'develop' into ck_tile/refactor 2024-04-03 20:51:01 +08:00
rocking
bfcf550305 Adjust P elementwise function 2024-04-03 11:07:21 +00:00
Bartłomiej Kocot
9a194837af Introduce combined elementwise ops (#1217)
* Introduce combined elementwise ops

* Introduce reference elementwise
2024-04-02 17:23:49 -05:00
Illia Silin
ae57e5938e Split the instances by architecture. (#1223)
* parse examples inside the add_example_executable function

* fix the example 64 cmake file

* add xdl flag to the gemm_bias_softmax_gemm_permute example

* add filtering of tests based on architecture type

* enable test_grouped_gemm for gfx9 only

* enable test_transpose only for gfx9

* only link test_transpose if it gets built

* split the gemm instances by architectures

* split gemm_bilinear,grouped_conv_bwd_weight instances by targets

* split instances by architecture

* split grouped_conv instances by architecture

* fix clang format

* fix the if-else logic in group_conv headers

* small fix for grouped convolution instances

* fix the grouped conv bwd weight dl instances

* fix client examples

* only enable client examples 3 and 4 on gfx9

* set the gfx9 macro

* make sure the architecture macros are set by cmake

* use separate set of xdl/wmma flags for host code

* simplify the main cmake file

* add conv_fwd_bf8 instance declaration
2024-04-02 09:42:17 -07:00
zjing14
303d4594f4 improved zeroing (#1221) 2024-04-02 11:02:52 -05:00
rocking
cf57626c07 Merge branch 'ck_tile/refactor' into ck_tile/elementwise 2024-04-01 16:07:27 +08:00
carlushuang
42866940dc remove mistake 2024-03-31 00:01:30 +00:00
carlushuang
855a264b72 remove ck_tile example from default cmake target like all/install/check 2024-03-30 23:58:48 +00:00
rocking
286c74468d Add element function to fmha api 2024-03-29 18:05:36 -04:00
carlushuang
076da565dd let python version to be 3.8 as minimal 2024-03-29 17:07:23 +00:00
rocking
50c36f352a Add SAccElementFunction, PComputeElementFunction, OAccElementFunction in pipeline 2024-03-29 07:09:06 -04:00
carlushuang
f236a13d1b fix several issue 2024-03-28 22:00:11 +00:00
carlushuang
06c54880d1 Merge remote-tracking branch 'origin/develop' into ck_tile/refactor 2024-03-28 21:59:40 +00:00
dependabot[bot]
5f2c89e8b4 Bump rocm-docs-core from 0.37.1 to 0.38.0 in /docs/sphinx (#1218)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.37.1 to 0.38.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.37.1...v0.38.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-03-27 10:23:54 -07:00
carlushuang
b0b8a5ad46 update README of ck_tile example 2024-03-26 18:57:29 +00:00
carlushuang
13311f2e5a fix clang-format 2024-03-26 18:53:10 +00:00
carlushuang
b9ed9c8e4d Merge remote-tracking branch 'origin/ck_tile_merge_public_develop' into ck_tile/refactor 2024-03-26 16:34:46 +00:00
carlushuang
ca941d66ef remove unused 2024-03-26 16:33:48 +00:00
carlushuang
97902de98c sync 22 2024-03-26 16:30:50 +00:00
carlushuang
f955af6ff7 sync upstream again 2024-03-26 16:25:32 +00:00
carlushuang
1c92c5d83d sync with upstream 2024-03-26 16:05:54 +00:00
carlushuang
04ee01191a fix merge from upstream 2024-03-26 14:09:54 +00:00
carlushuang
c94b545747 update some readme 2024-03-26 13:35:53 +00:00
carlushuang
200d2b22d4 fix scratch in fp8 kernel 2024-03-25 19:45:38 +00:00
Po-Yen, Chen
1cacb713c5 Default use CK_TILE_FLOAT_TO_FP8_STOCHASTIC rounding mode 2024-03-23 22:51:18 -04:00
Illia Silin
cc1f733d0e allow the CI to pass even if can't connect to db (#1214) 2024-03-22 15:39:11 -07:00
dependabot[bot]
2ae16e901f Bump rocm-docs-core from 0.37.0 to 0.37.1 in /docs/sphinx (#1211)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.37.0 to 0.37.1.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.37.0...v0.37.1)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-03-22 07:58:36 -07:00
Bartłomiej Kocot
9c052804a7 Add elementwise with dynamic vector dim (#1198)
* Add elementwise with dynamic vector dim

* Reduce number of instances

* Fixes

* Fixes
2024-03-22 10:40:43 +01:00
Rostyslav Geyyer
fd0d093e78 Add instances for conv_scale with bf8 in / fp8 out (#1200)
* Add bf8 conv fwd instances

* Add example

* Add profiler mode

* Add client example

* Fix copyright headers

* Format
2024-03-21 13:57:34 -05:00
dependabot[bot]
9e50426915 Bump rocm-docs-core from 0.36.0 to 0.37.0 in /docs/sphinx (#1208)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.36.0 to 0.37.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.36.0...v0.37.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-03-20 09:28:03 -06:00
carlushuang
bb1f6e48eb fix fp8 duplicated move/shift/and/or problem 2024-03-19 23:29:57 +00:00