PoYen, Chen
8779716403
Fix uneven split checking logic
2024-08-06 01:17:14 +00:00
PoYen, Chen
77dac7775c
Move V tile through TileWindowNavigator
2024-08-05 22:36:52 +00:00
PoYen, Chen
ab086bdb76
Simplify more make_tile_window() overloads
2024-08-05 22:16:24 +00:00
PoYen, Chen
bb78353264
Remove unnecessary data members
2024-08-05 21:52:59 +00:00
PoYen, Chen
3fc7279519
Disable calling fmha_fwd()
2024-08-05 21:36:52 +00:00
PoYen, Chen
8fea4139df
Fix tile window navigation bugs
2024-08-05 21:34:15 +00:00
PoYen, Chen
ecaaa6f136
Simplify TileWindowNavigator interfaces
2024-08-05 16:31:31 +00:00
PoYen, Chen
1c9d77b606
Introduce 'TileWindowNavigator' types
2024-08-05 15:58:41 +00:00
PoYen, Chen
55b77cf962
Add another make_tile_window()
2024-08-05 15:57:03 +00:00
PoYen, Chen
24cb604373
Add copy_const<> type trait
2024-08-05 15:56:15 +00:00
PoYen, Chen
90d84eaeae
Fix seqlen_k_min for pre-fill case (1 -> 0)
2024-08-04 02:53:40 +00:00
PoYen, Chen
381f7e90e0
Merge branch 'develop' into feature/fmha-fwd-appendkv
2024-08-04 02:12:20 +00:00
PoYen, Chen
baf4a612f0
Fix wrong kernel name
2024-08-02 10:26:47 +00:00
PoYen, Chen
db95d25d36
Launch splitkv kernel if given page_block_size
2024-08-02 10:26:09 +00:00
PoYen, Chen
e7969b9fd2
Add template argument 'kIsPagedKV' for splitkv kernels
2024-08-02 10:14:51 +00:00
Illia Silin
d311c95396
Add compiler flags for ROCm versions 6.2+ (#1429)
...
* add compiler flags to fix compiler issues
* fix typo.
* disable test_smfmac_op on all devices except gfx942
* specify full path to compiler in CI
2024-08-01 08:27:52 -07:00
Sam Wu
6648fd3b04
Update doc requirements (#1423)
2024-07-31 07:42:42 -07:00
zjing14
f31e8dfa80
[HotFix] Fixed a typo in profile_gemm_multiply_multiply (#1425)
...
* fixed a typo
* clean
---------
Co-authored-by: Jing Zhang <jizhan@fb.com>
2024-07-31 07:19:17 -07:00
arai713
d32997a792
Codegen: isSupportedArgument check (#1417)
...
* added isSupportedArgument check into codegen device op
* adding function call
* remove commented code
2024-07-31 07:12:15 -07:00
carlushuang
b3f86e79dd
workaround rocm-6.2 compiler issue (#1421)
2024-07-31 16:03:59 +08:00
PoYen, Chen
3f7199873c
Merge branch 'develop' into feature/fmha-fwd-appendkv
2024-07-31 04:42:41 +00:00
Illia Silin
b527cad4a5
add docker for rocm6.2_rc4 compiler (#1424)
2024-07-30 11:55:33 -07:00
Bartłomiej Kocot
33b399cc15
Revert Revert Support access per groups and filter2x3 in grouped conv fwd (#1382) (#1406) (#1415)
2024-07-30 18:36:04 +02:00
Po Yen Chen
08d82ee264
Merge branch 'develop' into feature/fmha-fwd-appendkv
2024-07-30 17:55:09 +08:00
dependabot[bot]
b9ba5b2676
Bump rocm-docs-core from 1.6.0 to 1.6.1 in /docs/sphinx (#1420)
...
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.6.0 to 1.6.1.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.6.0...v1.6.1)
---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-26 14:47:19 -07:00
PoYen, Chen
e688d99495
Merge remote-tracking branch 'origin/develop' into feature/fmha-fwd-appendkv
2024-07-26 07:14:59 +00:00
PoYen, Chen
94f430de32
Update rotary_dim range in smoke_test_fwd.sh
2024-07-26 07:13:25 +00:00
PoYen, Chen
c1c50ee498
Enlarge KPerThread for rotary_interleaved=false
2024-07-26 07:09:53 +00:00
PoYen, Chen
d41ff70db5
Enlarge rotary_dim limit (8 -> 16)
2024-07-26 06:51:24 +00:00
trixirt
733f33af78
Introduce cmake USE_GLIBCXX_ASSERTIONS option (#1404)
...
A standard option in Fedora packaging that is used to check
the correctness of c++ use of the standard c++ library.
Signed-off-by: Tom Rix <trix@redhat.com>
Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>
2024-07-25 19:28:17 -07:00
zjing14
105bd708c7
Add rotating buff for gemm_multi_d (#1411)
...
* add rotating_buff for gemm_multi_d
* format
* Update flush_cache.hpp
* Update gtest.cmake
---------
Co-authored-by: Jing Zhang <jizhan@fb.com>
Co-authored-by: Haocong WANG <haocwang@amd.com>
2024-07-25 23:21:21 +08:00
dependabot[bot]
1208082e53
Bump rocm-docs-core from 1.5.1 to 1.6.0 in /docs/sphinx (#1416)
...
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.5.1 to 1.6.0.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.5.1...v1.6.0)
---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-24 22:56:29 -07:00
Andriy Roshchenko
4a8a1befd5
Adding more instances of grouped convolution 3d forward for FP8 with ConvScale+Bias element-wise operation. (#1412)
...
* Add CMakePresets configurations.
* Add binary elementwise ConvScaleAdd and an example.
* Numerical verification of results.
Observed significant irregularities in F8 to F32 type conversions:
```log
ConvScaleAdd: float=145.000000 f8_t=160.000000 e=144.000000
ConvScaleAdd: float=97.000000 f8_t=96.000000 e=104.000000
ConvScaleAdd: float=65.000000 f8_t=64.000000 e=72.000000
```
* Implemented ConvScaleAdd + Example.
* Add ConvScale+Bias Instances
* Add Client Example for ConvScale+Bias
* Fix number of bytes in an example.
* Cleanup.
2024-07-24 15:49:55 -05:00
Bartłomiej Kocot
ffabd70a15
Add support for half_t and bfloat to reduction operations (#1395)
...
* Add support for half_t and bfloat to reduction operations
* Fix bhalf convert
* Further bf16 fix
2024-07-24 12:12:37 -05:00
dependabot[bot]
33b2a2bdf5
Bump rocm-docs-core from 1.5.0 to 1.5.1 in /docs/sphinx (#1414)
...
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.5.0 to 1.5.1.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.5.0...v1.5.1)
---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-24 07:10:50 -07:00
PoYen, Chen
4280a07d2a
Refine pipeline padding settings
2024-07-24 11:37:56 +00:00
PoYen, Chen
f053ae2b5b
Add missing init code
2024-07-24 07:12:06 +00:00
PoYen, Chen
bd28e96425
Remove no-longer used method in pipeline
2024-07-24 06:59:45 +00:00
PoYen, Chen
c50c36a07f
Re-arrange the 'set +x' command
2024-07-24 06:56:53 +00:00
PoYen, Chen
8fb015b83f
Remove more debug statements
2024-07-24 06:48:40 +00:00
PoYen, Chen
5c733dc568
Remove debug statements
2024-07-24 06:47:52 +00:00
PoYen, Chen
2126d4d88d
Add append-kv smoke tests
2024-07-24 06:35:53 +00:00
PoYen, Chen
f7fb3fafaa
Allow applying RoPE only on Q (without append KV)
2024-07-24 06:26:00 +00:00
PoYen, Chen
08b4e8a125
Fix wrong rope key for fp8 pipeline
2024-07-24 06:06:07 +00:00
PoYen, Chen
d84c915549
Disable host verification if API does not exist
2024-07-24 06:02:41 +00:00
PoYen, Chen
8a73d334b8
Rename utility function
2024-07-24 05:19:05 +00:00
PoYen, Chen
d59e098ec4
Fix wrong pipeline
2024-07-24 05:17:57 +00:00
PoYen, Chen
29c9b650b5
Align commit message to the real comment
2024-07-24 05:14:00 +00:00
PoYen, Chen
c7b7b44883
Add comment for why I just use 't' for all padding flags
2024-07-24 05:13:16 +00:00
PoYen, Chen
59e1d9b84f
Shift rotary_cos/rotary_sin by cache_seqlen_k
2024-07-24 05:06:47 +00:00