Commit Graph

1465 Commits

Author SHA1 Message Date
PoYen, Chen
e688d99495 Merge remote-tracking branch 'origin/develop' into feature/fmha-fwd-appendkv 2024-07-26 07:14:59 +00:00
PoYen, Chen
94f430de32 Update rotary_dim range in smoke_test_fwd.sh 2024-07-26 07:13:25 +00:00
PoYen, Chen
c1c50ee498 Enlarge KPerThread for rotary_interleaved=false 2024-07-26 07:09:53 +00:00
PoYen, Chen
d41ff70db5 Enlarge rotary_dim limit (8 -> 16) 2024-07-26 06:51:24 +00:00
trixirt
733f33af78 Introduce cmake USE_GLIBCXX_ASSERTIONS option (#1404)
A standard option in Fedora packaging that is used to check
the correctness of C++ code's use of the C++ standard library.

Signed-off-by: Tom Rix <trix@redhat.com>
Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>
2024-07-25 19:28:17 -07:00
zjing14
105bd708c7 Add rotating buff for gemm_multi_d (#1411)
* add rotating_buff for gemm_multi_d

* format

* Update flush_cache.hpp

* Update gtest.cmake

---------

Co-authored-by: Jing Zhang <jizhan@fb.com>
Co-authored-by: Haocong WANG <haocwang@amd.com>
2024-07-25 23:21:21 +08:00
dependabot[bot]
1208082e53 Bump rocm-docs-core from 1.5.1 to 1.6.0 in /docs/sphinx (#1416)
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.5.1 to 1.6.0.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.5.1...v1.6.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-24 22:56:29 -07:00
Andriy Roshchenko
4a8a1befd5 Adding more instances of grouped convolution 3d forward for FP8 with ConvScale+Bias element-wise operation. (#1412)
* Add CMakePresets configurations.

* Add binary elementwise ConvScaleAdd and an example.

* Numerical verification of results.

Observed significant irregularities in F8 to F32 type conversions:
```log
ConvScaleAdd: float=145.000000   f8_t=160.000000    e=144.000000
ConvScaleAdd: float=97.000000   f8_t=96.000000    e=104.000000
ConvScaleAdd: float=65.000000   f8_t=64.000000    e=72.000000
```

* Implemented ConvScaleAdd + Example.

* Add ConvScale+Bias Instances

* Add Client Example for ConvScale+Bias

* Fix number of bytes in an example.

* Cleanup.
2024-07-24 15:49:55 -05:00
Bartłomiej Kocot
ffabd70a15 Add support for half_t and bfloat to reduction operations (#1395)
* Add support for half_t and bfloat to reduction operations

* Fix bhalf convert

* Next fix bf16
2024-07-24 12:12:37 -05:00
dependabot[bot]
33b2a2bdf5 Bump rocm-docs-core from 1.5.0 to 1.5.1 in /docs/sphinx (#1414)
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.5.0 to 1.5.1.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.5.0...v1.5.1)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-24 07:10:50 -07:00
PoYen, Chen
4280a07d2a Refine pipeline padding settings 2024-07-24 11:37:56 +00:00
PoYen, Chen
f053ae2b5b Add missing init code 2024-07-24 07:12:06 +00:00
PoYen, Chen
bd28e96425 Remove no-longer used method in pipeline 2024-07-24 06:59:45 +00:00
PoYen, Chen
c50c36a07f Re-arrange the 'set +x' command 2024-07-24 06:56:53 +00:00
PoYen, Chen
8fb015b83f Remove more debug statements 2024-07-24 06:48:40 +00:00
PoYen, Chen
5c733dc568 Remove debug statements 2024-07-24 06:47:52 +00:00
PoYen, Chen
2126d4d88d Add append-kv smoke tests 2024-07-24 06:35:53 +00:00
PoYen, Chen
f7fb3fafaa Allow applying RoPE to Q only (without append KV) 2024-07-24 06:26:00 +00:00
PoYen, Chen
08b4e8a125 Fix wrong rope key for fp8 pipeline 2024-07-24 06:06:07 +00:00
PoYen, Chen
d84c915549 Disable host verification if API not exist 2024-07-24 06:02:41 +00:00
PoYen, Chen
8a73d334b8 Rename utility function 2024-07-24 05:19:05 +00:00
PoYen, Chen
d59e098ec4 Fix wrong pipeline 2024-07-24 05:17:57 +00:00
PoYen, Chen
29c9b650b5 Align commit message to the real comment 2024-07-24 05:14:00 +00:00
PoYen, Chen
c7b7b44883 Add comment for why I just use 't' for all padding flags 2024-07-24 05:13:16 +00:00
PoYen, Chen
59e1d9b84f Shift rotary_cos/rotary_sin by cache_seqlen_k 2024-07-24 05:06:47 +00:00
PoYen, Chen
a4da1e7f22 Remove RoPEComputeDataType type alias 2024-07-24 04:45:28 +00:00
PoYen, Chen
251f8cfea9 Merge branch 'develop' into feature/fmha-fwd-appendkv 2024-07-24 04:16:35 +00:00
PoYen, Chen
3348131699 Fix wrong data type for Q rotary_cos/rotary_sin 2024-07-24 04:10:43 +00:00
PoYen, Chen
5ea60715ea Update host/device specifiers 2024-07-24 03:45:19 +00:00
PoYen, Chen
6f95239229 Use different rotary_cos/rotary_sin distr for Q/Knew 2024-07-24 03:40:29 +00:00
PoYen, Chen
47a74f282d Extract Q/Knew vector size to helper methods 2024-07-24 03:23:18 +00:00
PoYen, Chen
eb4ea3ac2a Fix wrong rotary_cos/rotary_sin memory size for Q 2024-07-23 16:22:25 +00:00
Haocong WANG
d22713a719 disable bad instance (#1410) 2024-07-23 09:05:03 -07:00
PoYen, Chen
85bac93951 Fix wrong index into knew_host/vnew_host 2024-07-23 15:31:15 +00:00
PoYen, Chen
b11f92dc4c Fix wrong shape of knew_host/vnew_host 2024-07-23 14:52:42 +00:00
PoYen, Chen
ca4b208b60 Fix wrong grid size 2024-07-23 14:20:52 +00:00
PoYen, Chen
52b47810bb Rename more tile size constants 2024-07-23 09:30:05 +00:00
PoYen, Chen
99c1d463de Align naming of some tile size constants 2024-07-23 09:24:38 +00:00
PoYen, Chen
ce5e0f1d67 Re-order parameters 2024-07-23 09:02:41 +00:00
PoYen, Chen
fb80c7b2cb Extract rotary embedding logic out 2024-07-23 08:51:59 +00:00
PoYen, Chen
2192bbc68a Rename RotaryEmbeddingEnum 2024-07-23 07:50:50 +00:00
PoYen, Chen
d4606cf3c3 Rename header 2024-07-23 07:45:25 +00:00
PoYen, Chen
b275732128 Remove always true static_assert() 2024-07-23 07:25:50 +00:00
PoYen, Chen
eb649a2f25 Move thread locating logic into policy 2024-07-23 07:21:20 +00:00
PoYen, Chen
0e5cb6f913 Skip code if the number of blocks is more than needed 2024-07-23 06:53:24 +00:00
PoYen, Chen
7124f3eda5 Add make_tile_window() for adding distribution only 2024-07-23 06:52:38 +00:00
PoYen, Chen
0925c0e941 Use better naming for tile indices 2024-07-23 06:40:53 +00:00
PoYen, Chen
bc7c7ee0c5 Fix wrong knew/vnew appending positions 2024-07-23 04:46:53 +00:00
PoYen, Chen
56df4d6397 Remove debug print code in kernel 2024-07-23 04:01:55 +00:00
PoYen, Chen
48c70720b5 Apply RoPE to q_tile 2024-07-23 03:54:11 +00:00