mirror of https://github.com/ROCm/composable_kernel.git
synced 2026-05-27 08:25:46 +00:00
[CK_TILE] FMHA BWD Enable Tile 16x192 (#2741)
* 16x192
* Use buffer_load_lds for lse/d
* Dispatch & cleanup
* Avoid zeroing dq & fix
* fix
[ROCm/composable_kernel commit: ead4447b20]
@@ -803,7 +803,6 @@ bool run(const ck_tile::ArgParser& arg_parser)
o_buf.ToDevice(o_host.data());
lse_buf.ToDevice(lse_host.data());
dq_buf.SetZero();
dbias_buf.SetZero();
dq_acc_buf.SetZero();