mirror of
https://github.com/ROCm/composable_kernel.git
synced 2026-05-24 14:54:47 +00:00
Backward weight v4r4r2 with xdlops (#18)
* start
* modify transformat
* modify device convolution
* modify host
* added host conv bwd and wrw
* remove bwd, separate wrw
* clean
* hacall k to zero
* out log
* fixed
* fixed
* change to (out in wei)
* input hack
* hack to out
* format
* fix by comments
* change wei hacks (wei transform has not merge)
* fix program once issue
* fix review comment
* fix vector load issue
* tweak

Co-authored-by: ltqin <letaoqin@amd.com>
Co-authored-by: Jing Zhang <jizhan@amd.com>
Co-authored-by: Chao Liu <chao.liu2@amd.com>