Enrico Degregori
|
3d29bff2f0
|
Wmma support for multiple ABD GEMM (#2803)
* multi_abd wmma support:
- Add multiple A and B support to multiple D implementation (gridwise level)
- Add multi_abd GEMM (device level)
- Add instances (xdl parity)
- Add tests (both xdl and wmma)
- Add examples
- Add ckProfiler support (both xdl and wmma)
* Fix bug in device print function
* Fix unused template parameter
* Fix batched gemm for multiABD gridwise implementation
* Fix gemm_universal_reduce with multiABDs gridwise implementation
---------
Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>
|
2025-09-22 18:49:06 -07:00 |
|