Files
composable_kernel/include/ck/tensor_operation/gpu/device/impl
Bartłomiej Kocot 4ec5c52a0c Add Grouped Conv Fwd Large Tensor kernel (#1432)
* Support 64 bit indexing

* Add new grouped conv fwd kernel for large tensors

* Add large tensor instances

* Fixes for transform conv to gemm

* Fixes

* fixes

* Remove unneeded instances

* Fix examples

* Remove unneeded ds arrays

* Fix tests

* Add 2GB check in gridwise dl

* Fixes
2024-08-06 10:06:10 +02:00