composable_kernel/include/ck/tensor_operation/gpu/device
rocking5566 82d7d9938f Hotfix binary elementwise (for broadcast on fastest axis) (#254)
* Support different length of ScalarPerVector

* Add example of broadcast on fastest axis

* Typo

* Refine fastest example

* Add dimension check

* Modify fastest broadcast example to 3d

* Require users to specify scalarPerVector explicitly

* 1. Add CScalarPerVector
2. Broadcast on the fastest axis is not the only case that needs scalarPerVector set to 1

* Rename var

* Move IsScalarPerVectorValid() inside IsSupportedArgument()

* Separate GridDesc_M0 into A, B and C

* Rename var

* Rename var of length

Co-authored-by: rocking <chunylai@amd.com>
2022-05-25 11:17:27 -05:00
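
The commit notes above mention a dimension check and a scalarPerVector validity check that runs inside IsSupportedArgument(). A minimal sketch of that idea follows, written as standalone C++ rather than the library's actual code. The names `is_scalar_per_vector_valid`, `is_supported_argument`, `OperandLayout`, and `m_per_thread` are hypothetical, and the rule itself is an assumption: a vector width above 1 is accepted only when the operand has unit stride on the fastest (last) axis and the per-thread work is divisible by that width.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not the library's signature).
// Assumption: vectorized access is only safe when the last axis is contiguous.
bool is_scalar_per_vector_valid(bool fastest_axis_contiguous,
                                int m_per_thread,
                                int scalar_per_vector)
{
    if(scalar_per_vector < 1)
        return false;
    if(!fastest_axis_contiguous)
        return scalar_per_vector == 1; // broadcast or strided last axis
    return m_per_thread % scalar_per_vector == 0;
}

// Hypothetical description of one operand (A, B, or C).
struct OperandLayout
{
    std::vector<long> strides; // a stride of 0 marks a broadcast axis
    int scalar_per_vector;
};

// Hypothetical aggregate check: dimension check first, then one
// scalarPerVector check per operand.
bool is_supported_argument(const std::vector<long>& lengths,
                           const OperandLayout& a,
                           const OperandLayout& b,
                           const OperandLayout& c,
                           int m_per_thread)
{
    const std::size_t ndim = lengths.size();
    if(ndim == 0)
        return false;

    const OperandLayout* operands[] = {&a, &b, &c};
    for(const OperandLayout* op : operands)
    {
        // Dimension check: every operand needs one stride per length.
        if(op->strides.size() != ndim)
            return false;

        const bool contiguous = op->strides.back() == 1;
        if(!is_scalar_per_vector_valid(contiguous, m_per_thread, op->scalar_per_vector))
            return false;
    }
    return true;
}
```

With lengths {2, 3, 4}, an operand B broadcast along the last axis has stride 0 there, so under these assumptions it passes only with a scalarPerVector of 1, while A and C with unit stride can use a wider vector.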