zjing14
|
1cc36ba5fb
|
Add contraction_multi_abd (#972)
* add gridwise_multi_abd
* move element_op into RunRead
* merge element_wise op with data read
* add multiABD example
* allow packed elementwise_op
* changed example
* clean
* clean
* add is_detected
* fix
* minor fix
* add scaleAdd_vec4 example
* init commit for contraction_multi_ABD
* add examples
* add examples of multiA and broadcast
* update example
* fixed comments
* Update cmake-ck-dev.sh
* Update cmake-ck-dev.sh
* Add comments into the example
* Update CMakeLists.txt
---------
Co-authored-by: Jing Zhang <jizha@amd.com>
|
2023-10-17 20:17:58 -05:00 |
|
zjing14
|
2ce9b56c64
|
add vector_type support into thread_copy_v3r1 (#969)
* add vector_type support into thread_copy_v3r1
* remove unnecessary type_convert
* fixed datatype
* fixed dataType
* changed API with is_packx_invocable
* changed example
* add missing cmake file
* fixed ci
* fixed cmake
---------
Co-authored-by: Jing Zhang <jizha@amd.com>
|
2023-10-13 15:11:43 -05:00 |
|
Illia Silin
|
4daedf8ca5
|
Revert "Add support for mixed precision in contraction scale and bilinear" (#967)
* Revert "Add support for mixed precision in contraction scale and bilinear (#936)"
This reverts commit f07485060e.
* revert commits #957 and #960
|
2023-10-05 14:58:23 -07:00 |
|
Illia Silin
|
59dbb01fd1
|
get rid of gfx900/906, set rocm5.7 as default (#958)
|
2023-10-02 12:01:11 -07:00 |
|
zjing14
|
9d58c42103
|
Contraction multi abd (#957)
* add gridwise_multi_abd
* move element_op into RunRead
* merge element_wise op with data read
* add multiABD example
* allow packed elementwise_op
* changed example
* clean
* clean
* add is_detected
* fix
* minor fix
* add scaleAdd_vec4 example
* init commit for contraction_multi_ABD
* add examples
* add examples of multiA and broadcast
* update example
* fixed comments
* Update cmake-ck-dev.sh
* Update cmake-ck-dev.sh
* Add comments into the example
---------
Co-authored-by: Jing Zhang <jizha@amd.com>
|
2023-10-02 09:18:36 -05:00 |
|