zjing14 | ffaff83a2f |
3d grouped conv fwd with input/output fp16 and comp fp8 (#931)
* add f8 comp instance
* fixed
* fixed comments
* rename
* fixed dtype
* format
* fixed CI
* fixed ci
* add missing ComputeType
* fixed cit
* fixed
* Update cmake-ck-dev.sh
---------
Co-authored-by: Jing Zhang <jizha@amd.com>
[ROCm/composable_kernel commit: e921e1f08d]
| 2023-10-03 20:04:26 -05:00 |
zjing14 | 973fc655fd |
Fixed Weight layout of grouped_conv 3d fwd (#743)
* Changed wei layout
* changed layout for examples
* fixed client example
---------
Co-authored-by: root <root@ctr-ubbsmc15.amd.com>
[ROCm/composable_kernel commit: 309b1c6461]
| 2023-06-15 10:19:33 -05:00 |
Adam Osewski | 85acf7ac2f |
Conv3D FWD BWD WRW fp16 fp32 client examples (#559)
* Conv3d bwd weight client example.
* Update year in license
* Convolution bwd data 3D fp16/fp32 client example.
* Client example for convnd fwd fp16 fp32
* clang-format
* Review remarks.
* Fix compiler err.
* Update data layout to standard one.
* Add conv 3d fwd NDHWGC instances
* clang-format
* Conv3d fwd NDHWGC instances.
---------
Co-authored-by: Adam Osewski <aosewski@amd.com>
Co-authored-by: zjing14 <zhangjing14@gmail.com>
[ROCm/composable_kernel commit: e9fd122889]
| 2023-02-15 11:16:47 -06:00 |