Rostyslav Geyyer
a61e73bc56
Add instances for conv_scale with fp8@bf8->fp8 (#1220)
* Update device op api to support BComputeType
* Add example
* Add instances
* Add profiler mode
* Add client example
* Update copyright year
* Add BComputeType check
* Fix compute types
2024-04-03 09:08:08 -05:00
Bartłomiej Kocot
f2398f612d
Introduce multiABD api and deprecate multiD (#1035)
* Introduce multiABD api and deprecate multiD
* Replace multiD with multiABD
* Mark structures as deprecated
* Change doxygen deprecated to note to avoid warnings
2023-11-14 17:00:40 +01:00
zjing14
e921e1f08d
3d grouped conv fwd with input/output fp16 and comp fp8 (#931)
* add f8 comp instance
* fixed
* fixed comments
* rename
* fixed dtype
* format
* fixed CI
* fixed ci
* add missing ComputeType
* fixed cit
* fixed
* Update cmake-ck-dev.sh
---------
Co-authored-by: Jing Zhang <jizha@amd.com>
2023-10-03 20:04:26 -05:00
zjing14
309b1c6461
Fixed Weight layout of grouped_conv 3d fwd (#743)
* Changed wei layout
* changed layout for examples
* fixed client example
---------
Co-authored-by: root <root@ctr-ubbsmc15.amd.com>
2023-06-15 10:19:33 -05:00
Adam Osewski
e9fd122889
Conv3D FWD BWD WRW fp16 fp32 client examples (#559)
* Conv3d bwd weight client example.
* Update year in license
* Convolution bwd data 3D fp16/fp32 client example.
* Client example for convnd fwd fp16 fp32
* clang-format
* Review remarks.
* Fix compiler err.
* Update data layout to standard one.
* Add conv 3d fwd NDHWGC instances
* clang-format
* Conv3d fwd NDHWGC instances.
---------
Co-authored-by: Adam Osewski <aosewski@amd.com>
Co-authored-by: zjing14 <zhangjing14@gmail.com>
2023-02-15 11:16:47 -06:00