* start add example
* add multiple d fp16 example
* device transfer elementwiseop to gridwise
* gridwise add multiple d
* change example for multiple d
* fix spill registers
* fix for passthrough element op
* fix int8 overflow
* change example file name
* add instance for dl multiple d
* example add DsDataType
* remove grouped_convolution_forward_dl.hpp
* add head file(was deleted before)
* fix not support device issue
* format
* remove passthrough check
Co-authored-by: letaoqin <letaoqin@amd.com>
* add device of dl
* fix k1 of GridwiseGemmDl_km_kn_mn_v1r3
* init version for dl conv
* add example(init)
* result right
* disable elementwise operation
* check parameters
* add fp32,int8 example and change check code
* change deive file and class name
* add check vector access of C
* add instance
* add to ckProfiler
* add Filter1x1Pad0 instances
* fix ignore error
* fix for CI
Co-authored-by: letaoqin <letaoqin@amd.com>