* experiment with config file
* experiment with version.h config
* add more info to version.h
* minor updates
* minor updates
* fix case where DTYPE is not used
* large amount of files but minor changes
* remove white space
* minor changes to add more MACROs
* fix cmakedefine01
* fix issue with CK internal conflict
* fix define and define value
* fix clang-format
* fix formatting issue
* experiment with cmake
* clang format v12 to be consistent with miopen
* avoid clang-format for config file
* allow building CK for specific data types
* add CI build and test stage on Naiv3x without some int8 instances
* add missing gemm fp16 instances
* add the changes to the missed cmake file
* add empty lines at end of source files
* Do not build quantization client example on navi3 in CI
* disable batched_gemm_multi_d_int8 instances with DTYPES
* disable device_conv2d_bwd_data_instance with DTYPES
* fix ckprofiler for conv_bwd_data for int8
* properly isolate the conv_bwd_data int8 instances
* remove empty line
* init commit of convnd bwd data
* begin compiling example
* have a first version that produce a right result
* refine device level launch kernel code
* add more instances in example and get right results
* clang-format
* format example file
* add more instances
* fix instances
* adding conv_bwd_data multile_d
* adding conv_bwd_data multile_d
* adding conv_bwd multiple d
* adding conv_bwd multiple d
* adding conv_bwd multiple d
* refactor
* refactor
* adding conv bwd data multiple d
* adding conv bwd data multiple d
* adding conv bwd data multiple d
* adding conv bwd data multiple d
* adding conv bwd data multiple d
* adding conv bwd data multiple d
* adding conv bwd data multiple d
* refactor
* update conv fwd's bias impl
* refactor
* reorg file
* clean up cmake
* clean
* clean
* clean
Co-authored-by: Chao Liu <lc.roy86@gmail.com>
Co-authored-by: Chao Liu <chao.liu2@amd.com>
* Extract base class for elementwise
* Refactor interface of DeviceGemmReduce. Do not use tuple in interface
* [What] Rename d into reduce in gemm + reduction related code
[Why] Prepare to add d term for add
* Unify base class of gemm + reduce and gemm + bias + add + reduce
* 1. Rename gemm_bias_add_reduce for external api
2. Refine cmake
* Add normalize device operation
* [What] Reorder the argument
[Why] Because d0 is also the input of c.
* Add type string
* Add example of gemm_bias_add_layernorm via external api
* Refactor example code
* clang-format
* Fix compile error
* clang-format
* Add external api for gemm_add_add_layernorm and normalize
* Add client example
* clang-format