* [CK] Add command option instance_index and param_mask to run partial ck test
Many CK test are instance test. it will loop all instance in the instance library. It causes test often out-of-time if we run test on simulator/emulator.
This PR add option instance_index and param_mask to reduce the workload of instance test
instance_index: only run test 1 available instance with specified index.
param_mask: filter the embedded parameter with specified mask
* fix CI error
* fix clang format
---------
Co-authored-by: illsilin_amdeng <Illia.Silin@amd.com>
* add multi embeddings support
* fix format
* optimize sqrt
* add reduce operation
* change to elementwise op
* fix name
* rename
* run ci cd
* format example
* format code
* format code
* Rangify STL algorithms
This commit adapts rangified std::copy(), std::fill() & std::transform()
* Rangify check_err()
By rangifying check_err(), we can not only compare values between
std::vector<>s, but also compare any ranges which have same value
type.
* Allow constructing Tensor<> like a HostTensorDescriptor
* Simplify Tensor<> object construction logics
* Remove more unnecessary 'HostTensorDescriptor' objects
* Re-format example code
* Re-write more HostTensorDescriptor ctor call
* Move kernel implementation files under impl directory.
* Update examples paths.
* Update device kernel impl include paths.
* Update tensor operation instances include paths.
* Update profiler and tests include paths.
* Clang-format
* Update include paths for batched gemm reduce
* Refactor UnitTest ConvNDBwdWeight.
* Refactor fwd and bwd data convND UT.
* Fix used test macro.
* Fix include path.
* Fix include paths.
* Fix include paths in profiler and tests.
* Fix include paths.
Co-authored-by: Adam Osewski <aosewski@amd.com>