* Introduce int4 data type.
* Add unit-tests for int4
* Compile int4 UT only when int4 enabled.
* clang-format
Co-authored-by: Adam Osewski <aosewski@amd.com>
* allow selecting compiler version
* fix typo
* add Wno-deprecated flag for google tests
* change git repo, fix qa log files names
* change the git clone syntax
* use Omkar's git credentials
* try to use jenkins as git user
* try using illsilin username for gerrit repo with ssh key
* try new gerrit authorization
* change ssh key syntax
* try another way of passing ssh key to docker
* add mount ssh in dockerfile
* create .ssh folder
* move ssh-keyscan to later
* get rid of npm call
* build first docker image on master
* check the contents of the .ssh folder
* try replacing omkars creds with gerrit creds
* use open repo, clean up changes
* get rid of ssh default argument
* Switch to standard ROCm packaging
* Revert .gitignore changes
* install new rocm-cmake version
* update readme
Co-authored-by: illsilin <Illia.Silin@amd.com>
Co-authored-by: Chao Liu <chao.liu2@amd.com>
* Turning compare warnings on
* Cleaning part I
* Cleaning part II
* Explicit static_cast to ck::type_convert
* Resolving large tensor size issue.
* format
* revert change to tensor descriptor; promote lementSpaceSize to 64bit
* use integer value for GEMM test
* Review remarks
* Review remarks + issues with (un)signed arithmetic
* Format fix
* Format
* Clang-format.
* fix 2gb limit issue
Co-authored-by: Chao Liu <chao.liu2@amd.com>
Co-authored-by: Adam Osewski <aosewski@amd.com>
* Use googletest for tests. Add conv2d_fwd UT.
* Add conv1D/3D to gtest UT.
* Fix: not duplicate test with CTest.
* Convert more tests to googltests.
* Fix: GIT_SHALLOW is not allowed for git commit hash.
* Clang-format
* use integer value for GEMM test
Co-authored-by: Adam Osewski <aosewski@amd.com>
Co-authored-by: Chao Liu <chao.liu2@amd.com>
Co-authored-by: Chao Liu <lc.roy86@gmail.com>
* Add online-compiling facility
* Synchronize from fwd-v4r5 and implement host interfaces to call conv-fwd v4r4/v4r5 using on-line compiling method
* Tiny adjustment to time reporting
* Use object assignment to replace explicit bytes copying in the first kernel of v4r4/v4r5
* Use single thread to assign descriptor object to device memory
* Adjust to the workload assignment of the two kernels of v4r4 (experimental)
* Revert "Adjust to the workload assignment of the two kernels of v4r4 (experimental)"
This reverts commit eb38461456bb0c82b6c0d32cdd616e181907e20c.
* Update to make constexpr for generating descriptor types in kernel 2 of dynamic conv-fwd v4r4
* Update to dynamic conv-fwd v4r4 online-compiling
* Update to dynamic conv-fwd v4r5 online-compiling (result not accurate)
* Tiny update to driver/CMakeLists.txt
* clang-format
* Tiny comments change
* Add env OLC_DUMP_SAVE_TMP_DIR to support saving of temperary dir
* Fwd v4r5 olc perf (#39)
* added hip-clang flags that fix perf issue of online compilation
* fix bug for olc fwd-v4r5-nchw
* Move constexpr and type reference statements out of the function body in conv-fwd v4r4/v4r5 kernel wrapper
* Remove printing in hip_build_utils.cpp
* Update to root CMakeLists.txt
* Revert "Move constexpr and type reference statements out of the function body in conv-fwd v4r4/v4r5 kernel wrapper"
This reverts commit 3d2c5d8ecdd8298b72d127110500ed5b38d9835c.
Co-authored-by: Chao Liu <chao.liu2@amd.com>
Co-authored-by: Chao Liu <lc.roy86@gmail.com>
Co-authored-by: root <root@dc-smc-18.amd.com>