Mirror of https://github.com/ROCm/composable_kernel.git (synced 2026-05-12 01:10:17 +00:00)
* Squashed 'src/composable_kernel/' content from commit f6edda611

  git-subtree-dir: src/composable_kernel
  git-subtree-split: f6edda6119

* add solver ConvIgemmFwdV6r1DlopsNchwKcyxNkhw; rename static ck source files

* Squashed 'src/composable_kernel/' changes from f6edda611..5781adf5c

  5781adf5c Update develop (#5) (#6)
  97e6d514f Merge pull request #4 from ROCmSoftwarePlatform/separate_online_compile
  7b1ec41e5 refactor
  49c33aaea refactor
  54b3e73d1 rename

  git-subtree-dir: src/composable_kernel
  git-subtree-split: 5781adf5cf

* fix
* refactor
* remove online compilation from CK
* refactor
* fix
* add ctest
* tidy
* add tidy
* tidy
* tidy
* tidy
* tidy
* tidy
* tidy
* tidy
* tidy
* tidy
* add c-style pointer cast
* vector/scalar pointer cast use c-style pointer cast instead of reinterpret_cast
* fix clang warning suppression
* tidy
* suppress cppcheck
* fix enum issue
* revert changes to hip build
* fix kernel filename
* update CK build script
* rename
* rename
* make inner product compatible on gfx900
* Update src/include/miopen/solver/ck_utility_common.hpp

  Co-authored-by: JD <Jehandad.Khan@amd.com>

* compiler parameter use stream
* use int instead of index_t in kernel wrapper
* DynamicBuffer, StaticBuffer, amd_buffer_load support customized value for invalid element
* refactor
* refactor
* change cmakelist
* change ck common utility
* fix

* Squashed 'src/composable_kernel/' changes from 5781adf5c..31b403526

  31b403526 Merge pull request #16 from ROCmSoftwarePlatform/develop
  b62bf8c3f Merge pull request #14 from ROCmSoftwarePlatform/miopen_downstream_init_integration
  ccc4a1d36 Merge pull request #8 from ROCmSoftwarePlatform/miopen_downstream_init_integration
  67ad47e7c refactor
  16effa767 refactor
  a91b68dfc DynamicBuffer, StaticBuffer, amd_buffer_load support customized value for invalid element
  2cbabbba5 use int instead of index_t in kernel wrapper
  0834bc763 compiler parameter use stream
  f2ac7832c make inner product compatible on gfx900
  4e57b30a6 rename
  c03045ce2 rename
  b2589957f update CK build script
  2c48039d0 fix kernel filename
  d626dccc9 fix enum issue
  643ebd4f3 tidy
  ddd49ec9e fix clang warning suppression
  4f566c622 vector/scalar pointer cast use c-style pointer cast instead of reinterpret_cast
  172036d72 add c-style pointer cast
  76f313193 tidy
  d18428901 tidy
  f885c131d tidy
  80120f0a0 tidy
  c3efeb5e2 tidy
  56fc0842b tidy
  54fba515b tidy
  e62bae7a4 tidy
  24c872894 add tidy
  61487e0a0 fix
  ae98b52ad remove online compilation from CK
  cb9542131 refactor
  73ca97015 Merge commit '437cc595c6e206dfebb118985b5171bbc1e29eab' into composable_kernel_init_integration_v3
  3b8664611 Merge pull request #7 from ROCmSoftwarePlatform/master
  d09ea4f4e Update develop (#5)
  3d32ae940 add solver ConvIgemmFwdV6r1DlopsNchwKcyxNkhw; rename static ck source files

  git-subtree-dir: src/composable_kernel
  git-subtree-split: 31b403526e

* Tiny fix in using data type template parameters in blockwise and direct_threadwise kernel
* Fix with regard to implementing GetZeroVal() in both kernel and host
* Avoid converting to compType from dstDataType before writing the output value
* Add half_t support to NumericLimits and make constexpr GetZeroVal() of binary operator
* Add CONSTANT decorator for descriptor read buffer
* Use get_thread_local_1d_id() for thread local Id
* Rename GetZeroVal() to GetReductionZeroVal() in the kernels
* Remove constexpr from initialized zeroVal and tiny fix in reduction_operator.hpp
* Occasional tiny simplification and update in the kernel files
* Update in src/reducetensor.cpp for consistent IDs passing to the kernel
* Update to re-order tensor dimensions on the host, split second_call kernel wrapper files and simplify reduce_all kernel wrappers
* Update to remove OpenCL tidy checking failures
* Small updates in src/reducetensor.cpp
* Update for better readability
* Remove unused code and unneeded template parameters in the kernel wrappers

Co-authored-by: Chao Liu <chao.liu2@amd.com>
Co-authored-by: JD <Jehandad.Khan@amd.com>