* Bump commit ref for TheRock in workflows
* Update to more recent commit (could also `rm` the patch)
* Revert "Update to more recent commit (could also `rm` the patch)"
This reverts commit 4b9f4952ea.
* Rm patch that no longer applies
* Fix post_build_upload flag name
* Fix artifact_group plumbing for setup test env
[ROCm/composable_kernel commit: aa1fb29aa1]
* Extend AK1 / BK1 support:
- Add support for AK1 != BK1
- Add support for AK1, BK1 > 8
- Introduce KInner template parameter for pipelines when loading multiple tiles with one instruction
* fix clang format
[ROCm/composable_kernel commit: 1c544abf57]
* Add gtests for compiler CI for faster testing
* Add changes to have a custom target
* Add a gtest suite for gemm kernel for running CI tests with compiler mode
* Fix Clang error (EOL)
* Removed compiler subfolder from CMake
* Add gtest suite for gemm kernel
* Disable failed tests
* Fix build errors
* Resolved PR comments
* Update shape for persistent gemm kernel test
* Seperated types by H/W archs
* Made changes to persistent types
* Fix persistent build failure issue
---------
Co-authored-by: Thomas Ning <Thomas.Ning@amd.com>
[ROCm/composable_kernel commit: d5746dd120]
* Add device operation to conv signature. Use unions to hold conv layouts and device operations.
* Add predicates for all device op instances.
* Use the device op signature for validation.
* Fix ckb CMakeLists.txt file for tests.
* Fix building CK Builder instance traits after the introduction of direct load template parameter in CK.
* Fix clang-formatting.
* add device_grouped_conv_fwd_dl_multiple_d_nhwc_kyxc_nhwk
* Add full DL configurability with Option A implementation
- Added 5 DL descriptor structs (39 configurable parameters)
- Added 10 C++20 concepts for type-safe validation
- Updated factory to read all parameters from descriptors
- Updated test helper to populate all descriptors
- All tests passing (13/13 including 3 new DL tests)
* Add factory and test support for DeviceGroupedConvFwdMultipleD_Xdl_CShuffle_Large_Tensor
- Add factory specialization for Large_Tensor device operation (conv_factory.hpp lines 1145-1265)
- Add macro collision workaround using pragma push/pop (conv_factory.hpp lines 43-51)
- Add test helper function run_test_DeviceGroupedConvFwdMultipleD_Xdl_CShuffle_Large_Tensor
- Add builder test file test_ckb_conv_fwd_2d_large_tensor_fp16.cpp with 2 test cases
- Update CMakeLists.txt to include new test file
- Reuse existing ConvAlgorithm_DeviceGroupedConvFwdMultipleABD_Xdl_CShuffle descriptor
- Map all 42 template parameters identical to regular XDL CShuffle
- All 15 builder tests passing including 2 new Large_Tensor tests
Completes Task 350: All 4 forward convolution device operations now supported in CK Builder.
* Update copyright headers to new format
- Change copyright format to: Copyright (C) Advanced Micro Devices, Inc., or its affiliates.
- Reorder headers: Copyright first, then SPDX-License-Identifier
- Updated files:
* experimental/builder/test/conv/test_ckb_conv_fwd_2d_dl_fp16.cpp
* experimental/builder/test/conv/test_ckb_conv_fwd_2d_large_tensor_fp16.cpp
* experimental/builder/include/ck_tile/builder/device_op_types.hpp
* fix c++ 18 format
* Fix clang-format-18 error in device_op_types.hpp
---------
Co-authored-by: Ville Pietilä <ville.pietila@amd.com>
Co-authored-by: Ville Pietilä <188998872+vpietila-amd@users.noreply.github.com>
[ROCm/composable_kernel commit: 5f3cae3e28]
* Introduces the new partitioner to implement the reduction StreamK kernel
* Add more doc text to functions
* Add persistent-dp option to streamk example
* Update example/ck_tile/40_streamk_gemm/README.md
[ROCm/composable_kernel commit: 5abe4109e0]
* Update copyright messages.
Copyright messages should no longer include a year. This PR updates all 38 source files to the new format.
* Switch to (C) from unicode copyright symbol.
The unicodein comments was causing compilation errors.
[ROCm/composable_kernel commit: 0be0288f58]
* Add backward weight instance traits for xdl cshuffle.
To keep instance test file sizes reasonable, we start a new test_bwd_weight_instances_traits.cpp test file.
* Fix copyright notices.
* Remove (c) symbol, replace with (C).
Having UTF-8 in source caused an error with code generation.
[ROCm/composable_kernel commit: 6dbee64886]