## Motivation
Fixing some of the CK CI issues
## Technical Details
fixing paths to dockerfiles and scripts;
moving codegen tests to separate stage (collides with main build since
you must call cmake from same folder but different options);
fixing a couple of clang compilation issues with staging compiler;
## Motivation
- git diff needs access to reference repo
## Technical Details
- mount reference repo path into docker for selective_test_filter.py to
access
## Test Plan
- tested in MICI
## Test Result
- launch_tests.sh ran successfully
## Motivation
Pipelines were failing on Math CI status check.
## Technical Details
For the success case, we just changed the config in Jenkins to use a
proper app token and no code changes were required. However, the failure
case would not have worked as coded, so we needed to move that outside
of the `rocmnode()` block.
## Test Plan
I removed all of the CI in one of the commits to quickly test, and then
added it back. Got a successful "success" message and "failure" message
produced
## Motivation
- Corrects path to script due to superrepo migration
- Forces all tests to run by default
## Technical Details
- now in /projects/composablekernel
---------
Co-authored-by: illsilin_amdeng <Illia.Silin@amd.com>
## Motivation
- Maintain a reference repo on slave nodes that speeds up any
clone/checkout operations
## Technical Details
- clone a ref repo if it does not exist
- update ref repo if it does exist
- checkout after ref repo is updated
- eliminates double clone
## Test Result
- Initial checkouts succeeded
Implement per-page K/V quantization for paged attention:
- Add KV_BLOCKSCALE enum to BlockAttentionQuantScaleEnum
- Use exp2 shift trick to eliminate explicit P scaling overhead
- Prefetch physical pages offset for KV cache, overlaps with
computations
## Proposed changes
Please describe the motivation behind the pull request, whether it
enables a new feature or fixes a bug. If there are associated pull
requests or issues, please link them to the pull request.
## Checklist
Please put an `x` into the boxes that apply. You can also fill these out
after creating the PR. If you're not sure, please don't hesitate to ask.
- [ ] I have added tests relevant to the introduced functionality, and
the unit tests are passing locally
- [ ] I have added the test to REGRESSION_TESTS list defined at the top
of CMakeLists.txt in tests/CMakeLists.txt, **IF** the test takes more
than 30 seconds to run.
- [ ] I have added inline documentation which enables the maintainers
with understanding the motivation
- [ ] I have removed the stale documentation which is no longer relevant
after this pull request
- [ ] (If this change is user-facing) I have added release notes which
provide the end users with a brief summary of the improvement from this
pull request
- [ ] I have run `clang-format` on all changed files
- [ ] Any dependent changes have been merged
## Discussion
If this is a relatively large or complex change, feel free to start a
discussion by explaining why you chose the solution you did and what
alternatives you considered
---
🔁 Imported from
[ROCm/composable_kernel#3696](https://github.com/ROCm/composable_kernel/pull/3696)
🧑💻 Originally authored by @Jeff-Huang
---------
Co-authored-by: Jeff Huang <chiachi.huang@amd.com>
Co-authored-by: Illia Silin <Illia.Silin@amd.com>
## Motivation
Enable the CK CI after migration from standalone repo.
## Technical Details
Modify the jenkinsfile in projects/composablekernel to update the CI
workflow.
## Test Plan
This is for CK internal testing only.
## Test Result
Set up new CK CI pipeline/dashboard.
## Submission Checklist
- [x] Look over the contributing guidelines at
https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
---------
Co-authored-by: Andrew Clark <andrew.clark@amd.com>
* Added two new failure patterns to detect. Including test function to verify if the patterns are detected
* Modifying pattern match to detect docker login failure. Removed passing tests.
* Removing passing tests. Modifying docker pattern to detect failure
* Removed passing tests
* Removing test logging function
[ROCm/composable_kernel commit: 421b714f13]
This change improves the clang-format CI check to be faster and not
depend on git being available in the build environment.
Changes:
- Use `find` instead of `git ls-files` (no git dependency)
- Check all C++ files: *.h, *.hpp, *.cpp, *.h.in, *.hpp.in, *.cpp.in, *.cl
- Exclude build/ and include/rapidjson directories
- Use parallel processing with 8 cores (-P 8) for ~8x speedup
- Show only errors with unified diff format (-u)
- Clear error messages: "ERROR: <file> needs formatting"
- Preserve original logic: run clang-format only when RUN_CPPCHECK=false,
or run both clang-format and cppcheck when RUN_CPPCHECK=true
Performance:
- Sequential processing: ~93 seconds for 5,899 files
- Parallel with 8 cores: ~12 seconds for 5,899 files
- Per-file processing time: ~15ms
This reduces CI time while maintaining code formatting standards.
[ROCm/composable_kernel commit: 98abfa4ade]
1. Added `-DCK_EXPERIMENTAL_BUILDER=OFF` to the `setup_args` to explicitly disable the experimental builder
2. Added a detailed comment explaining why this is necessary:
- SLES15 is a legacy platform with limited C++20 ecosystem support
- While the ROCm compiler supports C++20, the older system libraries and standard library implementation on SLES15 does not reliably support all C++20 features required by the experimental CK Builder
[ROCm/composable_kernel commit: 2d233c838a]
* memory op changes
* memory op changes
* Fixing TILE_ENGINE_BASIC in Tile Engine
* Removing gfx90a from Tile Engine Run
* [CK TILE ENGINE] increasing ci configs for BASIC case
* Setting RUN_TILE_ENGINE_BASIC_TESTS to ON by default
---------
Co-authored-by: Max Podkorytov <4273004+tenpercent@users.noreply.github.com>
[ROCm/composable_kernel commit: 51027474af]
* solve compiler issue
* solve the gfx950 mfma shuffle regression
* refactor jenkinsfile to handle arch name better
* [CK TILE] set divisor to count of thread along k dimension
* fix the compiler error
* solve degradation
* Finish the multiplies fix
* fix the scales
* solve compilation error
* solve the composes
* solve the error of tile sweeper
* fix the test and example
* fix for gfx950
---------
Co-authored-by: Max Podkorytov <4273004+tenpercent@users.noreply.github.com>
Co-authored-by: illsilin_amdeng <Illia.Silin@amd.com>
Co-authored-by: Cong Ma <congma13@amd.com>
[ROCm/composable_kernel commit: 00c46785a8]
* Removing hard-coded trace filename
* Including stage name in notification
* Simplifying capture setup and tagging file names with arch
* Removed test property from notification message
* Fixing regex to get arch name
* Fixing error in notification and modified regex
[ROCm/composable_kernel commit: e77a7ca2bc]
* generate and visualize build traces for all archs
* generate build traces in all cases
* fix jenkins logic
* fix typo
* use more threads for parsing dependency map
* add script to parse ninja traces and issue warnings
* fix python script syntax and header
* fix python syntax one more time
* fix python syntax
[ROCm/composable_kernel commit: 3dfa794fab]
* Parallelization in dataset generation
* Parallelizable tests for fwd, bwd data, bwd weight with datasets
* .gitignore generated datasets
* Test parallelization script with round-robin GPU scheduling
* Parallelization updates to test generation and running
* Dataset paths relative to executable
* Update output from test generation
* Default to one GPU in test generation
* Add small dataset tests to Jenkins
* Update copyright lines
* Update test_data/generate_test_dataset.sh
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Move trap disable
* Common get path function
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
[ROCm/composable_kernel commit: fe35ba5dac]
* [CK TILE STREAMK] Introduce initial support for tile engine in streamk GEMM.
- This commit lays the groundwork for integrating the tile engine into streamk GEMM.
It focuses on creating benchmark executables for streamk GEMM.
- Additional scripts like test_benchmark.sh and gemm_benchmark.py will be added once
the streamk implementation reaches stability.
* [CK TILE STREAMK] Enable CI to execute tile engine benchmarks for StreamK GEMM
* [CK TILE STREAMK] Refactor: Extract common utility functions.
* [CK TILE STREAMK] Revise tile engine of streamk to align with the updated implementation
* Add pre-commit
* [CK TILE STREAMK] Add 'dp_persistent' and 'reduction_strategy' in output of CK TILE STREAMK
* [CK TILE STREAMK] Fix a bug about value of 'dp_persistent' of CK TILE STREAMK
* [CK TILE STREAMK] Update Jenkinsfile
* [CK TILE Engine] Update StreamK tile engine help message
Remove default value messages as they are automatically printed
* [CK TILE Engine] Update StreamK tile engine
- Remove namespace reboot
* [CK TILE Engine] Update StreamK tile engine
- Fix merge error
[ROCm/composable_kernel commit: 30727c48fc]
* build and run ck_builder tests
* add test_ckb_all to targets
* fix syntax
* fix test path
* Update CMake targets for builder testing in CI (#3290)
Our existing CMake only had build targets. Update CMakeLists.txt to have CTEST targets:
* smoke-builder
* regression-builder
* check-builder
Co-authored-by: John Shumway <jshumway@amd.com>
* use check-builder target
* get rid of test_ckb_all target
* call ninja check-builder separately
---------
Co-authored-by: John Shumway <jshumway@amd.com>
[ROCm/composable_kernel commit: a54f7b1138]
* allow using alternative compiler in all CI stages
* get rid of some redundancies in jenkinsfile
* clean up jenkinsfile a bit more
* further clean up jenkinsfile
* do not force user jenkins in ci dockers
[ROCm/composable_kernel commit: 3e8e6f7e4f]
* Renaming old code
* Adding GEMM code with new Architecture
* Partial Progress : Errors
* Partial Progress : Working code
* Changes to element wise function
* Removing Debugging statements
* Working GEMM Multi D code
* Removing Stale Code
* Address Copilot review comments
* Address Copilot review comments
* Changes to validation file
* Changes to common code snippets
* Creating common folder
* Removing duplicate files
* Pointing to right common file
* Pointing to right common file
* Pointing to right common file
* Changing to VERBOSE
* Changing CMAKE messages to verbose
* Updating Cmake with right layout datatype configs
* Working code for GEMM Multi D
[ROCm/composable_kernel commit: a33d98f8e2]
* Adding GPU not found pattern
Also, failurePatterns does not need to be global. Moved variable to live in the failure notifications function scope.
* Testing new failure type
* Testing failure
* Removing the forced failure test
* Adding an additional failure pattern
[ROCm/composable_kernel commit: 1977e4b96a]