Commit Graph

263 Commits

Author SHA1 Message Date
Thrupti Raj Lakshmana Gowda
8b185e872e Ck tile engine preshuffle (#2919)
* Partial Progress : Preshuffle working code for datatype

* Partial Progress : Preshuffle Cleanup

* Working code for default config with min max step

* Partial Progress : PermuteN implemented in validation

* Partial Progress : PermuteN changes in Preshuffle

* CK Tile Engine Preshuffle Complete

* CK TILE ENGINE : Preshuffle Layout validation

* CK Tile Engine Preshuffle Validation

* Preshuffle Validation check

* CK Tile Engine Preshuffle : Fixing Validation Cases

* Addressing PR review Comments

* Changes in config

* Addressing Review Comments

* Adding additional architecture in Jenkins

* Partial Progress : Selective Datatype and layouts

* Limited datatypes and layouts

* Addressing CI errors

* Datatype updates

* Datatype updates

* Datatype changes to Preshuffle

* Addressing Review Comments

* Addressing Review Comments

* Datatype changes

* Changes to Cmake

* Update on Jenkins

* Formatting with precommit

* Ruff Formatting
2025-10-27 09:15:34 -05:00
andrew clark
775b96ea6a Fixing Run CI Check for Changed Files (#3072)
* Fixing check for changed files

* Testing CI skip behavior

* Testing CI Trigger

This should skip CI

---------

Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>
2025-10-24 07:52:43 -07:00
Illia Silin
d88ea05c84 disable aiter test gemm_a8w8_blockscale (#3049) 2025-10-17 19:52:22 -07:00
Illia Silin
87d0a3ac17 use branch develop to test hipTensor (#3034) 2025-10-15 15:40:34 -07:00
Illia Silin
3348f01e6f re-enable clang-format by default (#3030)
* re-enable clang-format by default

* fix clang format
2025-10-15 07:43:11 -07:00
andrew clark
0a4c45b4d3 CI Skip and Status Checks Fix (#2952)
* Update Jenkinsfile

Adding logic to skip CI checks when a commit contains changes to non-relevant files like docs, .md, licenses, and .github workflow files.

* Update Jenkinsfile

* Update Jenkinsfile

* Update Jenkinsfile

Testing skip env var

* Update Jenkinsfile

Fixing syntax

* Update Jenkinsfile

Simplifying CI check logic

* Update Jenkinsfile

Testing skipping logic on stages.

* Update Jenkinsfile

Removing post block. The status for skipped stages are already reported.

* Testing Docs

Testing modifications to files in the docs folder do not trigger a the build and test stages.

* Testing Multifile Trigger

Removed Jenkinsfile from the skip patterns. Reversed change to docs file. This test should not skip CI checks.

* Clean code

Renamed setup stage to be more descriptive.
Added pipeline env variable for consistency.
Moved performance test results stage conditional up a level so the parent stage appropriate reports the status if it is skipped.

* Fixing syntax error

* Updated CRON Flags

Added the FORCE_CI flag to the CRON instructions. This will ensure CI does not skip the job.

* Updating logging

Making logs more explicit.

* Comment update

Cleaning comments.

* Update Jenkinsfile

Reverting performance reports when condition.

* Parallel Test

Testing stage status with parallel stages

* Update Jenkinsfile

* Update Jenkinsfile

Removing stages for quick testing

* Update Jenkinsfile

* Testing skipped parallel stages

Testing the addition of a coordination stage to always pass and give an update to skipped parent stages with parallel sub-stages.

* Testing parallel stages

Adding coordination stage to test if parent check status is  correctly updated.

* Simplified performance results stage

Removed parent stage as there are no other parallel stages to execute (yet).

* Testing final clean up stage

* Testing check status update

Testing - forcing status to update after a stage skip.

* Testing results stage skip

* Removing test stage

* Testing pipeline

* Testing post status updates

* Process Test Results Post Event Update

The stage will report success when it skips or is successful.

* Testing non-relevant file change

This should skip build and test in CI

* Reverting test

updating regex file patterns to use strings instead of regex literal syntax.

* Fixing file matching regex

* Testing docs modification

* Fixing default env var value

* Correcting env var assignment

* Pipeline test

Updating docs file. Should skip ci.

* Testing Pipeline

Setting default run ci state.

* Adding debugging

* Removing debugging

* Pipeline test

Should skip pipeline

* Pipeline Test

Mixed files to trigger a CI run

* Adding additional status updates

The parent stage sometimes remains in pending even if the child stage completes when skipped. Added an additional status update for the parent stage.

* Fixing variable name

* Moving stage names

Moved the performance stage names to a single location because they are referenced multiple times. This reduces errors with typos in the future.

* Revert "Moving stage names"

This reverts commit 7cf6743e54.

* Update Jenkinsfile

Handle both truly empty arrays and arrays containing only empty strings.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-10-08 15:48:08 -06:00
Illia Silin
35e116f5c0 increase time limit for AITER tests (#2948) 2025-09-29 13:11:42 -07:00
John Afaganis
e8842e3c1f Use git ls-files to select candidate files for clang format
This change ensures that the files being selected for clang format validation are exactly the ones tracked by the git repo we are testing.  This protects against an known issue where the repo being tested contained "stray files" from a previous test.
2025-09-27 15:47:31 -06:00
Illia Silin
ec4d16b991 Enable CI on gfx1100 (#2930)
* run CI on different versions of gfx11

* do not use gfx1151 systems
2025-09-25 16:10:54 -07:00
Illia Silin
8c1a959913 use default docker for build/test on gfx950 (#2928) 2025-09-25 10:40:45 -07:00
Illia Silin
64e61b8647 Add AITER test_mha_varlen (#2927)
* add aiter test_mha_varlen

* don't fail until all aiter test run

* use the original way to run tests, just add new test
2025-09-25 10:00:20 -07:00
Jobbins
929291741d [Jenkins] Remove 'Jenkins - ' prefix (#2920)
The prefix is causing the status updates from
gitStatusWrapper to be unique to the status updates that
are created by the Jenkins server, which creates duplicates
2025-09-25 09:08:29 -06:00
Illia Silin
8fe3838c65 Upgrade to ROCm7.0.1 compiler. (#2909)
* upgrade default docker to rocm7.0.1

* turn on build and test on gfx950 by default

* use rocm-dev instead of rocm

* link libhiprtc for codegen targets

* resolving codegen compilation errors: removed calls to other std functions, resolved issues with int32_t: needed the correct header, put use of e8m0 into header guards

---------

Co-authored-by: Astha Rai <astha.rai713@gmail.com>
2025-09-24 10:00:53 -07:00
pmaybank
592d73ad73 [CK_TILE] Add support for gfx12 in tile_engine for GEMM benchmarking (#2802)
* initial work on adding support of gfx12 in tile_engine for GEMM benchmarking
* add stage("Run TILE_ENGINE_GEMM Tests on gfx1201") to Jenkins config
* make tile_[m/n/k] validation arch dependent
2025-09-17 17:59:01 +01:00
Illia Silin
b9d69d32a8 Enable FMHA and AITER tests on gfx950. (#2812)
* enable aiter and fmha test stages on gfx950

* use newer compiler for gfx950

* make sure gfx950 runs correct docker

* fix typo

* upgrade base docker for aiter

* change base docker for aiter tests

* do not add group render to ck_aiter image

* add group irc in ck_aiter docker

* do not fix the irc group id to 39

* do not set jenkins uid and gid

* skip group irc for aiter tests

* fix syntax error in dockerfile

* change the base docker for aiter tests

* add irc group back to ck_aiter docker
2025-09-12 12:20:32 -07:00
Thrupti Raj Lakshmana Gowda
f6ba94fb5c [CK TILE ENGINE] Adding GEMM Preshuffle to CK Tile Engine (#2712)
* Partial Progress : Completed ListBlob

* Additional changes in Listbob

* Partial Progress : Generate Blobs Completed

* Partial Progress : Added Host side code for Preshuffle

* Working code for Preshuffle before Cleanup

* Partial Progress : Cleanup

* Partial Progress : Datatype Validation

* Partial Progress : Warptiles for preshuffle changed from hardcoding to take from config

* Partial Progress : Cleanup

* Partial Progress : Code Cleanup

* Partial Progress : Passing all valid tiles failing for unsupported tiles

* Partial Progress : Working code, testing pending for edge cases

* Partial Progress for testing

* Completed Code

* kBlockPerCu as tunable parameter from config

* Update tile_engine/ops/gemm_preshuffle/README.md

Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>

* Update tile_engine/ops/gemm_preshuffle/README.md

Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>

* Update tile_engine/ops/gemm_preshuffle/README.md

Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>

* Update tile_engine/ops/gemm_preshuffle/README.md

Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>

* Update tile_engine/ops/gemm_preshuffle/README.md

Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>

* Update tile_engine/ops/gemm_preshuffle/README.md

Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>

* Update tile_engine/ops/gemm_preshuffle/README.md

Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>

* Update tile_engine/ops/gemm_preshuffle/README.md

Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>

* Update tile_engine/ops/gemm_preshuffle/README.md

Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>

* Update tile_engine/ops/gemm_preshuffle/README.md

Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>

* Update tile_engine/ops/gemm_preshuffle/README.md

Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>

* Update tile_engine/ops/gemm_preshuffle/README.md

Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>

* Update tile_engine/ops/gemm_preshuffle/README.md

Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>

* Update tile_engine/ops/gemm_preshuffle/README.md

Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>

* Partial Progress : Working listkernels

* Partial Progress : Cleanup Working listkernels

* Partial Progress : Single instance

* Partial Progress : Working single instance code

* Partial Progress : Working generate individual instance code

* Partial Progress : Working rewamped code for given config file needed validation and edge case testing

* Partial Progress : Working Code, testing pending

* Removing LOGS file

* Working code

* Minor changes to GEMM Preshuffle : Restructured

* Minor Changes in Preshuffle

* Changes to Jenkins File

* Changes to Jenkins file to consider new architecture

* Changes to Jenkins file for fixing CI

---------

Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>
2025-09-12 11:50:19 -07:00
Illia Silin
bca99a499d build and test on gfx942 by default (#2830) 2025-09-11 14:02:21 -07:00
Anton Gorenko
ec006bb8e0 [CK_TILE] Add gtests for FMHA (#2744)
* Improve random number generation

* use different seed for each input (Q, K, V...);
* use deterministic generation of:
  * seqstart_q/k (for group mode);
  * block_table (for paged-kvcahe);
  * cache_batch_idx (for kvcache);

* Extract arg_parser-related code from run functions to use them as tests

* Split examples into main programs and fmha runners, build instances separately

* Add dummy tests that use instances and runners

* Fix a missed corner case of f32->f8 conversion

When value if < min f8 denormal but > min f8 denormal / 2, it must be
rounded to min f8 denormal (i.e. 0b1), not to 0.

* Fix incorrect fp8 scales for P and O in validation code

DataTypeConfig was incorrectly compared with fp8_t.

* Add host generation of dropout random values and use it for validation

Previously host validation (reference_batched_dropout) used random
numbers generated by BlockDropout of the kernel, meaning that incorrect
generation on device (bad distribution, repeated numbers, too many zeros,
etc.) would not trigger any validation errors.

* Implement tests from smoke_test_bwd.sh

* Return result as enum to distinguish failure and missing instance

* Add tests for bwd features: bias, alibi, dropout

* Implement tests from smoke_test_fwd.sh

* Pass seqlen_q/k as vectors to fwd and bwd runners

* Add tests for fwd features: bias, alibi, dropout

* Add tests for pagedkv and splitkv

* Fix conditions when to use splitkv and pagedkv kernels

splitkv was executed only when use_kvcache which == (need_append_kvcache || use_cache_batch_idx || 0 < page_block_size).
In the SplitKV tests: the regular fwd kernel was executed if use_cache_batch_idx was not requested even when num_splitkv > 1.
In the AppendKV tests: the pagedkv kernel was executed but it often failed to find an instance.

* Add tests for appendkv

* Use is_v_rowmajor = true because there are no instances with column layout anymore

* Split public and private compile options for instances

Tests and examples need to know only about CK_TILE_FMHA_FWD_*_API.

* Improve parsing validation in bias and mask

* Pass bias as string for consistency with mask

* Catch parsing and other exceptions

* Add bwd test for deterministic flag

* Initialize fp8 tensors (-init=ufq) similarly to uf

* Fix splitkv/pagedkv invocation: use padded sk when seqlen_k_ptr is not null

seqlen_k cannot be used to determine padding when seqlen_k_ptr is
provided. The actual seqlen_k is taken from seqlen_k_ptr[b].
Even seqlen_k values (% bn0 == 0) use padded seqlen_k while seqlen_k_ptr
may contain arbitrary values.
In the example or tests this produces incorrect results with appendkv
(for example, -d=32 -s=1 -s_k=64 -s_knew=7 -vlayout=c -b=8).

* Fix use_pagedkv value when kvcache = true but page_block_size = 0

In this case block_table_ptr is nullptr which is accessed in the kernel.

* Clean up bwd tests

* Unify fwd tests for f16/bf16 and fp8

* Use better explicit instantiation declaration for fmha_bwd<2>

* Use the same seed for all tests, allow to override it with env variable

* Undo clang-format of one irrelevant file

For some reason my local clang-format-18 and the one in CI work differently.

* Do not build instances and tests on unsupported archs

* Build instance libraries as OBJECT library

* CI: Enable sccache for HIP

There are source files with LANGUAGE HIP, they need
-DCMAKE_HIP_COMPILER_LAUNCHER=sccache

* Add tests to REGRESSION_TESTS

* Fix OOB accesses in deterministic bwd due to incorrectly assumed kN0

The runner assumes kN0 = (hdim_q <= 128) ? 128 : 64 but there are
smaller tiles (for tr_load or fp32). This can create too small dq_acc_buf.

* Pass CK_TILE_FMHA_FWD_*_API as INTERFACE compile options

The instances don't actually depend on them, only examples and tests do.
Passing these definitions as INTERFACE allows to change FMHA_FWD_ENABLE_APIS
without recompiling instances that are already in ccache.

* Fix formatting and names
2025-09-10 08:06:14 +05:00
Vidyasagar Ananthan
5224d2ead3 Fixing path for tile engine tests. (#2794) 2025-09-05 17:34:48 -07:00
Illia Silin
4e4a784d53 set number of cpu threads in CI to min(nproc,64) (#2793) 2025-09-05 17:26:13 -07:00
Vidyasagar Ananthan
60ea94f4fe Fixing tile engine tests after recent refactoring. (#2791)
* Fixing tile engine tests after recent refactoring.

* Fixing line break error.
2025-09-05 14:57:59 -07:00
Illia Silin
ef6c28e989 Fix latest AITER failure and add more AITER tests in CK CI. (#2782)
* add aiter tests and move json_dump header

* remove example/include path from cmake

* extend time for aiter and pytorch stages
2025-09-04 13:44:00 -07:00
rahjain-amd
4d041837ad Add json dump support to output details from CK/CKTile Examples. (#2551)
* Adding RapidJson Library

* Adding Json Dumps in all CK_Tile Examples

Not verified yet

* Adding json to cktile Batched Transpose

* adding json dumps to layernorm2d_fwd

* Adding  json dump to flatmm_basic

* Adding RapidJson Library

* Adding Json Dumps in all CK_Tile Examples

Not verified yet

* Adding json to cktile Batched Transpose

* adding json dumps to layernorm2d_fwd

* Adding  json dump to flatmm_basic

* Adding json in 03_gemm

* Add json dump to 16_batched_gemm

* Add json dump to gemm_multi_d_fp16

* Add json dump to grouped_gemm

* fix fmha_bwd/fwd

* Fix clang-format errors

exclude include/rapidjson in jenkins as its a third-party library

* Saparating function and defination.

* Update Documentation of 03_gemm

* Refactoring as per code review

* Disable fp8 instances on unsupported targets (#2592)

* Restrict building of gemm_universal_preshuffle_f8 instances to specific targets in CMakeLists.txt

* Add condition to skip gemm_xdl_universal_preshuffle_f8 instances for unsupported targets in CMakeLists.txt

* Add conditions to skip unsupported targets for gemm_universal_preshuffle_f8 and gemm_xdl_universal_preshuffle_f8 instances in CMakeLists.txt

* Refine conditions to exclude gemm_universal_preshuffle_f8 instances for unsupported targets in CMakeLists.txt

---------

Co-authored-by: AviralGoelAMD <aviralgoel@amd.com>

* fix clang format

* remove duplicate lines of code from library/src/tensor_operation_instance/gpu/CMakeLists.txt

* Fixing Readme and unifying jsondumps

* adding moe_smoothquant

* adding fused_moe

* Fixing Readme for batched_gemm

* Fixing Readme for grouped_gemm

* adding flatmm

* adding gemm_multi_d_fp16

* adding elementwise

* adding File name when json is dumped

* Fixing Reduce after merge

* adding batched_transpose

* Adding Warptile in Gemm

* Fixing Clang Format

---------

Co-authored-by: Aviral Goel <aviral.goel@amd.com>
Co-authored-by: AviralGoelAMD <aviralgoel@amd.com>
Co-authored-by: illsilin_amdeng <Illia.Silin@amd.com>
2025-09-02 23:31:29 -07:00
Illia Silin
0ac908fb57 Add a daily CI cron job to build pytorch. (#2755)
* add a stage to builf pytorch

* add docker file for pytorch stage

* call build scripts fro mthe default path

* add a daily chron build for pytorcn stage
2025-08-27 16:57:43 -07:00
JH-Leon-KIM-AMD
19d5327c45 Test comprehensive dataset (#2685)
* Add CSV-driven convolution test pipeline

- Add test_grouped_convnd_fwd_dataset_xdl.cpp with CSV reader functionality
- Add complete dataset generation toolchain in test_data/
- Add Jenkins integration with RUN_CONV_COMPREHENSIVE_DATASET parameter
- Ready for comprehensive convolution testing with scalable datasets

* Update convolution test dataset generation pipeline

* add 2d, 3d dataset csv files

* Remove CSV test dataset files from repository

* Update generate_test_dataset.sh

* Fix channel division for MIOpen to CK conversion

* Remove unnecessary test files

* Fix clang-format-18 formatting issues

* TEST: Enable comprehensive dataset tests by default

* Fix test_data path in Jenkins - build runs from build directory

* Add Python dependencies and debug output for CSV generation

* Remove Python package installation - not needed

* Add better debugging for generate_test_dataset.sh execution

* Fix Jenkinsfile syntax error - escape dollar signs

* Add PyTorch to Docker image for convolution test dataset generation

- Install PyTorch CPU version for lightweight model execution
- Fixes Jenkins CI failures where CSV files were empty due to missing PyTorch
- Model generation scripts require PyTorch to extract convolution parameters

* Add debugging to understand Jenkins directory structure and CSV file status

- Print current working directory
- List CSV files in test_data directory
- Show line counts of CSV files
- Will help diagnose why tests fail in Jenkins

* Fix clang-format-18 formatting issues

- Applied clang-format-18 to test file
- Fixed brace placement and whitespace issues

* Add detailed debugging for CSV dataset investigation

- Check generated_datasets directory contents
- List all CSV files with line counts
- Show first 5 lines of main CSV file
- Applied clang-format-18 formatting
- This will help identify why CSV files are empty in Jenkins

* keep testing add pytorch installation in shell script

* Use virtual environment for PyTorch installation

- Jenkins user doesn't have permission to write to /.local
- Create virtual environment in current directory (./pytorch_venv)
- Install PyTorch in virtual environment to avoid permission issues
- Use PYTHON_CMD variable to run all Python scripts with correct interpreter
- Virtual environment will be reused if it already exists

* Remove debug code and reduce verbose logging in Jenkins

- Remove bash -x and debug commands from Jenkinsfile execute_args
- Remove all debug system() calls and getcwd from C++ test file
- Remove unistd.h include that was only needed for getcwd
- Remove debug print in CSV parser
- Add set +x to generate_test_dataset.sh to disable command echo
- Redirect Python script stdout to /dev/null for cleaner output

This makes Jenkins logs much cleaner while still showing progress messages.

* install gpu torch

* Clean up and optimize comprehensive dataset test pipeline

- Reorder Jenkinsfile execution: build -> generate data -> run test
- Remove commented-out debug code from generate_test_dataset.sh
- Ensure all files end with proper newline character (POSIX compliance)
- Keep useful status messages while removing development debug prints
- Set MAX_ITERATIONS=0 for unlimited test generation in production

* Add configuration modes to reduce test execution time

- Add --mode option (half/full) to generate_model_configs.py
  - half mode (default): ~278 configs (224 2D + 54 3D) -> ~1,058 total tests
  - full mode: ~807 configs (672 2D + 135 3D) -> ~3,093 total tests
- Update generate_test_dataset.sh to use CONFIG_MODE environment variable
- Keeps all model types but reduces parameter combinations intelligently
- Fixes Jenkins timeout issue (was running 3,669 tests taking 17+ hours)
- Default half mode should complete in ~4-5 hours instead of 17+ hours

* Add small mode for quick testing of comprehensive dataset

* jenkins pipeline test done

* jenkins test done

* Trigger CI build

* remove test comment and update data generation option as half

---------

Co-authored-by: Bartłomiej Kocot <barkocot@amd.com>
2025-08-26 22:18:05 +02:00
John Shumway
99d27aca17 Add a CMake property for c++ standard (17 or 20) (#2736)
Configure C++ standard with a CMake variable.

Defaults to C++20, but can be set to C++17  to test backwards compatibility.

* Add validation for allowed C++ standards.

* build CK in rehl8 docker with std=c++17

---------

Co-authored-by: illsilin_amdeng <Illia.Silin@amd.com>
2025-08-25 18:56:58 -07:00
Aviral Goel
bb6132116f build!: Update composable kernel version to 1.2.0 for rocm 7.0 release (#2734)
* build!: Update composable kernel version to 1.2.0 for rocm 7.0 release
2025-08-25 13:48:51 -04:00
Illia Silin
6180685688 Resolve issues with performance logs in CI. (#2733)
* update the performance test logic

* fix unstash perf logs logic

* untangle unstashing fmha logs for different archs

* run process stage after running fmha tests

* fix the processing of perf logs

* fix arguments for run_performance scripts
2025-08-25 09:51:29 -07:00
Illia Silin
8b55afcd93 Build ckProfiler package for all architectures. (#2701)
* stash ckprofiler package built for all targets

* build the lib for all instances in newer docker

* make sure packages get posted
2025-08-18 11:16:25 -07:00
Tianyuan Wu
68134b60e4 [CK_TILE] CK_TILE GEMM WMMA Support for GFX11/GFX12 (#2466)
* WMMA GEMM F16 Implementation

Signed-off-by: root <tianyuwu@amd.com>

* Self-review

Signed-off-by: root <tianyuwu@amd.com>

* ASIC check minor tweak

Signed-off-by: root <tianyuwu@amd.com>

* add missing include file

* Set GPU_TARGETS to gfx11/12 generic

Signed-off-by: root <tianyuwu@amd.com>

* INT8 GFX12

Signed-off-by: root <tianyuwu@amd.com>

* add int8x16 branch

* Fix CI script

Signed-off-by: root <tianyuwu@amd.com>

* Fix typo

Signed-off-by: root <tianyuwu@amd.com>

* Add CK_Tile WMMA example

Signed-off-by: Tianyuan Wu <tianyuwu@amd.com>

* Fix CI

Signed-off-by: Tianyuan Wu <tianyuwu@amd.com>

* fix clang format

* Set M/N_Warp Back to Constant

Signed-off-by: Tianyuan Wu <tianyuwu@amd.com>

* Use GemmConfigComputeV3 by default

Signed-off-by: TianyuanWu <Tianyuan.Wu@amd.com>

* Enable CK_TILE_USE_AMD_BUFFER_ATOMIC_ADD_FLOAT for gfx12

Signed-off-by: TianyuanWu <Tianyuan.Wu@amd.com>

* Remove CK_Tile wmma gemm examples from the CI list

Signed-off-by: TianyuanWu <Tianyuan.Wu@amd.com>

* Add atomic add fallback method for gfx11

Signed-off-by: TianyuanWu <Tianyuan.Wu@amd.com>

* Fix typo

Signed-off-by: TianyuanWu <Tianyuan.Wu@amd.com>

* Omit copyright year

Signed-off-by: TianyuanWu <Tianyuan.Wu@amd.com>

* Support non-square cases

Signed-off-by: TianyuanWu <Tianyuan.Wu@amd.com>

* Fix CI

Signed-off-by: TianyuanWu <Tianyuan.Wu@amd.com>

* Add get_device_ip()

Signed-off-by: TianyuanWu <Tianyuan.Wu@amd.com>

* Revert "Add atomic add fallback method for gfx11"

This reverts commit 07a79e797d.

Signed-off-by: Tianyuan Wu <Tianyuan.Wu@amd.com>

* Revert "Enable CK_TILE_USE_AMD_BUFFER_ATOMIC_ADD_FLOAT for gfx12"

This reverts commit ceee918007.

* Revise method name and typos

Signed-off-by: Tianyuan Wu <Tianyuan.Wu@amd.com>

* clang-format

Signed-off-by: TianyuanWu <Tianyuan.Wu@amd.com>

* Try fix CI

Signed-off-by: TianyuanWu <Tianyuan.Wu@amd.com>

* Revert "Try fix CI"

This reverts commit 7a7241085e.

* clang-format

Signed-off-by: TianyuanWu <Tianyuan.Wu@amd.com>

* Fix typo caused by merge

Signed-off-by: Tianyuan Wu <Tianyuan.Wu@amd.com>

* Fix typo caused by merging

Signed-off-by: Tianyuan Wu <Tianyuan.Wu@amd.com>

---------

Signed-off-by: root <tianyuwu@amd.com>
Signed-off-by: Tianyuan Wu <tianyuwu@amd.com>
Signed-off-by: TianyuanWu <Tianyuan.Wu@amd.com>
Signed-off-by: Tianyuan Wu <Tianyuan.Wu@amd.com>
Co-authored-by: joye <joye@amd.com>
Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>
Co-authored-by: illsilin_amdeng <Illia.Silin@amd.com>
2025-08-15 16:22:27 -07:00
Thrupti Raj Lakshmana Gowda
1c2078066b Variable name correction in Jenkins file (#2686) 2025-08-14 13:35:55 -07:00
JH-Leon-KIM-AMD
b963478759 CSV-driven convolution test pipeline (#2581)
* Add CSV-driven convolution test pipeline

- Add test_grouped_convnd_fwd_dataset_xdl.cpp with CSV reader functionality
- Add complete dataset generation toolchain in test_data/
- Add Jenkins integration with RUN_CONV_COMPREHENSIVE_DATASET parameter
- Ready for comprehensive convolution testing with scalable datasets

* Update convolution test dataset generation pipeline

* add 2d, 3d dataset csv files

* Remove CSV test dataset files from repository

* Update generate_test_dataset.sh

* Fix channel division for MIOpen to CK conversion

* Remove unnecessary test files

* Fix clang-format-18 formatting issues

---------

Co-authored-by: Bartłomiej Kocot <barkocot@amd.com>
2025-08-13 16:24:34 +02:00
Thrupti Raj Lakshmana Gowda
3f57ec3d2d GEMM Multi D for CK Tile Engine (#2660)
* Readme for GEMM Multi D

* GEMM Multi D partial Progress

* GEMM Multi D partial Progress!

* CK Tile Engine GEMM Multi D : All Python files generated

* Partial Progress

* Partial Progress

* Partial Progress

* Partial Progress : Incorrect Result

* Partial Progress : Debugging

* Partial Progress : Correct Results

* Partial Progress - Incorrect Results

* Partial Progress - Commenting Passthrough bypass logic

* Changing Passthrough to MultiplyMultiply

* Correct Results!

* Fix and debug the pass through feature

* Sample commit

* Correct Results : MultiplyMultiply

* Code Cleanup

* Removing Failed Instances

* Working code before Unary element support

* Custom Elementwise Function support and working implementation for Mul and Add

* Updating README

* Working for Passthrough

* Review Comments : Minor Fixes

* Review Comments : Minor Fixes

* Readme Updated

* Partial Changes after Rebase

* Working Code : Changes after Rebase

* Updating Jenkins file

* Removing default value changed while testing

* Configuration changes in config files

* Tile Handler changes in GEMM Multi D Tile Engine

* Tile Handler changes in GEMM Multi D Example

* Change log for Gemm Multi D in CK Tile Engine

* Configuration changes in config files

---------

Co-authored-by: ThomasNing <thomasning@amd.com>
2025-08-12 16:05:05 -07:00
Illia Silin
bbf41b27f2 fix builds with mainline/staging compilers (#2674) 2025-08-12 10:23:08 -07:00
Illia Silin
6bfef63414 enable aiter test_mha in daily CI (#2659) 2025-08-11 09:50:33 -07:00
Illia Silin
8613aa1e40 remove ck_tile transpose and gemm stages from CI (#2646) 2025-08-08 10:48:44 -07:00
Illia Silin
7ac850ac72 Add daily AITER tests on gfx942. (#2639)
* add option to select aiter branch, add tests on gfx942
2025-08-08 09:30:46 -07:00
Illia Silin
833ae1d051 Revert "Reduce build time tile engine (#2579)" (#2623)
This reverts commit e5b79b26fa.
2025-08-05 09:27:55 -07:00
Thomas Ning
e5b79b26fa Reduce build time tile engine (#2579)
* Modify CMakeLists to allow for splitting.

* Modify CMakeLists for data and layout logic.

* Run tests and get build artifact.

* Test new Cmakelists for speedup.

* Further improvements for speedup.

* turn off the FMHA

* turn off the automatic tile engine gemm

* minor fix

* disable the transpose test first

* Address the comment

* Jenkinsfile

* change the make thread to 64

* change the compile thread to 32

* Try to use with less OS memory space

* Have the Unity build batch size to 2

* reduce the chunk size

---------

Co-authored-by: Vidyasagar Ananthan <vidyasagar.ananthan@amd.com>
2025-08-01 14:42:33 -07:00
Illia Silin
e6104daecc Add a daily CI stage to test AITER with latest CK. (#2598)
* add a CI stage for AITER testing
2025-08-01 07:55:51 -07:00
Bartłomiej Kocot
5b244105d9 Enable multiple D for grouped conv fwd large tensors (#2572) 2025-07-28 22:39:07 +02:00
Illia Silin
504b101da3 upgrade from clang-format-12 to clang-format-18 (#2568)
* upgrade to clang-format-18

* update to clang-format-18 in pre-commit-config
2025-07-28 11:34:07 -07:00
Illia Silin
9786087010 use ninja to build packages (#2575) 2025-07-28 11:04:12 -07:00
Illia Silin
ead17e6265 disable building CI for gfx942 by default (#2529) 2025-07-18 12:25:24 -07:00
Thrupti Raj Lakshmana Gowda
0f3083ab5c [CKTILE] Layout Support for CK Tile engine (#2482)
* Updating runtime log message for CK TILE ENGINE

* CKTile layout from config

* CKTile custom config for CI

* Documentation for Layout Changes

* CKTile Layout changes  to Jenkins

* Fixing Clang Format

* Changes to Jenkins file to fix error

* fix(cmake-ck-dev): no longer sets invalid values as gpu arch

* style(py files): ruff formatting

* fix(cmake-ck-release): no longer sets invalid values as gpu arch

* chore(cmake-tile_engine): add reminder to uncomment user config json

* Changes to jenkin file to address more cases

* Changes to Jenkins to fix Error

* Changes to Jenkins file for fixing an error

* Update Jenkinsfile (#2517)

* Update Jenkinsfile

---------

Co-authored-by: ThruptiRajLakshmanaGowda <tlakshma@amd.com>
Co-authored-by: AviralGoelAMD <aviral.goel@amd.com>
Co-authored-by: Thomas Ning <Thomas.Ning@amd.com>
2025-07-17 12:19:41 -07:00
Illia Silin
f5d1e3fa48 Use a clang20 compiler for gfx950 builds. (#2504)
* update docker tag for gfx950 ci build

* update compiler path for gfx950 ci build

* suppress compiler path override for gfx950

* clean up
2025-07-16 07:37:53 -07:00
Vidyasagar Ananthan
e391b025a0 New ninja tracing script (#2472)
* Adding ninja log json convertion utility

* Updating to match old ninjatracing

* Updating Jenkins to use new ninjatracing

* Ensuring v7 works

* Removing old ninjatracing from dockerfile
2025-07-08 22:36:50 -07:00
Vidyasagar Ananthan
33d704a6f9 Separating ninja build tracing and setting flag to false (#2470)
* Separating ninja build tracing and setting flag to false

* Add ftime-tracing flag

* Fix conditional issue

* Try adding a script block

* Embed Clang analysis in ftime trace block
2025-07-08 10:52:00 -07:00
Vidyasagar Ananthan
d2536b91bc Remove ftime tracing to avoid printing json files (#2452)
* Remove ftime tracing to avoid printing json files

* Factoring out build commands
2025-07-03 07:54:12 -07:00
Vidyasagar Ananthan
2fa9270a25 Fix an earlier static check error due to assignment of variable in Jenkinsfile (#2420)
* Testing assignment of param fix

* Removing redundant changes

* Adding back unit test runs

* Ensuring Jenkins changes work on develop - to be reverted

* Revert "Ensuring Jenkins changes work on develop - to be reverted"

This reverts commit cf1cab4a43.
2025-06-28 07:07:14 -07:00