composable_kernel

mirror of https://github.com/ROCm/composable_kernel.git synced 2026-05-14 10:09:41 +00:00

Thrupti Raj Lakshmana Gowda 1d1d6717d2 GEMM Multi D for CK Tile Engine (#2660)

* Readme for GEMM Multi D

* GEMM Multi D partial Progress

* GEMM Multi D partial Progress!

* CK Tile Engine GEMM Multi D : All Python files generated

* Partial Progress

* Partial Progress

* Partial Progress

* Partial Progress : Incorrect Result

* Partial Progress : Debugging

* Partial Progress : Correct Results

* Partial Progress - Incorrect Results

* Partial Progress - Commenting Passthrough bypass logic

* Changing Passthrough to MultiplyMultiply

* Correct Results!

* Fix and debug the pass through feature

* Sample commit

* Correct Results : MultiplyMultiply

* Code Cleanup

* Removing Failed Instances

* Working code before Unary element support

* Custom Elementwise Function support and working implementation for Mul and Add

* Updating README

* Working for Passthrough

* Review Comments : Minor Fixes

* Review Comments : Minor Fixes

* Readme Updated

* Partial Changes after Rebase

* Working Code : Changes after Rebase

* Updating Jenkins file

* Removing default value changed while testing

* Configuration changes in config files

* Tile Handler changes in GEMM Multi D Tile Engine

* Tile Handler changes in GEMM Multi D Example

* Change log for Gemm Multi D in CK Tile Engine

* Configuration changes in config files

---------

Co-authored-by: ThomasNing <thomasning@amd.com>

[ROCm/composable_kernel commit: 3f57ec3d2d]

2025-08-12 16:05:05 -07:00

CMakeLists.txt

fix the mi350 error (#2378)

2025-06-20 12:50:13 -07:00

gemm_multi_d_fp16.cpp

GEMM Multi D for CK Tile Engine (#2660)

2025-08-12 16:05:05 -07:00

gemm_multi_d_fp16.hpp

[CK_TILE] Introduces a new GEMM API that splits the existing basic GEMM class into multiple specialized classes. (#2520)

2025-07-24 20:39:56 +02:00

README.md

[CK_TILE] Multiple-D GEMM example (#2219)

2025-06-13 19:39:11 +02:00

run_gemm_multi_d_fp16_example.inc

[CK_TILE] Multiple-D GEMM example (#2219)

2025-06-13 19:39:11 +02:00

utils.hpp

[CK_TILE] Multiple-D GEMM example (#2219)

2025-06-13 19:39:11 +02:00

README.md

# Multiple D GEMM

This folder contains an example of Multiple D GEMM built on the ck_tile tile-programming implementation.
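To make the operation concrete, here is a minimal host-side sketch of what a Multiple D GEMM computes: a regular GEMM accumulator combined element by element with extra D tensors to produce the output E. Everything in it is an assumption made for illustration and is not taken from the example sources: the function name `reference_gemm_multi_d` is hypothetical, it uses exactly two D tensors, it combines them with a multiply-multiply epilogue (the commit log above mentions MultiplyMultiply), and it uses `float` instead of fp16. The layouts follow the example's defaults (A row-major, B column-major, D and E row-major), tightly packed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference, not part of ck_tile.
// A: M x K row-major, B: K x N column-major, D0/D1/E: M x N row-major.
// Assumed epilogue: E = (A * B) * D0 * D1, applied element by element.
std::vector<float> reference_gemm_multi_d(const std::vector<float>& a,
                                          const std::vector<float>& b,
                                          const std::vector<float>& d0,
                                          const std::vector<float>& d1,
                                          std::size_t M,
                                          std::size_t N,
                                          std::size_t K)
{
    assert(a.size() == M * K && b.size() == K * N);
    assert(d0.size() == M * N && d1.size() == M * N);

    std::vector<float> e(M * N);
    for(std::size_t m = 0; m < M; ++m)
    {
        for(std::size_t n = 0; n < N; ++n)
        {
            float acc = 0.0f;
            for(std::size_t k = 0; k < K; ++k)
            {
                // row-major A: (m, k) -> m * K + k
                // column-major B: (k, n) -> n * K + k
                acc += a[m * K + k] * b[n * K + k];
            }
            // Fuse the D tensors into the GEMM result.
            e[m * N + n] = acc * d0[m * N + n] * d1[m * N + n];
        }
    }
    return e;
}
```

A reference like this is useful mainly for checking small problem sizes by hand; the GPU kernel fuses the same epilogue into the GEMM so the intermediate A * B result never has to be written out separately.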

## build

# in the root of ck_tile
mkdir build && cd build
# you can replace <arch> with the appropriate architecture (for example gfx90a or gfx942),
# or leave it blank
sh ../script/cmake-ck-dev.sh ../ <arch>
# build the example (the basic pipeline method for the GEMM calculation)
make tile_example_gemm_multi_d_fp16 -j

This produces the executable build/bin/tile_example_gemm_multi_d_fp16.

## example

args:
        -m  M dimension (Default: 3840)
        -n  N dimension (Default: 4096)
        -k  K dimension (Default: 4096)
 -a_layout  Tensor A layout (Default: R)
 -b_layout  Tensor B layout (Default: C)
-ds_layout  Tensor D layout (Default: R)
 -e_layout  Tensor E layout (Default: R)
 -stride_a  Tensor A strides (Default: 0)
 -stride_b  Tensor B strides (Default: 0)
 -stride_e  Tensor E strides (Default: 0)
-stride_ds  Tensor D strides (Default: 0)
 -validate  0: no validation, 1: validation on GPU (Default: 1)
   -warmup  Number of warm-up iterations before benchmarking the kernel (Default: 10)
   -repeat  Number of iterations used to benchmark the kernel (Default: 100)
   -kbatch  kbatch for SplitK (Default: 1)
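The layout and stride arguments interact: a layout of R (row-major) or C (column-major) decides which dimension is contiguous in memory, and the stride is the distance in elements between consecutive rows (R) or columns (C). The sketch below illustrates that relationship. The helper names `resolve_stride` and `linear_index` are hypothetical, and the rule that a stride of 0 means "use the tightly packed default" is an assumption, since this README does not state how the default of 0 is interpreted.

```cpp
#include <cassert>
#include <cstddef>

// Mirrors the "R" / "C" values of the layout arguments.
enum class Layout
{
    Row,
    Col
};

// Hypothetical helper. Assumption: stride == 0 selects the packed stride,
// i.e. the number of columns for row-major and the number of rows for
// column-major storage.
std::size_t resolve_stride(std::size_t stride, std::size_t rows, std::size_t cols, Layout layout)
{
    const std::size_t packed = (layout == Layout::Row) ? cols : rows;
    if(stride == 0)
    {
        return packed;
    }
    // An explicit stride must be at least as large as the packed one.
    assert(stride >= packed);
    return stride;
}

// Offset of element (row, col) in a buffer with the given layout and stride.
std::size_t linear_index(std::size_t row, std::size_t col, std::size_t stride, Layout layout)
{
    return (layout == Layout::Row) ? row * stride + col : col * stride + row;
}
```

Under these assumptions, the default problem (M = 3840, N = 4096, K = 4096 with A in layout R and B in layout C) would resolve to a stride of 4096 for both A and B, because A is M x K row-major and B is K x N column-major.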