composable_kernel

mirror of https://github.com/ROCm/composable_kernel.git synced 2026-03-23 16:47:40 +00:00

msaffari-amd 2d3020e5b0 [CK Tile] batched contraction kernel generalizing (#3126)

* Add help for example

* Refactor the batched contraction reference computation to handle stride-aware calculation, plus some code cleanup

* Add stride-aware reference for batched contraction with independent D tensor layouts

* Add -num_d argument for runtime D tensor count selection in batched contraction

* Add stride vector arguments in example code for testing non-contiguous batched contraction inputs

* Add descriptor-based architecture for batched contraction multi-dimensional stride support

* Add multi-dimensional non-contiguous stride support to batched contraction (num_d = 0 case)

* Add complete multi-dimensional stride support via descriptors

* Enable vectorization in descriptor-based batched contraction. Add pad_tensor_view to local RunGemm

* Clean up batched contraction: remove old UniversalGemmKernel path

* Clean up batched contraction: remove legacy paths and finalize docs

* Optimize batched contraction example: pass dimension sizes not vectors

* Correct the reference calculation: change unsigned int to int

* Fix batched_contraction C++17 build errors for gfx90a CI

2025-12-02 13:30:27 +01:00
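
The stride-aware reference with independent D tensor layouts described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the repository: the `Strides3` struct, the `offset` helper, and the `reference_batched_contraction` signature are hypothetical names, and the sketch assumes a single G, M, N, and K dimension and the form E[g, m, n] = sum over k of A[g, m, k] * B[g, n, k], plus the sum of all D tensors at [g, m, n].

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical layout description (not a CK type): element strides for a
// tensor indexed as [g, row, col]. Strides need not describe packed memory.
struct Strides3
{
    std::ptrdiff_t g;
    std::ptrdiff_t row;
    std::ptrdiff_t col;
};

// Linear element offset of [g, r, c] under the given strides.
inline std::size_t offset(const Strides3& s, int g, int r, int c)
{
    return static_cast<std::size_t>(g * s.g + r * s.row + c * s.col);
}

// Assumed contraction form for this sketch:
//   E[g, m, n] = sum_k A[g, m, k] * B[g, n, k] + sum_d D_d[g, m, n]
// Each D tensor carries its own strides, so its layout is independent of E.
// The number of D tensors is chosen at runtime and may be zero.
inline void reference_batched_contraction(const std::vector<float>& a,
                                          const Strides3& sa,
                                          const std::vector<float>& b,
                                          const Strides3& sb,
                                          const std::vector<std::vector<float>>& ds,
                                          const std::vector<Strides3>& sds,
                                          std::vector<float>& e,
                                          const Strides3& se,
                                          int G, int M, int N, int K)
{
    assert(ds.size() == sds.size());
    for(int g = 0; g < G; ++g)
    {
        for(int m = 0; m < M; ++m)
        {
            for(int n = 0; n < N; ++n)
            {
                float acc = 0.0f;
                for(int k = 0; k < K; ++k)
                {
                    acc += a[offset(sa, g, m, k)] * b[offset(sb, g, n, k)];
                }
                for(std::size_t d = 0; d < ds.size(); ++d)
                {
                    acc += ds[d][offset(sds[d], g, m, n)];
                }
                e[offset(se, g, m, n)] = acc;
            }
        }
    }
}
```

Passing an A buffer whose row stride is larger than K (for example, rows padded to 3 elements while K is 2) exercises the non-contiguous case. Passing empty `ds` and `sds` vectors corresponds to running with no D tensors.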

kernel

[CK Tile] batched contraction kernel generalizing (#3126)

2025-12-02 13:30:27 +01:00

pipeline

chore(copyright): update copyright header for include directory (#3293)

2025-11-26 11:00:05 -07:00

utils

[CK Tile] batched contraction kernel generalizing (#3126)

2025-12-02 13:30:27 +01:00