[rocm-libraries] ROCm/rocm-libraries#4451 (commit 091bf0f)

[CK_TILE] Blockscale Gemm Fix Multi-Arch Compilation

## Motivation
This PR updates CK_TILE blockscale GEMM-quant kernels and launch helpers
to compile across multiple GPU architectures by introducing compile-time
availability gating and a new attribute tag mechanism for kernel
symbol/attribute specialization.

## Technical Details
- Add an architecture-guarded `kIsAvailable` flag to the gfx950 pipeline
and propagate availability handling into `QuantGemmKernel`.
- Extend `make_kernel`/`kentry` to accept an `Attr` tag enabling
per-kernel compile-time attributes (e.g., `no-packed-fp32-ops`) and
unique symbols.
- Update the blockscale GEMM quant example to pass kernel attributes and
adjust gfx950 gating.

## Test Plan
- CI
- Local test: `cmake .. --preset dev -DGPU_TARGETS='gfx942;gfx950'
-GNinja && ninja tile_example_gemm_quant`
- Local test with ROCm/aiter#1954
## Test Result

<!-- Briefly summarize test outcomes. -->

## Submission Checklist

- [x] Look over the contributing guidelines at
https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
This commit is contained in:
Yi DING
2026-02-10 12:42:19 +00:00
committed by assistant-librarian[bot]
parent 6a6cd05dbb
commit d5acfd8d52
6 changed files with 90 additions and 57 deletions

View File

@@ -26,6 +26,11 @@ struct ABQuantGemmPipelineAgBgCrAsync : public BaseGemmPipelineAgBgCrCompV3<Prob
{
using Base = BaseGemmPipelineAgBgCrCompV3<Problem>;
using PipelineImplBase = GemmABQuantPipelineAgBgCrImplBase<Problem, Policy>;
#if defined(__gfx950__)
static constexpr bool kIsAvailable = true;
#else
static constexpr bool kIsAvailable = false;
#endif
using ADataType = remove_cvref_t<typename Problem::ADataType>;
using AQDataType = remove_cvref_t<typename Problem::AQDataType>;