[CK TILE] Support fp8/fp16 with pk_int4_t as data types for tensors A and B (#2805)

- Add support for tensor A/B in both fp16+pk_int4_t and fp8+pk_int4_t formats
- Implement A(bf8) B(i4) support in universal GEMM
- Use new implementation for i4 to fp8 conversion in Block Scale

[ROCm/composable_kernel commit: 82890192dd]
This commit is contained in:
Cong Ma
2025-09-09 17:40:52 -06:00
committed by GitHub
parent 22490acf0b
commit f7ffd111ee
15 changed files with 320 additions and 135 deletions

View File

@@ -125,7 +125,7 @@ CK_TILE_HOST_DEVICE fp32x2_t pk_int4_t_to_fp32x2_t_signed_conversion(const pk_in
float x_h = ((x_u8 & 0xf0) >> 4);
x_l = x_l > 7 ? x_l - 16 : x_l;
x_h = x_l > 7 ? x_l - 16 : x_l;
x_h = x_h > 7 ? x_h - 16 : x_h;
#ifdef CK_TILE_USE_PK4_LAYOUT_SHUFFLE
fp32x2_t res = {x_h, x_l};