[ck] correct memory size in grouped_gemm_multi_abd_xdl_fixed_nk_bias_bf16_i8 (#3168)

b1 and b0 use the same layout, so b1_tensors_device should be allocated with the same element count as b0_tensors_device (Ns[i] * Ks[i]).

[ROCm/composable_kernel commit: e593a14ae1]
linqunAMD
2025-11-11 02:58:08 +08:00
committed by GitHub
parent 5f9d5566e5
commit 93b4c77e06


@@ -221,8 +221,8 @@ bool run_grouped_gemm(const ProblemSize& problem_size, const ExecutionConfig& co
         b0_tensors_device.emplace_back(std::make_unique<DeviceMem>(
             sizeof(B0DataType) * problem_size.Ns[i] * problem_size.Ks[i]));
-        b1_tensors_device.emplace_back(
-            std::make_unique<DeviceMem>(sizeof(B1DataType) * problem_size.Ns[i]));
+        b1_tensors_device.emplace_back(std::make_unique<DeviceMem>(
+            sizeof(B1DataType) * problem_size.Ns[i] * problem_size.Ks[i]));
         d0_tensors_device.emplace_back(
             std::make_unique<DeviceMem>(sizeof(D0DataType) * problem_size.Ns[i]));
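
The size arithmetic behind this fix can be sketched in isolation. The snippet below is only an illustration, not code from composable_kernel: `dense_matrix_bytes`, `vector_bytes`, and the `B0Type`/`B1Type` aliases are hypothetical names, and the element types are stand-ins chosen for their widths. It shows that a tensor sharing b0's N x K layout needs `sizeof(T) * N * K` bytes, whereas `sizeof(T) * N` only covers a length-N vector such as d0.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of composable_kernel): bytes needed for a
// dense N x K matrix of element type T. The result is the same for row-major
// and column-major storage, since both hold N * K elements.
template <typename T>
constexpr std::size_t dense_matrix_bytes(std::size_t n, std::size_t k)
{
    return sizeof(T) * n * k;
}

// Hypothetical helper: bytes needed for a length-N vector of element type T,
// which is the size the d0 buffer keeps in the diff above.
template <typename T>
constexpr std::size_t vector_bytes(std::size_t n)
{
    return sizeof(T) * n;
}

// Assumed stand-in element types, chosen only for illustration:
// a 1-byte integer and a 2-byte type with the same width as bf16.
using B0Type = std::int8_t;
using B1Type = std::uint16_t;

// For N = 4, K = 8 a 2-byte element type needs 64 bytes as an N x K matrix,
// but the old formula (sizeof * N) would have reserved only 8 bytes.
static_assert(dense_matrix_bytes<B1Type>(4, 8) == 64, "N x K matrix size");
static_assert(vector_bytes<B1Type>(4) == 8, "length-N vector size");
static_assert(dense_matrix_bytes<B0Type>(4, 8) == 32, "1-byte N x K matrix size");
```

With the old expression the b1 buffer was a factor of K smaller than its layout implies, so any copy or kernel access covering the full N x K extent would run past the end of the allocation.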