Fix grouped_gemm_splitk kernels on MI300. (#694)

* replace amd_buffer_atomic_add with hip_atomic_add

* fix grouped_gemm_splitk kernels on mi300

* fix syntax

* revert experimental atomic_add changes

---------

Co-authored-by: Jing Zhang <jizhan@amd.com>

[ROCm/composable_kernel commit: 4a51d2da9d]
This commit is contained in:
Illia Silin
2023-05-03 08:25:25 -07:00
committed by GitHub
parent 87b21d95fa
commit ffac25fffa
2 changed files with 3 additions and 2 deletions

View File

@@ -147,7 +147,7 @@ bool run_grouped_gemm(const ProblemSize& problem_size, const ExecutionConfig& co
#else
a_tensors_device[i]->ToDevice(a_tensors[i].mData.data());
b_tensors_device[i]->ToDevice(b_tensors[i].mData.data());
c_tensors_device[i]->SetZero();
c_tensors_device[i]->SetZero();
#endif
p_a.push_back(a_tensors_device[i]->GetDeviceBuffer());