Gemm + bias + relu + add + layernorm (#272)

* Copy "gemm reduce" to "gemm bias add reduce"

* Implement gemm bias add reduction

* Fix compiler error due to merge from develop

* Add tensor operation for gemm + bias + add + reduce

* Add gemm_bais_add_reduce to ckProfiler

* Add c1 functor

* Refine type

* Use reduceAccDataType instead of explicitly float

* Change to use check_err()

* Do relu in float32 instead of bhalf_t. Because bhalf_t is unsigned

* Refactor relu. using type_trait instead of overloading

* Rename DxsReduceAccElementwiseOperation to DxsReduceAccElementwiseOperation

* Fix denominator

* Refine nameing

* Fix denominator  in host

* Remove useless include header

* Use AccDataType

* Fix static_cast order

* Refine type

* [What] Remove tuple type in the base class
[Why] External api depend on base class. if base class has relationship with type, we will need many class for different type

[ROCm/composable_kernel commit: 6eb5549923]
This commit is contained in:
rocking5566
2022-06-17 12:49:20 +08:00
committed by GitHub
parent cd3fbda406
commit 42db20408c
33 changed files with 3327 additions and 174 deletions

View File

@@ -11,6 +11,7 @@ int profile_gemm_bias_2d(int, char*[]);
int profile_gemm_bias_relu(int, char*[]);
int profile_gemm_bias_relu_add(int, char*[]);
int profile_gemm_reduce(int, char*[]);
int profile_gemm_bias_add_reduce(int, char*[]);
int profile_batched_gemm(int, char*[]);
int profile_grouped_gemm(int, char*[]);
int profile_conv_fwd(int, char*[]);
@@ -44,6 +45,10 @@ int main(int argc, char* argv[])
{
return profile_gemm_reduce(argc, argv);
}
else if(strcmp(argv[1], "gemm_bias_add_reduce") == 0)
{
return profile_gemm_bias_add_reduce(argc, argv);
}
else if(strcmp(argv[1], "batched_gemm") == 0)
{
return profile_batched_gemm(argc, argv);