elementwise op (#238)

* Add elementwise operation kernel and example

* Add comment

* Add template argument for dim. Prepare to support multiple dimensions

* Rename example

* Support 1 dimension

* Add static assert

* Add comment

* Extract pad

* Remove redundant argument

* Support any dimension for elementwise operation

* Remove line

* Make it a multiple of the number of CUs

* Move threads per block to a constructor parameter

* Rename threadPerBlock to blockSize

* Support double

* Rename kernel function

* Remove redundant include header

* Refine type

* Need to the final dimension

* Refine variable name

* Refine type

* Use index_t instead of int in API

Co-authored-by: rocking <chunylai@amd.com>
rocking5566
2022-05-19 12:34:35 +08:00
committed by GitHub
parent 9f71ff48e2
commit aafc3ac27a
10 changed files with 759 additions and 0 deletions


@@ -0,0 +1,25 @@
#pragma once
#include "data_type.hpp"

namespace ck {
namespace tensor_operation {
namespace binary_element_wise {

struct Add
{
    __host__ __device__ constexpr void
    operator()(double& dst, const double& src1, const double& src2) const
    {
        dst = src1 + src2;
    }

    __host__ __device__ constexpr void
    operator()(float& dst, const float& src1, const float& src2) const
    {
        dst = src1 + src2;
    }
};

} // namespace binary_element_wise
} // namespace tensor_operation
} // namespace ck
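
To illustrate how a functor shaped like `Add` above could be applied element by element, here is a minimal host-only sketch. It is not code from this commit. The functor is copied locally so the sketch builds without `data_type.hpp`, the no-op `__host__`/`__device__` macros are an assumption for plain host compilers, and `apply_elementwise` is a hypothetical helper introduced only for illustration. The device kernel added by the commit is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: outside a HIP/CUDA compiler these qualifiers are undefined,
// so define them as no-ops to let the sketch build as plain C++14.
#if !defined(__HIPCC__) && !defined(__CUDACC__)
#define __host__
#define __device__
#endif

// Local copy of the functor from the diff (namespaces omitted for brevity).
struct Add
{
    __host__ __device__ constexpr void
    operator()(double& dst, const double& src1, const double& src2) const
    {
        dst = src1 + src2;
    }

    __host__ __device__ constexpr void
    operator()(float& dst, const float& src1, const float& src2) const
    {
        dst = src1 + src2;
    }
};

// Hypothetical helper (not part of the commit): applies a binary functor
// to each pair of elements on the host and returns the results.
template <typename T, typename BinaryOp>
std::vector<T> apply_elementwise(const std::vector<T>& a, const std::vector<T>& b, BinaryOp op)
{
    assert(a.size() == b.size()); // both inputs must have the same length
    std::vector<T> c(a.size());
    for(std::size_t i = 0; i < a.size(); ++i)
    {
        op(c[i], a[i], b[i]); // the functor writes its result through dst
    }
    return c;
}
```

Calling `apply_elementwise(a, b, Add{})` with two `std::vector<float>` arguments selects the `float` overload, and `std::vector<double>` selects the `double` overload. Writing through a `dst` reference instead of returning a value keeps the call shape identical for every supported type, which is presumably why the functor is written that way.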