zjing14
7589116121
[Hotfix] SplitK Gemm fp32 ( #401 )
...
* add scripts
* fixed splitK_gemm_fp32
* clean
* clean
* use gemm_xdl_splitK_c_shuffle into profiler
* remove device_gemm_xdl_splitk.hpp
2022-09-02 11:16:09 -05:00
Chao Liu
500fa99512
Clean up conv example, Instances, profiler and test ( #324 )
...
* convnd_fwd fp16 example
* update example
* update example
* update instance
* updating refernce conv
* update reference conv
* update conv fwd profiler
* update conv 1d and 3d instance
* update include path
* clean
* update profiler for conv bwd data and weight
* update conv bwd weight
* clean
* update conv example
* update profiler for conv bwd weight
* update ckprofiler for conv bwd data
* fix reference conv bwd data bug; update conv bwd data test
* update examples
* fix initialization issue
* update test for conv fwd
* clean
* clean
* remove test case too sensitive to error threshhold
* fix test
* clean
* fix build
* adding conv multiple d
* adding conv multiple D
* add matrix padder
* add gemm padding to convnd
* adding group conv
* update gemm multi-d
* refactor
* refactor
* refactor
* clean
* clean
* refactor
* refactor
* reorg
* add ds
* add bias
* clean
* add G
* adding group
* adding group
* adding group
* update Tensor
* clean
* update example
* update DeviceGemmMultipleD_Xdl_CShuffle
* update conv bwd-data and bwd-weight
* upate contraction example
* update gemm and batch gemm with e permute
* fix example build
* instance for grouped conv1d
* update example
* adding group conv instance
* update gemm bilinear instance
* update gemm+add+add+fastgelu instance
* update profiler
* update profiler
* update test
* update test and client example
* clean
* add grouped conv into profiler
* update profiler
* clean
* add test grouped conv, update all conv test to gtest
* update test
2022-07-29 18:19:25 -05:00
Chao Liu
0dcb3496cf
Improve external interface for GEMM and GEMM+add+add+fastgelu ( #311 )
...
* interface for GEMM and GEMM+add+add+fastgelu
* rename namespace
* instance factory
* fix build
* fix build; add GEMM client example
* clean
2022-06-30 22:11:00 -05:00
Chao Liu
aebd211c36
External Interface ( #304 )
...
* add client example
* clean
* clean
* reorg
* clean up profiler
* reorg
* clea
* fix profiler
* function for getinstances
* update client example
* update client example
* update client example
* update
* update example
* update Jenkins file
* update cmake
* update Jenkins
2022-06-26 19:39:02 -05:00
Chao Liu
d3051d7517
add license in file ( #303 )
2022-06-24 23:32:43 -05:00
Chao Liu
d1db6a0c3e
Absolute include path ( #281 )
...
* ad gelu and fast_gelu
* added GeLU and fast GeLU
* clean up
* add gemm+fastgelu example
* add gemm+gelu instances
* update profiler
* clean up
* clean up
* adding gemm+bias+activation
* clean
* adding bias
* clean
* adding gemm multiple d
* debugging
* add gemm bias add fastgelu
* rename, clean
* refactoring; add readme
* refactor
* refactor
* refactor
* refactor
* refactor
* refactor
* fix
* fix
* update example
* update example
* rename
* update example
* add ckProfiler
* clean
* clean
* clean
* clean
* add client app example
* update readme
* delete obselete files
* remove old client app
* delete old file
* cleaning
* clean
* remove half
* fix header path
* fix header path
* fix header path
* fix header path
* fix header path
* fix header path for all examples
* fix header path
* fix header path
* fix header path
* fix header path
* fix header path
* fix header path
* fix header path
* fix header path
* fix header path
* revert client app example
* clean build
* fix build
* temporary disable client test on Jenkins
* clean
* clean
* clean
2022-06-24 20:51:04 -05:00
JD
cec69bc3bc
Add host API ( #220 )
...
* Add host API
* manually rebase on develop
* clean
* manually rebase on develop
* exclude tests from all target
* address review comments
* update client app name
* fix missing lib name
* clang-format update
* refactor
* refactor
* refactor
* refactor
* refactor
* fix test issue
* refactor
* refactor
* refactor
* upate cmake and readme
Co-authored-by: Chao Liu <chao.liu2@amd.com >
2022-05-12 09:21:01 -05:00
myamlak
f03a1738d9
Resolution of issue #153 : Add compiler warning on comparing int and size_t ( #212 )
...
* Turning compare warnings on
* Cleaning part I
* Cleaning part II
* Explicit static_cast to ck::type_convert
* Resolving large tensor size issue.
* format
* revert change to tensor descriptor; promote lementSpaceSize to 64bit
* use integer value for GEMM test
* Review remarks
* Review remarks + issues with (un)signed arithmetic
* Format fix
* Format
* Clang-format.
* fix 2gb limit issue
Co-authored-by: Chao Liu <chao.liu2@amd.com >
Co-authored-by: Adam Osewski <aosewski@amd.com >
2022-05-09 15:06:49 -05:00
Anthony Chang
f015c77687
use single threaded tensor generator ( #161 )
2022-03-30 22:28:30 -05:00
Chao Liu
f95267f166
Gemm+Reduce Fusion ( #128 )
...
* add gridwise gemm v4r1
* rename
* adding gemm+reduce
* adding gemm+reduce
* adding gemm+reduce
* adding gemm+reduce
* use sfc in shuffling
* remove hardcode
* remove hardcode
* refactor
* fix build
* adding gemm+reduce
* adding gemm+reduce
* adding gemm+reduce
* adding gemm+reduce
* adding gemm+reduce
* format
* clean
* adding gemm+reduce
* adding profiler for gemm+reduce
* adding gemm+reduce profiler
* fix build
* clean up
* gemm+reduce
* fix build
* update DeviceGemm_Xdl_CShuffle; update enum to enum class
* clean up
* add test for gemm+reduce
* clean up
* refactor
* fix build
* fix build
2022-03-23 22:18:42 -05:00
Chao Liu
5d37d7bff4
Reorganize files, Part 1 ( #119 )
...
* delete obselete files
* move files
* build
* update cmake
* update cmake
* fix build
* reorg examples
* update cmake for example and test
2022-03-08 21:46:36 -06:00