Chao Liu
|
cd167e492a
|
Compile for gfx908 and gfx90a (#130)
* adding compilation for multiple targets
* fix build
* clean
* update Jekinsfile
* update readme
* update Jenkins
* use ck::half_t instead of ushort for bf16
* rename enum classes
* clean
* rename
* clean
|
2022-03-31 12:33:34 -05:00 |
|
Chao Liu
|
5d37d7bff4
|
Reorganize files, Part 1 (#119)
* delete obselete files
* move files
* build
* update cmake
* update cmake
* fix build
* reorg examples
* update cmake for example and test
|
2022-03-08 21:46:36 -06:00 |
|
Chao Liu
|
e823d518cb
|
ckProfiler and device-level XDL GEMM operator (#48)
* add DeviceGemmXdl
* update script
* fix naming issue
* fix comment
* output HostTensorDescriptor
* rename
* padded GEMM for fwd v4r4r4 nhwc
* refactor
* refactor
* refactor
* adding ckProfiler
* adding ckProfiler
* refactor
* fix tuning parameter bug
* add more gemm instances
* add more fp16 GEMM instances
* fix profiler driver
* fix bug in tuning parameter
* add fp32 gemm instances
* small fix
* refactor
* rename
* refactor gemm profiler; adding DeviceConv and conv profiler
* refactor
* fix
* add conv profiler
* refactor
* adding more GEMM and Conv instance
* Create README.md
Add build instruction for ckProfiler
* Create README.md
Add Readme for gemm_xdl example
* Update README.md
Remove build instruction from top most folder
* Update README.md
* clean up
|
2021-11-14 11:28:32 -06:00 |
|