Chao Liu
|
1feebc85ac
|
fix ReLU formula (#61)
* fix relu
* clean up
* clean up
[ROCm/composable_kernel commit: fd3d907a80]
|
2021-12-04 16:05:29 -06:00 |
|
Chao Liu
|
4a141b5a04
|
GEMM/Conv+BiasAdd+ReLU+Add (#55)
* gemm+activation
* move C pointwise operation into threadwise copy
* add pointwise operation to A/B matrix
* update ckProfiler
* adding bias add
* adding bias add
* adding bias add
* added bias add; worked around compiler issues
* clean up
* clean up
* Update README.md
* Update README.md
* Update README.md
* clean up
* add conv_xdl example
* adding conv_xdl_bias_relu_add example
* add conv+bias+relu+add, but has register spill issue
* tweak
* tweak
* refactor
* Update README.md
update readme for example/2_gemm_xdl_bias_relu_add
* clean up
* Update README.md
update readme for example/3_conv_xdl
* Update README.md
[ROCm/composable_kernel commit: 41cdd3801a]
|
2021-12-02 20:07:37 -06:00 |
|
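For readers unfamiliar with the fused epilogue named in commit 4a141b5a04 (GEMM/Conv+BiasAdd+ReLU+Add), here is a minimal host-side sketch of one plausible reading of it: add a bias to each GEMM output element, apply ReLU, then add a residual tensor. The order of operations, the per-column bias layout, and the names `BiasReluAddSketch` and `apply_epilogue` are assumptions made for illustration; they are not taken from the library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical pointwise functor: y = relu(c + bias) + residual.
// The operation order is an assumption based on the commit title.
struct BiasReluAddSketch
{
    float operator()(float c, float bias, float residual) const
    {
        const float t = c + bias;
        return (t > 0.0f ? t : 0.0f) + residual;
    }
};

// Apply a pointwise op to a row-major M x N result matrix `c`.
// Assumes one bias value per output column (length n) and a residual
// tensor with the same shape as `c`.
template <typename Op>
std::vector<float> apply_epilogue(const std::vector<float>& c,
                                  const std::vector<float>& bias,
                                  const std::vector<float>& residual,
                                  std::size_t n,
                                  Op op)
{
    assert(n > 0 && bias.size() == n && residual.size() == c.size());
    std::vector<float> out(c.size());
    for(std::size_t i = 0; i < c.size(); ++i)
    {
        out[i] = op(c[i], bias[i % n], residual[i]);
    }
    return out;
}
```

In a fused kernel the same functor would be applied while the accumulator is written out, which is roughly what "move C pointwise operation into threadwise copy" suggests; the loop above only shows the arithmetic.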
Chao Liu
|
9bf8189530
|
Use __builtin_memcpy to implement bit_cast and for accessing vector from pointer of scalars (#53)
* reworking vector_type
* use __builtin_memcpy for bit_cast and vector access of scalar pointer
* clean up
[ROCm/composable_kernel commit: 64350affc5]
|
2021-11-18 09:11:15 -06:00 |
|
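The technique named in commit 9bf8189530 can be illustrated with a short sketch: copying bytes with `__builtin_memcpy` (a GCC/Clang builtin) reinterprets a value or reads several scalars as one vector without casting pointers between unrelated types. The names `bit_cast_sketch`, `float4_sketch`, and `load_float4` are hypothetical and are not the library's actual identifiers.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical bit_cast: reinterpret the bytes of `src` as type To.
template <typename To, typename From>
To bit_cast_sketch(const From& src)
{
    static_assert(sizeof(To) == sizeof(From), "source and destination sizes must match");
    static_assert(std::is_trivially_copyable<To>::value &&
                      std::is_trivially_copyable<From>::value,
                  "both types must be trivially copyable");
    To dst;
    __builtin_memcpy(&dst, &src, sizeof(To));
    return dst;
}

// A plain aggregate standing in for a 4-wide vector type.
struct float4_sketch
{
    float data[4];
};

// Read four consecutive scalars as one vector value by copying bytes,
// rather than dereferencing a reinterpret_cast'ed pointer.
inline float4_sketch load_float4(const float* p)
{
    assert(p != nullptr);
    float4_sketch v;
    __builtin_memcpy(&v, p, sizeof(v));
    return v;
}
```

Compilers typically lower a fixed-size `memcpy` like this to a plain register move or a single vector load, so the copy form usually costs nothing at run time while avoiding strict-aliasing problems.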
Chao Liu
|
2f5ccb68f5
|
ckProfiler and device-level XDL GEMM operator (#48)
* add DeviceGemmXdl
* update script
* fix naming issue
* fix comment
* output HostTensorDescriptor
* rename
* padded GEMM for fwd v4r4r4 nhwc
* refactor
* refactor
* refactor
* adding ckProfiler
* adding ckProfiler
* refactor
* fix tuning parameter bug
* add more gemm instances
* add more fp16 GEMM instances
* fix profiler driver
* fix bug in tuning parameter
* add fp32 gemm instances
* small fix
* refactor
* rename
* refactor gemm profiler; adding DeviceConv and conv profiler
* refactor
* fix
* add conv profiler
* refactor
* adding more GEMM and Conv instances
* Create README.md
Add build instruction for ckProfiler
* Create README.md
Add Readme for gemm_xdl example
* Update README.md
Remove build instruction from topmost folder
* Update README.md
* clean up
[ROCm/composable_kernel commit: e823d518cb]
|
2021-11-14 11:28:32 -06:00 |
|