Chao Liu | 4f0fc72e91 | adding fp16 direct that reads pre-vectorized data | 2019-03-18 15:03:17 -05:00
Chao Liu | 03eef73c5b | refactoring block copy | 2019-03-17 15:36:38 -05:00
Chao Liu | fd8de38417 | refactor | 2019-03-16 10:50:46 -05:00
Chao Liu | ce0182ce05 | Merge branch 'master' into implicit_gemm_fp16 | 2019-03-09 13:46:47 -06:00
Chao Liu | 7a97087713 | refactor | 2019-03-09 12:59:47 -06:00
Chao Liu | 8edbc659b8 | refactor | 2019-03-06 12:34:31 -06:00
Chao Liu | 04c5527d07 | refactor | 2019-03-04 17:09:20 -06:00
Chao Liu | 5fd40ad768 | clean up | 2019-03-02 17:27:37 -06:00
Chao Liu | 4543d17a71 | refactor | 2019-02-19 22:07:15 -06:00
Chao Liu | b2b622e8b2 | refactor | 2019-02-19 20:34:21 -06:00
Chao Liu | a65ef90308 | device_implicit_gemm_convolution_1_chwn_csrk_khwn: use tensor copy (instead of pointwise) for writing output, 3x3 increased from 78% to 84%, 5x5 from 80% to 84% | 2019-02-19 11:47:46 -06:00
Chao Liu | 1cb9885058 | add another version of batch gemm | 2019-02-17 01:50:57 -06:00
Chao Liu | 9f2e8f8bb4 | 2-type implicit gemm using chwn | 2019-02-15 22:51:51 -06:00
Chao Liu | d7c84daf66 | delete useless code | 2019-02-15 22:24:18 -06:00
Chao Liu | b2888adfbe | change file extension to hip.hpp and hip.cpp | 2019-02-15 02:13:21 -06:00
Chao Liu | a414e3fdf8 | update build | 2019-02-15 02:06:34 -06:00