Commit Graph

26 Commits

Author  SHA1  Message  Date
Chao Liu  0a386c46a9  use more constexpr for Array  2019-06-06 19:26:08 -05:00
Chao Liu  7a89684f92  refactor  2019-06-06 16:50:35 -05:00
Chao Liu  709f13a6d7  use more constexpr  2019-06-04 20:00:48 -05:00
Chao Liu  917d7a2b1d  use vectorized read and write for threadwise generic tensor copy  2019-06-03 18:58:01 -05:00
Chao Liu  b2439ec9dd  adding implicit gemm v4 (nchw, kcyx)  2019-05-30 17:50:49 -05:00
Chao Liu  8a4b59785b  adding implicit gemm v3  2019-05-22 19:39:56 -05:00
Chao Liu  5e5c27a63b  adding implicit gemm v3  2019-05-16 13:22:40 -05:00
Chao Liu  4957d5a399  refactored  2019-05-02 14:49:20 -05:00
Chao Liu  569ad66e2a  added implicit gemm v1r3 lds_double_buffer NCHW * CYXK = KNHW, reworked static functionals  2019-04-23 17:51:14 -05:00
Chao Liu  19f17df47a  implicit gemm v1r2: adding support for nchw  2019-04-18 11:49:09 -05:00
Chao Liu  5245a0162b  clean up  2019-04-06 16:27:07 -05:00
Chao Liu  f6cb5b846d  debugging  2019-04-06 15:10:40 -05:00
Chao Liu  e2313c9eca  tidy up  2019-04-02 20:30:00 -05:00
Chao Liu  6290e0b080  puting gridwise convolution into its own class  2019-04-02 20:18:01 -05:00
Chao Liu  e43d7bc63c  refactor  2019-04-01 15:17:22 -05:00
Chao Liu  d6d9a8e4ce  Jing's ds_read inline asm  2019-03-28 19:46:29 -05:00
Chao Liu  766b0a9eaf  experimenting  2019-03-24 12:09:57 -05:00
Chao Liu  fdaaaa500c  Merge branch 'direct_fp16'  2019-03-22 16:46:41 -05:00
Chao Liu  8c923db423  hip build  2019-03-22 14:22:58 -05:00
Chao Liu  79d9b1084b  adding fp16 direct that reads pre-vectorized data  2019-03-18 18:16:02 -05:00
Chao Liu  4f0fc72e91  adding fp16 direct that reads pre-vectorized data  2019-03-18 15:03:17 -05:00
Chao Liu  a0584426ff  refactoring ConstantTensorDescriptor  2019-03-17 03:22:41 -05:00
Chao Liu  2c9b8c2432  update hip build  2019-03-12 17:20:11 -05:00
Chao Liu  04c5527d07  refactor  2019-03-04 17:09:20 -06:00
Chao Liu  1cb9885058  add anther verision of batch gemm  2019-02-17 01:50:57 -06:00
Chao Liu  b2888adfbe  change file extension to hip.hpp and hip.cpp  2019-02-15 02:13:21 -06:00