Chao Liu
|
52c3fe05be
|
Refactor for MIOpen integration (#4)
Refactor so that multi-index transformation and padding support can be brought into MIOpen
|
2019-10-11 11:37:31 -05:00 |
|
Chao Liu
|
184c6e7d37
|
nvidia build
|
2019-09-20 21:45:03 -05:00 |
|
Chao Liu
|
b6e1c52a80
|
use buffer_load buffer_store intrinsic
|
2019-09-19 15:39:07 -05:00 |
|
Chao Liu
|
86cc678f18
|
add global_load and buffer_load inline asm
|
2019-09-18 15:41:55 -05:00 |
|
Chao Liu
|
5b7a18c506
|
experimenting global and buffer load/store
|
2019-09-18 02:05:42 -05:00 |
|
Chao Liu
|
c7a6545ec4
|
experimenting global and buffer load/store
|
2019-09-18 01:37:28 -05:00 |
|
Chao Liu
|
9f46cdf5fa
|
experimenting global and buffer load/store
|
2019-09-18 00:15:57 -05:00 |
|
Chao Liu
|
08cbac98cc
|
added (1x4)x(2x4) threadwise gemm
|
2019-07-30 18:20:55 -05:00 |
|
Chao Liu
|
21f7e9f103
|
refactor
|
2019-06-19 17:43:56 -05:00 |
|
Chao Liu
|
23f633cdc5
|
clean up for miopen
|
2019-06-17 20:14:18 -05:00 |
|
Chao Liu
|
3c0646d490
|
bring back some inline asm
|
2019-06-17 17:28:24 -05:00 |
|
Chao Liu
|
33d1e0e2e5
|
refactoring for miopen
|
2019-06-17 14:58:44 -05:00 |
|
Chao Liu
|
1566b31736
|
reorganized files
|
2019-06-13 15:12:12 -05:00 |
|