Chao Liu | e2753e68bd | Dynamic tensor descriptor (#24): support dynamic tensor descriptor; use buffer load OOB feature for padding case; add navi support; add int8x4 inference kernel. Co-authored-by: Chao Liu <chao@ixt-rack-81.local.lan>; Co-authored-by: Jing Zhang <jizhan@amd.com>. [ROCm/composable_kernel commit: fcbb978828] | 2021-03-25 13:51:11 -05:00
Chao Liu | 9695b9f548 | added GetLinearDimensionMask [ROCm/composable_kernel commit: e1ae8f18f7] | 2019-09-25 02:52:41 -05:00
Chao Liu | c3a1be3865 | WIP: explicitly separate offset component into compile-time, block-invariant and per-thread components [ROCm/composable_kernel commit: 51884fc214] | 2019-09-21 22:53:03 -05:00
Chao Liu | d89b9c2e08 | amd build [ROCm/composable_kernel commit: 69fea593ec] | 2019-09-15 17:55:46 -05:00
Chao Liu | 17564ecfec | adding merge transform [ROCm/composable_kernel commit: ca42e9101d] | 2019-09-10 01:53:49 -05:00
Chao Liu | 399be319a2 | more utility code [ROCm/composable_kernel commit: 7a7fe16086] | 2019-09-09 00:29:33 -05:00
Chao Liu | b0f3708397 | added tuple [ROCm/composable_kernel commit: 625838def0] | 2019-09-06 18:07:56 -05:00
Chao Liu | 572dd9ae5c | adding dimension transformation [ROCm/composable_kernel commit: bd44e6390d] | 2019-09-02 00:21:00 -05:00