zjing14
|
b53e9d08ed
|
Batched GEMM for fp16 (#79)
* prepare host for batched_gemm
* init commit of batched kernels
* fixed
* refine transform with freeze
* m/n padding
* fixed a bug; clean
* add small tiles
* clean
* clean code
* clean code
* add nt, tn, tt layout
* add missing file
* use StaticBufferTupleOfVector instead
* add reference_batched_gemm
* fixed a macro
|
2022-02-11 09:36:52 -06:00 |
|