embedding fuse layernorm (#405)

* add gridwise/device sparse embedding

* update code

* update code

* remove useless makefile

* code fix

* workable

* work properly

* emb add

* add more instance

* format

* remove useless code

* fix format

* fix clang-tidy

* clean

* fix a compile error

Co-authored-by: Chao Liu <chao.liu2@amd.com>
Co-authored-by: Chao Liu <lc.roy86@gmail.com>
This commit is contained in:
carlushuang
2022-09-09 23:41:15 +08:00
committed by GitHub
parent d6709dc373
commit efd1d25733
7 changed files with 998 additions and 0 deletions

View File

@@ -51,4 +51,5 @@ add_subdirectory(32_batched_gemm_scale_softmax_gemm)
add_subdirectory(33_multiple_reduce)
add_subdirectory(34_batchnorm)
add_subdirectory(35_splitK_gemm)
add_subdirectory(36_sparse_embedding)
add_subdirectory(41_grouped_conv_conv_fwd)