bf16A_Int8B with fastgelu/bias (#1264)

* changed the copy function to v7r2

* adding multi_abd

* in-progress

* add post-load oob check

* debugging

* adjust instances

* add run_lds

* add elementwise_op

* replace multi_abd_device with v3

* clean up

* clean

* clean

* Added LDSType

* profiling

* adjust oobcheck

* add missing file

* refactor

* clean

* add examples
zjing14
2024-04-26 07:26:30 -05:00
committed by GitHub
parent b4032629e5
commit 0d0150db20
37 changed files with 4752 additions and 970 deletions