update layernorm (#1570)

mirror of https://github.com/ROCm/composable_kernel.git synced 2026-04-20 14:59:17 +00:00

* port layernorm

* change warp_welford.hpp

* Update warpshuffle

* 1. Add save mean and save std back
2. Move construction of tensor_view and tile_window to operator()

* refine welford max count calculation

* unify layernorm api

* Rename file

* Remove save mean and inv std

* Revert "refine welford max count calculation"

This reverts commit 022365802b.

* Fix order of parameter

* refine welford max count calculation again

* Remove fp32 instances

* Fix bug of padding

* refactor api

* Support bf16

* Extract common function

* Refine arg of operator()

* Add kMThreadPerBlock to template parameter

* clang format

* Refine variable name

* Refine file name

* remove redundant line

* refactor layernorm2d pipeline and add block-per-block utility

* fix name

* rename more

* add more block-per-tile instance

* remove duplicated define

* update instance for 2048, 1024 case

* support up to 2048 now

* opt loading

* add n1536

* Add two pass pipeline

* format

* Fix incorrect type

* parallel compilation

* Use smaller N

* fix 2p pass

* Support Repeat_M in distribution

* Refine naming

* Add reduce example

---------

Co-authored-by: letaoqin <letaoqin@amd.com>
Co-authored-by: aska-0096 <haocwang@amd.com>
Co-authored-by: rocking <ChunYu.Lai@amd.com>
Co-authored-by: carlushuang <carlus.huang@amd.com>

ltqin 2024-10-22 09:26:18 +08:00, committed by GitHub

parent 3f710930f6
commit 0394f8a713

59 changed files with 2917 additions and 1042 deletions

									
1 include/ck_tile/core.hpp

@@ -52,6 +52,7 @@
#include "ck_tile/core/tensor/update_tile.hpp"
#include "ck_tile/core/utility/bit_cast.hpp"
#include "ck_tile/core/utility/functional.hpp"
#include "ck_tile/core/utility/functional_with_tuple.hpp"
#include "ck_tile/core/utility/ignore.hpp"
#include "ck_tile/core/utility/magic_div.hpp"
#include "ck_tile/core/utility/philox_rand.hpp"
