Commit Graph

6 Commits

Author SHA1 Message Date
Aviral Goel
004784ef98 chore(copyright) update library wide CMakeLists.txt copyright header template (#3313)
* chore(copyright) update library wide CMakeLists.txt files copyright header template

* Fix build

---------

Co-authored-by: Sami Remes <samremes@amd.com>
2025-11-28 13:49:54 -08:00
rocking
28f68a5a99 layernorm & groupnorm bwd gamma beta (#1133)
* Add layernorm bwd gamma beta external api

* Add groupnorm external api

* Add layernorm bwd gamma beta profiler

* Add groupnorm bwd gamma beta ckProfiler

* Add layernorm & groupnorm bwd gamma beta test

* Fix groupnorm bwd gamma beta profiler bug

* Layernorm bwd weight client example

* Groupnorm bwd weight client example

* clang format

* Remove useless header

* Let inv_std be positive

* Rename to num_bytes and move this calculation outside the loop
2024-01-25 19:53:15 +08:00
rocking
a69aa2a11a layernorm and groupnorm backward data (#1083)
* rename folder

* Add type string

* Remove typo

* Add deviceOp to backward x

* Add comment to describe the behavior of backward normalization

* Add kernel function, prepare to implement

* implement generic kernel

* Check vector size

* Add sweep once pipeline for small reduce size

* Fix bug of KRaw_ error

* Fix bug of dx stride

* sanity check for mean and rstd

* backward x for groupnorm

* Add bwd x instance

* add layernorm 2d bwd gamma beta instances

* Change save mean var type from f32 to f16 in f16 mode

* Change the example to f16

* Add groupnorm bwd gamma beta instance

* Add groupnorm bwd x instance

* Fix naming

* Add layernorm bwd x ckprofiler

* Add groupnorm bwd x profiler

* clang format

* Rename bwd x to bwd data

* Fix bug of verification in profiler

* Add test of layernorm and groupnorm bwd data

* Add missing cmake

* Add layernorm2d bwd data

* rename fwd example

* Add groupnorm client example

* Fix typo. replace Invarient with Invariant

* Add checking before running the best instance
2023-12-19 04:23:11 +08:00
Illia Silin
7965d66a81 Split the static library into several files. (#1044)
* spolit the static library into several

* update lib paths and fix client example

* do not use device_mha_operarions for client examples

* use appropriate libs to link to client examples

* remove the gpu/transpose path from the list

* try fixing clinet examples 3,4,9

* add necessary libs for client examples

* fix the layernorm client example

* fix the client examples 23 and 24

* fix typo

* add interface library and refresh clang format
2023-11-28 11:17:37 -08:00
rocking
a3d9a2cd42 Layernorm4d (#1022)
* Rename folder

* Add layernorm 4d fwd example

* Rename original layernorm example

* Add layernorm 4d f16  test

* Add layernorm4d_fwd client example

* Support layernorm4D in ckProfiler

* Rename groupnorm to groupnorm fwd in example

* Rename layernorm and group fwd in test

* Rename normalization to normalization_fwd (instances)

* Add fwd to DeviceNormalization

* Rename external api header

* Rename folder, because we can also add bwd in this folder

* Add fwd in layernorm and groupnorm (profiler

* Fix compile error

---------

Co-authored-by: Po Yen Chen <PoYen.Chen@amd.com>
2023-11-09 08:34:51 +08:00
rocking5566
e1a3fff675 layernorm external api (#379)
* Add layernorm client example

* [What] Add default make install dir to gitignore
[Why] client example need to make install
2022-08-24 18:43:43 -05:00