* chore(copyright): update copyright header for test directory
* chore(copyright): update copyright header for test directory
* chore(copyright): update copyright header for client_example directory
* chore(copyright): update copyright header for test directory
[ROCm/composable_kernel commit: c8563f2101]
* [CK_TILE][REGRESSION] Correct blockSize in Generic2dBlockShape (797b107dd90ce )
WarpPerBlock_M * WarpPerBlock_N are not equal with ThreadPerBlock_M * ThreadPerBlock_N /warpSize. we should calculate BlockSize from WarpPerBlock_M * WarpPerBlock_N
To compatible with wave32, function GetBlockSize is added to calculate correct size in host side.
* fix blocksize for all kernel related with generic2dblockshap
* remove constexpr for blocks
[ROCm/composable_kernel commit: b7a806f244]
BlockWarps, WarpTile in Generic2dBlockShape are wave size dependent, it causes mangled name mismatch between host and device side.
Solution: Replace them with ThreadPerBlock and move BlockWarps, WarpTile calculation into Generic2dBlockShape
[ROCm/composable_kernel commit: c254f3d7b4]
[CK_TILE] Add new ck tile unit test
* Add new ck tile unit test smoke-gemm-universal
* Add new ck tile unit test smoke-gemm-basic
* Add new ck tile unit test topk_softmax
* Add new ck tile unit test add_rmsnorm2d_rdquant_fwd
[ROCm/composable_kernel commit: f102eedfb3]