chenjun
26839ac17b
Ck tile/smoothquant out stride (#1742)
...
* add ck_tile/smoothquant out stride parameter
* Remove the default stride value
---------
Co-authored-by: so <a.com>
[ROCm/composable_kernel commit: 4e73177684]
2024-12-13 11:53:52 +08:00
rocking
e116bfef59
support max3 in smoothquant and add+ rmsnorm + rdquant (#1654)
...
* Fix cmake example build
* Support max3 in smoothquant one pass
* support max3 in two pass
* support max3 in add_rmsnorm_rdquant
[ROCm/composable_kernel commit: abae2afc72]
2024-11-27 05:01:15 +08:00
carlushuang
4fad52fea6
[CK_TILE]Moe update index (#1672)
...
* update MOCK_ID for moe-sorting
* add moe-smoothquant
* update a comment
* fix format
* hot fix
* update topk in overflow case
* update comments
* update bf16 cvt
---------
Co-authored-by: valarLip <340077269@qq.com>
[ROCm/composable_kernel commit: 36c7ce4e0e]
2024-11-25 13:12:35 +08:00
rocking
4faf3ab587
[Ck_tile] smoothquant (#1617)
...
* fix compile error
* fix typo of padding
* Add smoothquant op
* Add smoothquant instance library
* refine type
* add test script
* Re-generate smoothquant.hpp
* Always use 'current year' in copyright
* use Generic2dBlockShape instead
* Add vector = 8 instance back
* Find exe path automatically
* Simplify the api condition
* Remove debugging code
* update year
* Add blank line between function declarations
* explicitly cast return value to dim3
* refine return value
* Fix default warmup and repeat value
* Add comment
* refactor smoothquant cmake
* Add README
* Fix typo
---------
Co-authored-by: Po Yen, Chen <PoYen.Chen@amd.com>
[ROCm/composable_kernel commit: fbd654545a]
2024-11-01 13:51:56 +08:00