Commit Graph

16 Commits

Author SHA1 Message Date
so
8af425fc0a fix v0 and wave id calc 2025-01-11 09:56:48 +00:00
shengnxu
339a674b59 current status: single WG, memory out of bound 2025-01-11 11:10:17 +08:00
shengnxu
5d00b37e6b fix loop cnt and half d buffer size 2025-01-07 07:42:05 +00:00
shengnxu
2a66e08059 fix some issue, next step, res recalc, 2025-01-06 14:54:07 +00:00
shengnxu
7cc808f28d fix some codes 2025-01-05 16:08:50 +00:00
shengnxu
6f7d127267 changed all the scale outside except for uq 2025-01-05 10:31:17 +00:00
shengnxu
9a46c0e78e move a scale out inline 2025-01-04 10:20:01 +00:00
shengnxu
d0c80b12da fix more issues, current status, inline asm using more register than available 2025-01-03 09:17:08 +00:00
shengnxu
a759277dc1 fix some error 2025-01-02 19:14:01 +08:00
shengnxu
f549173bc0 simple gemm2 for gemm1 debugging 2025-01-01 22:41:19 +08:00
shengnxu
811b75d3a0 staging code for backup 2024-12-31 18:34:02 +08:00
shengnxu
1c9a5ff42b formatter 2024-12-12 10:52:47 +00:00
root
e2be4b9e91 added moe interleaving pipeline 2024-12-03 03:46:22 +00:00
carlushuang
440e28b08f [CK_TILE] fused-moe first version (#1634)
* moe pipeline

* update code

* compile OK

* update

* update cpu reference

* update pipeline_gemm0

* compiler ok

* update pipeline

* rename to ex pipeline

* block-asm

* update

* update

* update first gemm ok

* compute correct

* update file structure

* update README

* update

* update

* update code

* update API

* return unsupport case

* add comment

* update readme

* update

* uncomment

* update

* fix build err

---------

Co-authored-by: valarLip <340077269@qq.com>
2024-11-26 11:14:56 +08:00
carlushuang
36c7ce4e0e [CK_TILE]Moe update index (#1672)
* update MOCK_ID for moe-sorting

* add moe-smoothquant

* update a comment

* fix format

* hot fix

* update topk in overflow case

* update comments

* update bf16 cvt

---------

Co-authored-by: valarLip <340077269@qq.com>
2024-11-25 13:12:35 +08:00
dummycoderfe
bec6fbc65f Ck tile/moe sorting (#1624)
* add moe_sorting & check ok

* fix comments & typo

* Run remod.py under include/ck_tile & example/ck_tile directories

* format codes

* fix output ci check bug

* fix moe sorting readme and error commit file

* use magic div to accelerate compute

* add a loop unroll for moe lds ops

* add extblocksnel to set zeros for moebufs

* [Ck_tile] moe set zero run ok, add size check and fix ref check

* [Ck_tile]fix moe_sorting fuse set_zero remod

* [Ck_tile] change name style, fix zero buffer size err, change folder

* [Ck_tile] moe_sorting: fix name style

* [Ck_tile] moe_sorting, remove useless params in traits

* [Ck_tile] change outputtile cnt * unit_size; change output buf alloc

---------

Co-authored-by: dummycoderfe <noplydummmycoder@163.com>
Co-authored-by: Po Yen, Chen <PoYen.Chen@amd.com>
Co-authored-by: carlushuang <carlus.huang@amd.com>
2024-11-09 17:57:27 +08:00