so
|
8af425fc0a
|
fix v0 and wave id calc
|
2025-01-11 09:56:48 +00:00 |
|
shengnxu
|
339a674b59
|
current status: single WG, memory out of bounds
|
2025-01-11 11:10:17 +08:00 |
|
shengnxu
|
5d00b37e6b
|
fix loop cnt and half d buffer size
|
2025-01-07 07:42:05 +00:00 |
|
shengnxu
|
2a66e08059
|
fix some issues; next step: res recalc
|
2025-01-06 14:54:07 +00:00 |
|
shengnxu
|
7cc808f28d
|
fix some codes
|
2025-01-05 16:08:50 +00:00 |
|
shengnxu
|
6f7d127267
|
changed all the scale outside except for uq
|
2025-01-05 10:31:17 +00:00 |
|
shengnxu
|
9a46c0e78e
|
move a scale out inline
|
2025-01-04 10:20:01 +00:00 |
|
shengnxu
|
d0c80b12da
|
fix more issues; current status: inline asm uses more registers than available
|
2025-01-03 09:17:08 +00:00 |
|
shengnxu
|
a759277dc1
|
fix some error
|
2025-01-02 19:14:01 +08:00 |
|
shengnxu
|
f549173bc0
|
simple gemm2 for gemm1 debugging
|
2025-01-01 22:41:19 +08:00 |
|
shengnxu
|
811b75d3a0
|
staging code for backup
|
2024-12-31 18:34:02 +08:00 |
|
shengnxu
|
1c9a5ff42b
|
formatter
|
2024-12-12 10:52:47 +00:00 |
|
root
|
e2be4b9e91
|
added moe interleaving pipeline
|
2024-12-03 03:46:22 +00:00 |
|
carlushuang
|
440e28b08f
|
[CK_TILE] fused-moe first version (#1634)
* moe pipeline
* update code
* compile OK
* update
* update cpu reference
* update pipeline_gemm0
* compiler ok
* update pipeline
* rename to ex pipeline
* block-asm
* update
* update
* update first gemm ok
* compute correct
* update file structure
* update README
* update
* update
* update code
* update API
* return unsupported case
* add comment
* update readme
* update
* uncomment
* update
* fix build err
---------
Co-authored-by: valarLip <340077269@qq.com>
|
2024-11-26 11:14:56 +08:00 |
|
carlushuang
|
36c7ce4e0e
|
[CK_TILE]Moe update index (#1672)
* update MOCK_ID for moe-sorting
* add moe-smoothquant
* update a comment
* fix format
* hot fix
* update topk in overflow case
* update comments
* update bf16 cvt
---------
Co-authored-by: valarLip <340077269@qq.com>
|
2024-11-25 13:12:35 +08:00 |
|
dummycoderfe
|
bec6fbc65f
|
Ck tile/moe sorting (#1624)
* add moe_sorting & check ok
* fix comments & typo
* Run remod.py under include/ck_tile & example/ck_tile directories
* format codes
* fix output ci check bug
* fix moe sorting readme and error commit file
* use magic div to accelerate compute
* add a loop unroll for moe lds ops
* add extblocksnel to set zeros for moebufs
* [Ck_tile] moe set zero run ok, add size check and fix ref check
* [Ck_tile]fix moe_sorting fuse set_zero remod
* [Ck_tile] change name style, fix zero buffer size err, change folder
* [Ck_tile] moe_sorting: fix name style
* [Ck_tile] moe_sorting, remove useless params in traits
* [Ck_tile] change outputtile cnt * unit_size; change output buf alloc
---------
Co-authored-by: dummycoderfe <noplydummmycoder@163.com>
Co-authored-by: Po Yen, Chen <PoYen.Chen@amd.com>
Co-authored-by: carlushuang <carlus.huang@amd.com>
|
2024-11-09 17:57:27 +08:00 |
|