Commit Graph

807 Commits

Author SHA1 Message Date
coderfeli
d87ddebb30 revert back to v1 2025-02-25 03:06:55 +00:00
coderfeli
6934ac0466 v3 ok 2025-02-24 14:35:16 +00:00
coderfeli
1648a5e029 merge 2025-02-24 11:50:30 +00:00
coderfeli
cf9be76c73 hot fix eid 2025-02-24 10:50:59 +00:00
coderfeli
d5b2c900b9 fix sorting bug 2025-02-21 06:56:09 +00:00
coderfeli
6c09a5970d fix typo 2025-02-20 22:30:20 +08:00
coderfeli
70aef2ed33 hot fix expert over flow 2025-02-20 13:54:44 +00:00
coderfeli
6cb9c0e67c hot fix eid 2025-02-20 10:19:06 +00:00
coderfeli
c881c378f2 change inst cnt 2025-02-20 07:57:12 +00:00
coderfeli
766e36610b modify sorting to use cumsum 2025-02-20 07:55:01 +00:00
coderfeli
b6fbe4f7fe debug block v3 2025-02-20 07:05:48 +00:00
coderfeli
d5e395918e v3 build ok 2025-02-20 04:03:37 +00:00
coderfeli
76644d70dd fix instruction seq 2025-02-19 11:04:21 +00:00
coderfeli
a41f76cdea add dump 2025-02-19 10:39:12 +00:00
Mingtao Gu
1d46f8f63d Merge pull request #1901 from ROCm/mtgu/dev/ck_moe_gemm2_int4_dev: enable pipeline v3. 2025-02-19 14:23:59 +08:00
mtgu0705
11eaac1932 enable pipeline v3. 2025-02-19 14:22:39 +08:00
coderfeli
24d8024f0e fix nswizzle = true 2025-02-19 05:17:26 +00:00
coderfeli
9ff2394e26 fix swizzle = false 2025-02-18 17:43:00 +08:00
mtgu0705
854cd8b4a1 commit missing files 2025-02-18 16:29:26 +08:00
mtgu0705
182e7480ba Split the blockwise pipeline for fp8xint4. 2025-02-18 15:38:05 +08:00
mtgu0705
966f9051c7 fixed merge issue. fp8xint4 and fp8xint4_bpreshuffle function pass. 2025-02-18 13:54:15 +08:00
mtgu0705
49bac8cef7 Added b preshuffle pipeline v3 support. 2025-02-18 13:34:55 +08:00
mtgu0705
a0432459e7 Added moe_pk_i4_gemm2, function pass. 2025-02-18 13:34:41 +08:00
mtgu0705
be79b63bfe fix bug in moe_gemm1.cpp, now function pass. 2025-02-18 13:32:46 +08:00
mtgu0705
1b0b7810cd Initial int4 moe, compile pass, function not check. 2025-02-18 13:32:25 +08:00
mtgu0705
fba3d780f2 fix bug, function now passes. 2025-02-18 13:25:18 +08:00
mtgu0705
c0ef46ff14 move b thread dequant copy to blockwise. 2025-02-18 13:25:06 +08:00
mtgu0705
a316dff966 fix bug, function pass. 2025-02-18 13:24:48 +08:00
mtgu0705
bee790ec5d init b preshuffle dequant in VGPR. 2025-02-18 13:22:16 +08:00
mtgu0705
9a3f75eeb8 fp8xint4 bpreshuffle function pass 2025-02-18 13:21:36 +08:00
mtgu0705
ba5a6a2477 General fix. 2025-02-18 13:21:06 +08:00
mtgu0705
e0391df785 Added gemm_fp8xint4_Bpreshuffle files, function not checked yet 2025-02-18 13:20:36 +08:00
mtgu0705
8df8a17943 Add Gemm fp8xint4 example and kernel, function pass. 2025-02-18 13:14:18 +08:00
coderfeli
45d1c52ef5 hotfix moegemm2 nswizzle 2025-02-18 04:10:58 +00:00
coderfeli
bca3f14c7c fix nswizzle=0 2025-02-18 03:36:13 +00:00
coderfeli
e78fbf8785 merge 2 moegemm pipe together 2025-02-18 03:23:56 +00:00
coderfeli
1687fc988e change ktile 2025-02-17 14:26:43 +00:00
coderfeli
4404984abc 2x2 ok 2025-02-17 09:52:22 +00:00
coderfeli
f64b137521 merge haocong branch 2025-02-17 09:30:02 +00:00
coderfeli
88412f9ead impl sorting count eid 2025-02-17 09:11:57 +00:00
coderfeli
4b91d1ce17 revert gemm2 swizz 2025-02-17 06:19:58 +00:00
coderfeli
fcc2c867af impl gemm2 swizzle 2025-02-17 02:33:52 +00:00
coderfeli
aecd6a38e4 rm err print 2025-02-17 01:50:12 +00:00
coderfeli
96047cab6f impl e swizzle 2025-02-17 01:26:42 +00:00
coderfeli
7572a6916c merge develop 2025-02-15 03:23:00 +00:00
coderfeli
7796fc738b fix gemm2 scale, gemm2 ok now 2025-02-15 03:09:47 +00:00
coderfeli
61e3c23851 fix moe gemm2 2025-02-15 01:48:56 +00:00
coderfeli
db53dba4a0 hotfix:gemm1 use real tokens and gemm2 ok 2025-02-14 15:08:28 +00:00
coderfeli
58db931ec5 fix topk id 2025-02-14 09:50:57 +00:00
coderfeli
84b27d7504 merge max_token_id and fix err 2025-02-14 08:19:54 +00:00