1890 Commits

Author SHA1 Message Date
mtgu0705
ca281a995c fixed int4 moe tflops calculation. 2025-03-04 13:11:14 +08:00
mtgu0705
00f11e8724 Revert "revert cmakefiles"
This reverts commit 6b318cb842.
2025-03-04 11:43:55 +08:00
coderfeli
55c4f981eb revert cmakefiles 2025-03-04 11:12:57 +08:00
mtgu0705
234d70f97d Revert "fix build"
This reverts commit f83e7e138a.
2025-03-04 11:12:33 +08:00
coderfeli
11ede06d63 fix build 2025-03-04 11:11:20 +08:00
mtgu0705
27fb28ed31 i4 support lds multiple shuffle 2025-03-04 11:06:15 +08:00
mtgu0705
e3a2aa4f9a Updated transfer_v3r1_gather to support pk_i4_t type. 2025-03-04 10:27:17 +08:00
coderfeli
41c3a70577 fix int4 moe 2025-03-01 15:19:03 +00:00
mtgu0705
d033ed5ec5 revert the v3r1_gather.hpp 2025-02-28 19:09:23 +08:00
mtgu0705
798af40467 remove some values between mfma instructions. 2025-02-28 14:56:21 +08:00
mtgu0705
29d9235bf3 for int4 moe2 for type_convert support. 2025-02-26 17:04:17 +08:00
mtgu0705
dd178ef79d Updated transfer_v3r1_gather to support pk_i4_t type. 2025-02-26 16:26:06 +08:00
mtgu0705
29d8d28bbb commit a version for compiler debug. 2025-02-24 14:44:02 +08:00
mtgu0705
04ff8111d8 Revert "fix nswizzle = true"
This reverts commit 24d8024f0e.
2025-02-19 16:42:06 +08:00
coderfeli
2fca1921b8 fix nswizzle = true 2025-02-19 16:20:34 +08:00
mtgu0705
11eaac1932 enable pipeline v3. 2025-02-19 14:22:39 +08:00
mtgu0705
77e2385f0e update tile size. 2025-02-19 09:56:37 +08:00
mtgu0705
fc1558e354 update int4 moe with latest input changes. 2025-02-18 18:09:19 +08:00
coderfeli
9ff2394e26 fix swizzle = false 2025-02-18 17:43:00 +08:00
coderfeli
a61084f432 opt gemm2 to 2x2 wave 2025-02-18 17:42:46 +08:00
mtgu0705
854cd8b4a1 commit missing files 2025-02-18 16:29:26 +08:00
mtgu0705
182e7480ba Split the blockwise pipeline for fp8xint4. 2025-02-18 15:38:05 +08:00
mtgu0705
966f9051c7 fixed merge issue. fp8xint4 and fp8xint4_bpreshuffle function pass. 2025-02-18 13:54:15 +08:00
mtgu0705
49bac8cef7 Added b preshuffle pipeline v3 support. 2025-02-18 13:34:55 +08:00
mtgu0705
a0432459e7 Added moe_pk_i4_gemm2, function pass. 2025-02-18 13:34:41 +08:00
mtgu0705
a09f038c68 test expert = 8 and function pass. 2025-02-18 13:32:58 +08:00
mtgu0705
be79b63bfe fix bug in moe_gemm1.cpp, now function pass. 2025-02-18 13:32:46 +08:00
mtgu0705
1b0b7810cd Initial int4 moe, compile pass, function not check. 2025-02-18 13:32:25 +08:00
mtgu0705
e420767e3a modified the tile size to 256, 128x128x128. 2025-02-18 13:25:30 +08:00
mtgu0705
fba3d780f2 fix bug, function now passes. 2025-02-18 13:25:18 +08:00
mtgu0705
c0ef46ff14 move b thread dequant copy to blockwise. 2025-02-18 13:25:06 +08:00
mtgu0705
a316dff966 fix bug, function pass. 2025-02-18 13:24:48 +08:00
mtgu0705
bee790ec5d init b preshuffle dequant in VGPR. 2025-02-18 13:22:16 +08:00
mtgu0705
ed89a238a0 fix. 2025-02-18 13:22:03 +08:00
mtgu0705
9a3f75eeb8 fp8xint4 bpreshuffle function pass 2025-02-18 13:21:36 +08:00
mtgu0705
ba5a6a2477 General fix. 2025-02-18 13:21:06 +08:00
mtgu0705
e0391df785 Added gemm_fp8xint4_Bpreshuffle files, function not checked yet 2025-02-18 13:20:36 +08:00
mtgu0705
2559ef64c3 Init Gemm_fp8xint4 Bpreshuffle 2025-02-18 13:15:35 +08:00
mtgu0705
8df8a17943 Add Gemm fp8xint4 example and kernel, function pass. 2025-02-18 13:14:18 +08:00
coderfeli
45d1c52ef5 hotfix moegemm2 nswizzle 2025-02-18 04:10:58 +00:00
coderfeli
bca3f14c7c fix nswizzle=0 2025-02-18 03:36:13 +00:00
coderfeli
e78fbf8785 merge 2 moegemm pipe together 2025-02-18 03:23:56 +00:00
coderfeli
1687fc988e chage ktile 2025-02-17 14:26:43 +00:00
coderfeli
4404984abc 2x2 ok 2025-02-17 09:52:22 +00:00
coderfeli
f64b137521 merge haocong branch 2025-02-17 09:30:02 +00:00
coderfeli
88412f9ead impl sorting count eid 2025-02-17 09:11:57 +00:00
coderfeli
4b91d1ce17 revert gemm2 swizz 2025-02-17 06:19:58 +00:00
coderfeli
fcc2c867af impl gemm2 swizzle 2025-02-17 02:33:52 +00:00
coderfeli
aecd6a38e4 rm err print 2025-02-17 01:50:12 +00:00
coderfeli
96047cab6f impl e swizzel 2025-02-17 01:26:42 +00:00