mtgu0705
|
ca281a995c
|
fixed int4 moe tflops calculation.
|
2025-03-04 13:11:14 +08:00 |
|
mtgu0705
|
00f11e8724
|
Revert "revert cmakefiles"
This reverts commit 6b318cb842.
|
2025-03-04 11:43:55 +08:00 |
|
coderfeli
|
55c4f981eb
|
revert cmakefiles
|
2025-03-04 11:12:57 +08:00 |
|
mtgu0705
|
234d70f97d
|
Revert "fix build"
This reverts commit f83e7e138a.
|
2025-03-04 11:12:33 +08:00 |
|
coderfeli
|
11ede06d63
|
fix build
|
2025-03-04 11:11:20 +08:00 |
|
mtgu0705
|
27fb28ed31
|
i4 support lds multiple shuffle
|
2025-03-04 11:06:15 +08:00 |
|
mtgu0705
|
e3a2aa4f9a
|
Updated transfer_v3r1_gather to support pk_i4_t type.
|
2025-03-04 10:27:17 +08:00 |
|
coderfeli
|
41c3a70577
|
fix int4 moe
|
2025-03-01 15:19:03 +00:00 |
|
mtgu0705
|
d033ed5ec5
|
revert the v3r1_gather.hpp
|
2025-02-28 19:09:23 +08:00 |
|
mtgu0705
|
798af40467
|
remove some values between mfma instructions.
|
2025-02-28 14:56:21 +08:00 |
|
mtgu0705
|
29d9235bf3
|
for int4 moe2 for type_convert support.
|
2025-02-26 17:04:17 +08:00 |
|
mtgu0705
|
dd178ef79d
|
Updated transfer_v3r1_gather to support pk_i4_t type.
|
2025-02-26 16:26:06 +08:00 |
|
mtgu0705
|
29d8d28bbb
|
commit a version for compiler debug.
|
2025-02-24 14:44:02 +08:00 |
|
mtgu0705
|
04ff8111d8
|
Revert "fix nswizzle = true"
This reverts commit 24d8024f0e.
|
2025-02-19 16:42:06 +08:00 |
|
coderfeli
|
2fca1921b8
|
fix nswizzle = true
|
2025-02-19 16:20:34 +08:00 |
|
mtgu0705
|
11eaac1932
|
enable pipeline v3.
|
2025-02-19 14:22:39 +08:00 |
|
mtgu0705
|
77e2385f0e
|
update tile size.
|
2025-02-19 09:56:37 +08:00 |
|
mtgu0705
|
fc1558e354
|
update int4 moe with latest input changes.
|
2025-02-18 18:09:19 +08:00 |
|
coderfeli
|
9ff2394e26
|
fix swizzle = false
|
2025-02-18 17:43:00 +08:00 |
|
coderfeli
|
a61084f432
|
opt gemm2 to 2x2 wave
|
2025-02-18 17:42:46 +08:00 |
|
mtgu0705
|
854cd8b4a1
|
commit missing files
|
2025-02-18 16:29:26 +08:00 |
|
mtgu0705
|
182e7480ba
|
Split the blockwise pipeline for fp8xint4.
|
2025-02-18 15:38:05 +08:00 |
|
mtgu0705
|
966f9051c7
|
fixed merge issue. fp8xint4 and fp8xint4_bpreshuffle function pass.
|
2025-02-18 13:54:15 +08:00 |
|
mtgu0705
|
49bac8cef7
|
Added b preshuffle pipeline v3 support.
|
2025-02-18 13:34:55 +08:00 |
|
mtgu0705
|
a0432459e7
|
Added moe_pk_i4_gemm2, function pass.
|
2025-02-18 13:34:41 +08:00 |
|
mtgu0705
|
a09f038c68
|
test expert = 8 and function pass.
|
2025-02-18 13:32:58 +08:00 |
|
mtgu0705
|
be79b63bfe
|
fix bug in moe_gemm1.cpp, now function pass.
|
2025-02-18 13:32:46 +08:00 |
|
mtgu0705
|
1b0b7810cd
|
Initial int4 moe, compile pass, function not check.
|
2025-02-18 13:32:25 +08:00 |
|
mtgu0705
|
e420767e3a
|
modified the tile size to 256, 128x128x128.
|
2025-02-18 13:25:30 +08:00 |
|
mtgu0705
|
fba3d780f2
|
fix bug, function now passes.
|
2025-02-18 13:25:18 +08:00 |
|
mtgu0705
|
c0ef46ff14
|
move b thread dequant copy to blockwise.
|
2025-02-18 13:25:06 +08:00 |
|
mtgu0705
|
a316dff966
|
fix bug, function pass.
|
2025-02-18 13:24:48 +08:00 |
|
mtgu0705
|
bee790ec5d
|
init b preshuffle dequant in VGPR.
|
2025-02-18 13:22:16 +08:00 |
|
mtgu0705
|
ed89a238a0
|
fix.
|
2025-02-18 13:22:03 +08:00 |
|
mtgu0705
|
9a3f75eeb8
|
fp8xint4 bpreshuffle function pass
|
2025-02-18 13:21:36 +08:00 |
|
mtgu0705
|
ba5a6a2477
|
General fix.
|
2025-02-18 13:21:06 +08:00 |
|
mtgu0705
|
e0391df785
|
Added gemm_fp8xint4_Bpreshuffle files, function not checked yet
|
2025-02-18 13:20:36 +08:00 |
|
mtgu0705
|
2559ef64c3
|
Init Gemm_fp8xint4 Bpreshuffle
|
2025-02-18 13:15:35 +08:00 |
|
mtgu0705
|
8df8a17943
|
Add Gemm fp8xint4 example and kernel, function pass.
|
2025-02-18 13:14:18 +08:00 |
|
coderfeli
|
45d1c52ef5
|
hotfix moegemm2 nswizzle
|
2025-02-18 04:10:58 +00:00 |
|
coderfeli
|
bca3f14c7c
|
fix nswizzle=0
|
2025-02-18 03:36:13 +00:00 |
|
coderfeli
|
e78fbf8785
|
merge 2 moegemm pipe together
|
2025-02-18 03:23:56 +00:00 |
|
coderfeli
|
1687fc988e
|
change ktile
|
2025-02-17 14:26:43 +00:00 |
|
coderfeli
|
4404984abc
|
2x2 ok
|
2025-02-17 09:52:22 +00:00 |
|
coderfeli
|
f64b137521
|
merge haocong branch
|
2025-02-17 09:30:02 +00:00 |
|
coderfeli
|
88412f9ead
|
impl sorting count eid
|
2025-02-17 09:11:57 +00:00 |
|
coderfeli
|
4b91d1ce17
|
revert gemm2 swizz
|
2025-02-17 06:19:58 +00:00 |
|
coderfeli
|
fcc2c867af
|
impl gemm2 swizzle
|
2025-02-17 02:33:52 +00:00 |
|
coderfeli
|
aecd6a38e4
|
rm err print
|
2025-02-17 01:50:12 +00:00 |
|
coderfeli
|
96047cab6f
|
impl e swizzle
|
2025-02-17 01:26:42 +00:00 |
|