turboderp | 0f2da5d6a7 | GEMM: Lock MCG multiplier to 0xCBAC1FED and MUL1 to 0x83DCD12D. Make MCG the default codebook for new models. | 2025-10-12 22:09:01 +02:00
turboderp | 4829ea43d9 | Rework GEMM kernel tuning | 2025-10-05 01:30:20 +02:00
turboderp | 1284d43c76 | GEMM kernel tweaks and tuning | 2025-05-25 13:33:39 +02:00
turboderp | d359bcc0d3 | Add MCG 3INST and MCG 1MAD (MUL1) experimental quant modes | 2025-05-21 19:15:13 +02:00
turboderp | c4867edd0d | GEMM kernel optimizations | 2025-05-08 22:44:26 +02:00
turboderp | 5d470a2978 | Refactor GEMM kernel for cooperative launch and fusions, fuse with input/output Had transforms, retune launch configs, split kernel compile into units | 2025-05-05 00:54:26 +02:00
turboderp | cf84811485 | Add cache quantization | 2025-04-22 21:52:33 +02:00
turboderp | 543c4b2771 | Initial commit | 2025-04-06 14:42:49 +02:00