| Author | Commit | Message | Date |
|---|---|---|---|
| turboderp | 2965eec919 | GatedDeltaNet: Skip redundant zeroing of buffers (Qwen3-Next) | 2026-03-03 05:01:16 +01:00 |
| turboderp | 410a43df22 | GatedDeltaNet: Increase max no. K/V heads | 2026-03-03 04:55:55 +01:00 |
| turboderp | 725a75386d | RMSNorm/GatedRMSNorm: Tidy up launch logic with macros and add more dtypes | 2026-03-02 22:22:47 +01:00 |
| turboderp | 67785fc286 | compare_q.py: Paper over some dependency problems | 2026-03-02 18:47:39 +01:00 |
| turboderp | 5cb91c5505 | GatedDeltaNet: Fix output projection no. input features | 2026-03-02 16:35:26 +01:00 |
| turboderp | e12e6bd759 | Update README.md | 2026-03-02 15:51:58 +01:00 |
| lesj0610 | 88062566f5 | Qwen3.5: Smoke test | 2026-03-02 15:49:29 +01:00 |
| turboderp | 021b027728 | Qwen3.5: Enable MRoPE, update multimodal example | 2026-03-02 05:27:33 +01:00 |
| lesj0610 | 0ca0d6ac01 | Add Qwen3_5ForConditionalGeneration and Qwen3_5MoeForConditionalGeneration | 2026-03-02 05:22:08 +01:00 |
| lesj0610 | 390624ab3c | convert.py: Better ETA calculation | 2026-03-02 04:23:51 +01:00 |
| turboderp | 88dcdf782d | Update README.md | 2026-03-02 03:49:28 +01:00 |
| turboderp | 93695e9a7d | RMSNorm/RoPE kernels: Allow BF16/FP32 norm weights | 2026-03-02 03:49:13 +01:00 |
| turboderp | e2f4198406 | Formatting | 2026-03-02 00:53:23 +01:00 |
| turboderp | 08ca454ec0 | Step 3.5: Fix TP split | 2026-03-01 21:32:59 +01:00 |
| turboderp | 6386de7a9b | Add Step3p5ForCausalLM | 2026-03-01 17:59:28 +01:00 |
| turboderp | 76937421ec | convert.py: Make out_scales the default, with options for auto and disable | 2026-03-01 17:57:55 +01:00 |
| turboderp | c8c2e6178c | chat.py: Catbench shortcut | 2026-03-01 17:57:55 +01:00 |
| turboderp | 99f792dce0 | Add custom activation limits | 2026-03-01 17:57:55 +01:00 |
| turboderp | b272ea3515 | Remove C-style conditionals | 2026-03-01 15:12:33 +01:00 |
| turboderp | 18b2a23d8a | chat.py: Fix error message | 2026-03-01 15:10:22 +01:00 |
| turboderp | b0cfe46702 | Config: Allow for interpreting config key with incorrect data type as missing key (for weirdly implemented layerwise RoPE settings in some models) | 2026-03-01 03:16:32 +01:00 |
| turboderp | 489b3aab12 | BlockSparseMLP: Allow loading combined experts tensors also when gate and up are not fused | 2026-03-01 03:13:56 +01:00 |
| turboderp | 4bdd22ea77 | BlockSparseMLP: Make sure bias is always applied during calibration | 2026-03-01 03:13:03 +01:00 |
| turboderp | f7ccb524e7 | Attn: Support headwise gate | 2026-03-01 03:12:03 +01:00 |
| turboderp | 447c8bb522 | Build actions: Add torch 2.10.0 wheels | 2026-02-28 23:53:15 +01:00 |
| turboderp | 8ef7f4b5dd | Linear: Allow fusing linear layers during unquantized model load | 2026-02-22 22:43:34 +01:00 |
| turboderp | c1b16d2fc9 | Loader: Allow checking for lists of tensor groups | 2026-02-22 22:42:30 +01:00 |
| turboderp | ea1fe0ccea | Cleanup | 2026-02-22 15:14:57 +01:00 |
| turboderp | ed5bad7235 | Alias __nv_bfloat16 -> bfloat16 | 2026-02-17 21:24:41 +01:00 |
| turboderp | b2b6f37e12 | perf.py: Error out if test length > cache size | 2026-02-17 20:04:13 +01:00 |
| turboderp | 3f9c053227 | Merge pull request #141 (Add tensor parallel support for MiniMax M2 Q/K norms) | 2026-02-16 01:24:34 +01:00 |
| turboderp | abb083ceb8 | Merge pull request #103 from mratsim/patch-1 (Add size estimation script for model tensors size) | 2026-02-15 17:58:50 +01:00 |
| turboderp | ae3645c455 | Merge pull request #147 from lesj0610/feat/hf-chat-template-compat (Tokenizer: robust HF chat template kwargs and output compatibility) | 2026-02-15 17:58:03 +01:00 |
| turboderp | eca621af79 | Merge remote-tracking branch 'origin/dev' into dev | 2026-02-15 17:56:31 +01:00 |
| turboderp | 1744361cc2 | Merge pull request #148 from lesj0610/fix/exaone4-swa-layer-types (exaone4: use layer_types as source of truth for SWA layer mapping) | 2026-02-15 17:55:08 +01:00 |
| turboderp | 44f70da0f9 | Merge pull request #149 from MikeRoz47/dev (Add optional arg to compare_q.py for saving plot files) | 2026-02-15 17:53:55 +01:00 |
| MikeRoz47 | 52c2f5794d | Add optional arg to compare_q to allow it to save plots rather than show them | 2026-02-15 16:41:18 +00:00 |
| lesj0610 | 5c076e5f2a | exaone4: prefer layer_types over pattern for SWA layer mapping | 2026-02-12 01:48:52 +09:00 |
| lesj0610 | 019d965eb6 | tokenizer: harden HF chat template compatibility and kwargs passthrough | 2026-02-12 01:25:30 +09:00 |
| turboderp | 701afb9294 | Bump to v0.0.22 [v0.0.22] | 2026-02-10 17:48:24 +01:00 |
| turboderp | 89b841dd8a | safetensors_alt: Allow writing bfloat16 tensors | 2026-02-10 17:47:44 +01:00 |
| turboderp | 6e4202eade | Bump to v0.0.21 [v0.0.21] | 2026-02-09 22:19:02 +01:00 |
| turboderp | f9a7448366 | Merge branch 'refs/heads/st_test' into dev | 2026-02-09 04:35:00 +01:00 |
| turboderp | d3e02500e0 | Sigmoid+proj kernel: fix regression (Qwen3-Next) | 2026-02-09 04:34:37 +01:00 |
| turboderp | d85690204a | Replacement safetensors lib for quantization | 2026-01-27 00:52:54 +01:00 |
| turboderp | 428a082276 | Add performance test | 2026-01-22 23:28:53 +01:00 |
| turboderp | 91a11853cd | Update README.md | 2026-01-22 23:27:23 +01:00 |
| turboderp | 96ba966ad9 | Bump to v0.0.20 [v0.0.20] | 2026-01-19 23:21:59 +01:00 |
| turboderp | 0ecc37bf97 | Fix ComboSampler init when initializing as greedy | 2026-01-19 22:57:19 +01:00 |
| turboderp | 75ee2c78c3 | Add Qwen2_5_VLForConditionalGeneration, refactor HCXVisionV2VisionModel as subclass of Qwen2_5VLVisionModel | 2026-01-19 22:48:49 +01:00 |