96ba966ad9 | turboderp | 2026-01-19 23:21:59 +01:00 | Bump to v0.0.20  (tag: v0.0.20)
0ecc37bf97 | turboderp | 2026-01-19 22:57:19 +01:00 | Fix ComboSampler init when initializing as greedy
75ee2c78c3 | turboderp | 2026-01-19 22:48:49 +01:00 | Add Qwen2_5_VLForConditionalGeneration, refactor HCXVisionV2VisionModel as subclass of Qwen2_5VLVisionModel
5a6975747f | turboderp | 2026-01-16 23:28:09 +01:00 | Bump to v0.0.19  (tag: v0.0.19)
c39616a7b5 | turboderp | 2026-01-14 22:11:43 +01:00 | Merge pull request #125 from amanwalksdownthestreet/fix-arch-suffix-parsing
    arch_list: Strip NVIDIA arch suffixes (sm_120a, sm_90a, etc.)
f21b92e978 | turboderp | 2026-01-14 21:58:34 +01:00 | Add Adaptive-P sampler
0d09af403a | turboderp | 2026-01-14 21:40:31 +01:00 | Diversity test: use greedy sampling for extraction
e839152802 | turboderp | 2026-01-11 19:12:04 +01:00 | Add diversity test
3186dca9da | turboderp | 2026-01-11 19:11:26 +01:00 | generator: Pad token mask when output layer is padded
9043690801 | turboderp | 2026-01-11 17:38:15 +01:00 | generator: Free recurrent state after job completed (prevent memory leak with large job queue)
e69d91b12b | turboderp | 2026-01-11 16:38:33 +01:00 | model_init: Add sampling args default overrides
6b31fc00f5 | turboderp | 2026-01-11 12:49:12 +01:00 | Add HF tokenizer helper, refactor example
288a98f5e3 | turboderp | 2026-01-11 12:33:27 +01:00 | Refactor sampler args for examples
27c68d4e65 | turboderp | 2026-01-10 15:59:46 +01:00 | Update README.md
539410a2a3 | turboderp | 2026-01-10 15:59:08 +01:00 | Support NanoChatForCausalLM
3ecb9f54fb | turboderp | 2026-01-10 10:55:36 +01:00 | Merge pull request #136 from mindkrypted/feature/support-solar-open-moe
    Add support for SolarOpenMoE architecture
fd8659a6c3 | mindkrypted | 2026-01-07 13:45:23 -05:00 | Add support for SolarOpenMoE architecture
703b05ab52 | turboderp | 2026-01-06 16:08:23 +01:00 | Update README.md
a17d1a4334 | turboderp | 2026-01-06 16:01:54 +01:00 | Add HCXVisionV2ForCausalLM architecture
7de8641fce | turboderp | 2026-01-06 16:01:54 +01:00 | Attn: Add varlen mode
a026b32df3 | turboderp | 2026-01-04 12:31:58 +01:00 | Support IQuestCoderForCausalLM
6e75e7b151 | turboderp | 2026-01-04 02:02:10 +01:00 | chat.py: Fix for models with eos_token_id=null
227621e49e | turboderp | 2026-01-03 03:22:50 +01:00 | Support HyperCLOVAXForCausalLM
a92cf0a13a | turboderp | 2026-01-03 03:22:13 +01:00 | Attn: Support custom softmax scale in SDPA mode
cff5fd542c | turboderp | 2026-01-03 03:21:55 +01:00 | Embedding: Support embedding multiplier
452803e73d | turboderp | 2025-12-26 21:38:56 +01:00 | Olmo3: Use default RoPE type for SWA layers
195d01657a | turboderp | 2025-12-26 21:38:34 +01:00 | RoPE: Allow RoPE type override
e8b77bba4a | turboderp | 2025-12-25 23:18:50 +01:00 | chat.py: Fix prompt tokens/s display
80907797a5 | turboderp | 2025-12-25 23:18:25 +01:00 | chat.py: Add debug mode
f0ea2ca858 | turboderp | 2025-12-23 21:05:05 +01:00 | Linear: Support new FP8 scale format
2698a83022 | turboderp | 2025-12-23 21:04:41 +01:00 | RoPE: Let arch override theta key name
65cfaf3c60 | amanwalksdownthestreet | 2025-12-16 23:06:34 -07:00 | arch_list: Strip NVIDIA arch suffixes (sm_120a, sm_90a, etc.)
a32e2219af | turboderp | 2025-12-13 20:55:25 +01:00 | Allow -hb 16 while quantizing
104268521c | turboderp | 2025-12-13 20:49:03 +01:00 | Support Olmo3ForCausalLM
bd0f26cd0e | turboderp | 2025-12-10 21:47:30 +01:00 | Fix comments
1b7009c5b8 | turboderp | 2025-12-10 10:43:17 +01:00 | Merge remote-tracking branch 'origin/master'  (tag: v0.0.18)
f9d0e6038f | turboderp | 2025-12-10 10:42:41 +01:00 | Bump to v0.0.18
d8be5d638f | turboderp | 2025-12-10 00:53:45 +01:00 | chat.py: Read all stop conditions from config.json
9b75bc5f58 | turboderp | 2025-12-10 00:53:22 +01:00 | Support Ministral3ForCausalLM
9663357c4f | turboderp | 2025-12-10 00:52:49 +01:00 | Convert: Print some more RoPE debug info
24caf2c762 | turboderp | 2025-12-10 00:52:29 +01:00 | RoPE: Accept partial_rotary_factor in rope_parameters
e49c02a3aa | kingbri | 2025-12-09 15:03:25 -05:00 | Actions: Add builds for torch 2.9
    Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
4d4992a8b8 | turboderp | 2025-12-08 21:39:42 +01:00 | GLM4: Update config parser to support 4.6V
1385486592 | turboderp | 2025-12-07 17:47:20 +01:00 | Bump to v0.0.17  (tag: v0.0.17)
784d3dc7e7 | turboderp | 2025-12-06 01:56:21 +01:00 | GEMM: Optimize reduction a little bit
15b9c2b421 | turboderp | 2025-12-06 01:55:56 +01:00 | Cleanup
700b34695f | turboderp | 2025-12-05 16:29:13 +01:00 | Generator: Fix #118, make sure prepare_logit_mask is only called on jobs in the sample batch.
    Thanks to @EthanAndersonUSA
dc654cf4d8 | turboderp | 2025-12-05 13:22:39 +01:00 | MoE Routing kernel: Allow num_experts not divisible by 32
0a629cf70a | turboderp | 2025-12-05 13:21:07 +01:00 | HumanEval: Add max batch size arg
bb43823e32 | turboderp | 2025-12-03 19:10:30 +01:00 | Mistral3: Try to load preprocessor config from processor_config.json if preprocessor_config.json not present