turboderp | 85cb54c6f3 | perf.py: Make sure test context is nontrivial to force more expert diversity | 2026-03-07 01:18:27 +01:00
turboderp | 67785fc286 | compare_q.py: Paper over some dependency problems | 2026-03-02 18:47:39 +01:00
turboderp | b2b6f37e12 | perf.py: Error out if test length > cache size | 2026-02-17 20:04:13 +01:00
MikeRoz47 | 52c2f5794d | Add optional arg to compare_q to allow it to save plots rather than show them | 2026-02-15 16:41:18 +00:00
turboderp | 428a082276 | Add performance test | 2026-01-22 23:28:53 +01:00
turboderp | 0d09af403a | Diversity test: use greedy sampling for extraction | 2026-01-14 21:40:31 +01:00
turboderp | e839152802 | Add diversity test | 2026-01-11 19:12:04 +01:00
turboderp | 6b31fc00f5 | Add HF tokenizer helper, refactor example | 2026-01-11 12:49:12 +01:00
turboderp | 0a629cf70a | HumanEval: Add max batch size arg | 2025-12-05 13:21:07 +01:00
turboderp | ef8fd43d1c | Cleanup unused imports | 2025-11-16 14:25:46 +01:00
turboderp | 38ddd8b9c5 | MMLU: Fix prompt | 2025-11-09 22:25:53 +01:00
turboderp | 3562dbe7b0 | compare_q.py: Work around AutoAWQ being broken in later versions of Transformers | 2025-11-09 13:33:40 +01:00
turboderp | d33aba1845 | HumanEval: Add MiniMax prompt format | 2025-10-31 11:50:14 +01:00
turboderp | 3634436641 | model_diff.py: Limit batch size (prevent OoM on output layer) | 2025-10-29 21:15:35 +01:00
turboderp | aa9a315fd9 | compare_q.py: Explicit GC between runs | 2025-09-19 19:15:29 +02:00
turboderp | 3845775650 | ppl_transformers.py: Explicitly make bfloat16 the default dtype | 2025-09-18 22:11:19 +02:00
turboderp | d8203063dc | PPL eval: Transformers FP32 mode | 2025-09-04 00:39:09 +02:00
turboderp | ca806c3386 | ppl_transformers.py: Fix input IDs device | 2025-08-24 21:39:33 +02:00
turboderp | 8377460ac6 | prequant_test.py: Disable torch.compile (conflicts with cudaMallocAsync) | 2025-08-23 14:55:43 +02:00
turboderp | 1f7f3e94c0 | compare_q.py: Fix/ignore anyprecision imports (Transformers version mismatch) | 2025-07-16 09:49:18 +02:00
turboderp | 5cb70f591b | model_diff.py: Add option to save IDs and logits | 2025-07-15 20:34:10 +02:00
turboderp | 4265c9e193 | Add Transformers ppl test (equivalent to eval/ppl.py) | 2025-07-15 20:33:42 +02:00
turboderp | c09f809876 | ppl.py: Add length argument | 2025-07-15 20:32:47 +02:00
turboderp | a6d79e5d0d | MMLU: Random sample option | 2025-07-12 21:14:56 +02:00
turboderp | 415a55cc2d | MMLU eval: More feedback during eval | 2025-07-12 18:31:32 +02:00
turboderp | 997ca85bcc | Add MMLU eval | 2025-07-11 13:55:07 +02:00
turboderp | 6341b119ef | Loader: Add tensor override script | 2025-07-08 18:58:43 +02:00
turboderp | fce1b96e3f | prequant_test.py: Add some more options | 2025-06-14 15:00:09 +02:00
turboderp | 463ebe1841 | compare_q.py: Add dark mode | 2025-06-12 05:54:57 +02:00
turboderp | 32d98c24c1 | compare_q.py: Add QTIP wrapper | 2025-06-08 15:41:30 +02:00
turboderp | f02c9afd6a | compare_q_logits.py: Fix bug | 2025-06-08 15:37:41 +02:00
turboderp | db65151b07 | compare_q.py: Allow script to run without all backends installed | 2025-06-05 02:26:00 +02:00
turboderp | 162f99ab8b | compare_q.py: Add AnyPrecision models | 2025-06-05 02:26:00 +02:00
turboderp | ab875ba730 | compare_q.py: Fix GGUF VRAM computation when output.weight precedes token_embd.weight | 2025-06-04 23:34:42 +02:00
turboderp | 8ff65b8742 | compare_q.py: Option to capture logits in streaming mode (for large unquantized models) | 2025-05-31 01:11:56 +02:00
turboderp | 2cc8f718da | Add cosine_error and SQNR measures | 2025-05-30 19:43:20 +02:00
turboderp | 34d2f1f5fa | Add prequant_test script | 2025-05-30 19:42:49 +02:00
turboderp | f8dc9975fe | model_diff.py: Add device argument | 2025-05-30 19:42:49 +02:00
turboderp | c0a2028fb5 | compare_q.py: Fix some logic for KLD test | 2025-05-18 21:55:26 +02:00
turboderp | e1d2fa11d6 | compare_q.py: Add -mask arg | 2025-05-18 10:58:14 +02:00
turboderp | 07ffea7f89 | compare_q.py: Fix llama.cpp bpw measurement for MoE models | 2025-05-18 00:19:59 +02:00
turboderp | 475dfcca47 | compare_q.py: Add more GPTQ layer types | 2025-05-18 00:19:19 +02:00
turboderp | 0488385eb0 | Add simple long-context evaluation script | 2025-05-17 16:58:12 +02:00
turboderp | 3873d40ae2 | compare_q.py: Add KLD test and some other tweaks | 2025-05-16 16:13:26 +02:00
turboderp | a19538cf1e | compare_q.py: Some fixes | 2025-05-16 00:33:48 +02:00
turboderp | 7f3096ffd7 | compare_q.py: Account for unquantized weights in blocksparse EXL2 layers | 2025-05-14 23:55:25 +02:00
turboderp | cb7c70cde0 | compare_q.py: Add a little versatility to plot | 2025-05-14 17:52:21 +02:00
turboderp | 5c3ff204c4 | model_diff.py: Use deferred load and close file handles between modules | 2025-05-12 21:23:48 +02:00
turboderp | 1e1754787e | HumanEval: Move BOS token to individual prompt template, don't prepend by default when tokenizing | 2025-05-11 23:02:07 +02:00
turboderp | 81a0a7d240 | Merge pull request #35 from gakada/humaneval (humaneval.py: fix top_k type, remove rep_p, add qwen3) | 2025-05-11 20:47:03 +02:00