Commit Graph

1290 Commits

Author SHA1 Message Date
TerminalMan
d92ff8d9e4 improve installation experience (#666) 2024-11-02 21:11:14 +01:00
Brian Dashore
84b1f9017d Torch 2.5 (#659)
* Actions: Add helpful comments

Useful for updating dependencies when building.

Signed-off-by: kingbri <bdashore3@proton.me>

* Actions: Add torch 2.5 builds

Signed-off-by: kingbri <bdashore3@proton.me>

---------

Signed-off-by: kingbri <bdashore3@proton.me>
2024-11-02 21:09:23 +01:00
turboderp
9cd077dc96 Fix regression 2024-10-20 21:25:05 +02:00
turboderp
a8d8a41dc4 Add multimodal experiment 2024-10-20 21:21:21 +02:00
turboderp
0347b062bf Add indexed embeddings support to dynamic gen 2024-10-20 21:21:07 +02:00
turboderp
1f35150e81 Fix thread-local device setup in safetensors loader, fix for #647 2024-10-15 20:42:18 +02:00
Valeriy Selitskiy
e55c4ad283 feat: try to create out_dir if it does not exist (#654) 2024-10-15 19:30:33 +02:00
turboderp
a40c07a333 Update Formatron example (supports conlist since 0.4.6) 2024-10-15 19:28:00 +02:00
turboderp
acccc930cc Don't yield thread early for background filter evaluation (benchmarks slightly faster in some cases) 2024-10-03 00:00:45 +02:00
turboderp
7bacab2a55 Rename JSON example 2024-10-02 23:59:53 +02:00
turboderp
ed6dc9b7b3 Add some debug functions 2024-10-02 23:58:33 +02:00
turboderp
b651f4abab Add XTC and DRY options to chatbot example. 2024-10-02 00:01:49 +02:00
turboderp
2616fd74d0 Add Formatron example 2024-09-30 00:41:51 +02:00
turboderp
22cbff66cf Add logit masking mode to filters 2024-09-30 00:35:15 +02:00
turboderp
1b580ce15f Sampling: Fix inefficient top-K when most probs are zero 2024-09-30 00:32:59 +02:00
turboderp
03b2d551b2 Bump to v0.2.3 (tag: v0.2.3) 2024-09-29 13:00:18 +02:00
turboderp
cad7848375 HumanEval: Rename new args to match other scripts 2024-09-29 12:57:06 +02:00
turboderp
ef7cdda31c Merge remote-tracking branch 'refs/remotes/LlamaEnjoyer/add_more_args_to_humaneval' into dev 2024-09-29 12:52:52 +02:00
turboderp
5d4359317d Add YaRN factor override to model_init 2024-09-29 12:35:45 +02:00
turboderp
c84f5979c8 Merge branch 'refs/heads/dev-yarn' into dev 2024-09-29 12:21:22 +02:00
turboderp
f1adff9472 Fix multi-token character decoding for Qwen2 (legacy gen) 2024-09-29 00:15:05 +02:00
turboderp
431479207f Fix multi-token character decoding for Qwen2 2024-09-28 23:47:07 +02:00
turboderp
be3de0fa85 Add some code for evaluating FPx (not enabled) 2024-09-28 16:07:39 +02:00
turboderp
d393bfe4a7 Merge remote-tracking branch 'origin/dev' into dev 2024-09-28 16:05:01 +02:00
Downtown-Case
6b73184d4f Use specified max context.py
Instead of original_max_position_embeddings.

This appears to be what transformers intended, and does not update dynamically with sequence length there.
2024-09-27 23:31:53 -04:00
Downtown-Case
8dca1abf44 Only trigger if long context config is set 2024-09-27 19:44:49 -04:00
Downtown-Case
b1955039c6 Pesky space.py 2024-09-27 19:06:05 -04:00
Downtown-Case
aff1e5a547 Add YaRN 2024-09-27 18:57:15 -04:00
Downtown-Case
0d78f034b1 Add YaRN 2024-09-27 18:53:22 -04:00
Llama Enjoyer
b2af0bbad3 Remove stray import. 2024-09-24 17:32:09 +02:00
Llama Enjoyer
3a389131de Add more arguments to accept values passed via the cmd line. 2024-09-24 17:28:02 +02:00
Llama Enjoyer
e960dfd68d Fix the temperature argument to accept values passed via the cmd line. 2024-09-24 17:18:08 +02:00
Sinan
7c7b1993b4 Added draft token count as parameter to chat.py (#635) 2024-09-24 11:16:30 +02:00
turboderp
8361f3f4a0 Add missing cp310+cu118 torch 2.4 windows wheel 2024-09-23 17:38:53 +02:00
turboderp
15e54046ba More stream gymnastics 2024-09-23 17:28:55 +02:00
turboderp
a5132d072e Add XTC sampler 2024-09-22 23:09:19 +02:00
turboderp
6d7b2e8e7a Revert snapshot interval 2024-09-22 19:11:32 +02:00
turboderp
43a0be35df Make measurement less sensitive to very sparse inf values in reference fwd pass 2024-09-22 19:01:29 +02:00
turboderp
a17f6665cb Fix streams in quantizer 2024-09-22 18:15:05 +02:00
turboderp
9946f45f1c Force tensor loading onto priority stream 2024-09-20 22:05:20 +02:00
turboderp
e155e0a5b0 Fix loading in new thread 2024-09-18 19:46:56 +02:00
turboderp
c4a03e09f5 Tokenizer: Give priority to tokenizer.json instead of tokenizer.model 2024-09-18 00:41:43 +02:00
turboderp
12bceb9f4b Cleanup 2024-09-18 00:32:00 +02:00
turboderp
0695f3a854 Fix potential bug in filter evaluation 2024-09-17 00:34:28 +02:00
turboderp
8a25e0f2b3 Merge branch 'refs/heads/master' into dev 2024-09-17 00:33:36 +02:00
turboderp
b25210778c Remove fasttensors, add platform-agnostic multithreaded ST loader 2024-09-17 00:33:16 +02:00
turboderp
144c576bdb Fix bottlenecks in quantized tensor loading 2024-09-17 00:00:27 +02:00
turboderp
10a8842b25 Fix JSON inference example 2024-09-14 21:35:02 +02:00
turboderp
b2c7cf280c Add cp310 cu121 torch2.4 Windows wheel (tag: v0.2.2) 2024-09-14 21:17:52 +02:00
turboderp
46eff43403 Merge branch 'refs/heads/dev' 2024-09-14 21:13:46 +02:00