Commit Graph

570 Commits

Author SHA1 Message Date
turboderp
23fc4737ae Fast safetensors mode with direct IO and pinned buffer 2024-01-18 20:11:53 +01:00
turboderp
ed3067fee1 Fast safetensors load functions, experimental (not used yet) 2024-01-18 11:02:33 +01:00
turboderp
48b3211d9c Fix for #281 2024-01-17 06:38:52 +01:00
awtrisk
9caa310c94 Merge branch 'turboderp:master' into dynatemp-test 2024-01-15 19:15:26 +05:30
turboderp
10f62c270b Bit of cleanup 2024-01-11 09:03:32 +01:00
turboderp
699d127011 Add metadata when converting .bin to .safetensors 2024-01-10 07:05:07 +01:00
turboderp
7d37b50d90 Fix typos 2024-01-09 07:12:38 +01:00
turboderp
e089313afd Reset norm 2024-01-09 05:30:15 +01:00
turboderp
885c641959 Merge remote-tracking branch 'origin/master' 2024-01-09 05:26:40 +01:00
turboderp
f1cd956aac Merge pull request #264 from josephrocca/patch-1
Fix case where there are no disallowed tokens in `websocket_actions.py`
2024-01-09 05:17:38 +01:00
turboderp
6e214f59c7 Optimize conversion kernels 2024-01-08 03:40:40 +01:00
turboderp
3175f4728d Drop superfluous intermediate states 2024-01-07 15:54:21 +01:00
josephrocca
38bdcfc740 Fix case where there are no disallowed tokens 2024-01-07 20:36:22 +08:00
awtrisk
3b332a8db6 force dynatemp, add basic io header 2024-01-06 14:26:57 +05:30
turboderp
024080186f Util functions for rank-reduce experiment 2024-01-06 09:52:59 +01:00
turboderp
fc1629d209 Increase default VRAM reserve in autosplit slightly 2024-01-06 09:52:30 +01:00
awtrisk
797054b4da Debug statements // to be removed 2024-01-06 14:16:59 +05:30
awtrisk
473efa42eb Add dynatemp support to post_softmax_temperature in sampling.h 2024-01-06 14:14:43 +05:30
awtrisk
6583875f5c add wip dynatemp functionality 2024-01-06 14:07:19 +05:30
turboderp
e1010218a7 Reduce chunk size to reduce likelihood of OoM during ppl test 2024-01-06 07:48:16 +01:00
turboderp
3b0f5230e9 Update model_diff.py to use new attn params 2024-01-06 05:31:34 +01:00
turboderp
26ffee3a20 Merge pull request #259 from bdashore3/qkv-removal
Loras: Remove qkv assertion
2024-01-05 03:17:48 +01:00
kingbri
b9f7f03412 Loras: Remove qkv assertion
QKV embeddings no longer exist in config, so this assertion will
always fire due to config having QKV as None.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-01-04 14:37:25 -05:00
turboderp
41b15dd1c3 Refactor to consolidate attn params 2024-01-04 04:52:49 +01:00
turboderp
f2e7648d98 Fix sin/cos table precalc when alpha/scale == None 2024-01-04 04:51:47 +01:00
turboderp
83b0c2ac3f Remove QKV embeddings 2024-01-02 03:07:12 +01:00
turboderp
45d0ddd402 Fix batch sample 2024-01-02 02:41:50 +01:00
turboderp
66d19b6aa9 CFG support in streaming gen 2024-01-01 23:48:24 +01:00
turboderp
13fe676ac2 Link to bartowski repos 2023-12-31 22:04:33 +01:00
turboderp
4ca4476007 Merge pull request #218 from SinanAkkoyun/safetensor-update
Fixed multi file and wildcard args
2023-12-31 21:28:30 +01:00
turboderp
addab083b6 Merge pull request #251 from eramax/patch-1
add openchat prompt format
2023-12-31 21:27:31 +01:00
turboderp
bdc57362a7 Add minimal chat example 2023-12-31 03:40:56 +01:00
turboderp
5ddf57f945 Fix regular ppl test 2023-12-30 22:14:43 +01:00
turboderp
4d5ef3b53d Attempt to add standard ppl test (experimental) 2023-12-30 01:39:20 +01:00
turboderp
a52d410d4a Attempt to add standard ppl test (experimental) 2023-12-30 01:39:03 +01:00
turboderp
e4d4713757 Allow max_seq_len < max_input_len in load_autosplit 2023-12-30 00:54:57 +01:00
turboderp
4e197f4220 Add ability to load Axolotl checkpoints 2023-12-29 23:20:54 +01:00
Ahmed Morsi
cf92bcb7ee add openchat prompt format 2023-12-29 10:59:34 -08:00
turboderp
970af13551 Fix rope_scale display in convert.py 2023-12-29 00:40:47 +01:00
turboderp
47df040fce Read RoPE linear scale from model config 2023-12-28 23:47:56 +01:00
turboderp
d36077cf92 Fix converter 2023-12-28 10:11:45 +01:00
turboderp
19ac52deab Set default number of experts to None 2023-12-28 06:09:04 +01:00
turboderp
f4fe920a50 Reset snapshot interval 2023-12-27 17:23:58 +01:00
turboderp
02ce583318 Optimize VRAM usage a bit for quantizer 2023-12-26 00:00:37 +01:00
turboderp
7a21396854 Merge branch 'feat/frequency_presence_pen'
# Conflicts:
#	exllamav2/generator/sampler.py
2023-12-25 18:05:02 +01:00
turboderp
dc474c9193 Combine rep_penalty, frequency_penalty and presence_penalty in one function 2023-12-25 18:02:40 +01:00
turboderp
1b02df3d2f Fix default freq_pen and rep_pen 2023-12-25 17:50:02 +01:00
turboderp
f0c516d7c0 Slight tweaks to SD 2023-12-25 16:54:13 +01:00
turboderp
5135f32dfa Merge pull request #167 from MilesQLi/master
Update base.py to remove useless statement
2023-12-24 17:00:06 +01:00
Ivan Sanchez
f8afef97f7 return token probabilities in generator made optional. Change generator examples back to default case 2023-12-24 11:17:29 +00:00