turboderp
23fc4737ae
Fast safetensors mode with direct IO and pinned buffer
2024-01-18 20:11:53 +01:00
turboderp
ed3067fee1
Fast safetensors load functions, experimental (not used yet)
2024-01-18 11:02:33 +01:00
turboderp
48b3211d9c
Fix for #281
2024-01-17 06:38:52 +01:00
awtrisk
9caa310c94
Merge branch 'turboderp:master' into dynatemp-test
2024-01-15 19:15:26 +05:30
turboderp
10f62c270b
Bit of cleanup
2024-01-11 09:03:32 +01:00
turboderp
699d127011
Add metadata when converting .bin to .safetensors
2024-01-10 07:05:07 +01:00
turboderp
7d37b50d90
Fix typos
2024-01-09 07:12:38 +01:00
turboderp
e089313afd
Reset norm
2024-01-09 05:30:15 +01:00
turboderp
885c641959
Merge remote-tracking branch 'origin/master'
2024-01-09 05:26:40 +01:00
turboderp
f1cd956aac
Merge pull request #264 from josephrocca/patch-1
...
Fix case where there are no disallowed tokens in `websocket_actions.py`
2024-01-09 05:17:38 +01:00
turboderp
6e214f59c7
Optimize conversion kernels
2024-01-08 03:40:40 +01:00
turboderp
3175f4728d
Drop superfluous intermediate states
2024-01-07 15:54:21 +01:00
josephrocca
38bdcfc740
Fix case where there are no disallowed tokens
2024-01-07 20:36:22 +08:00
awtrisk
3b332a8db6
force dynatemp, add basic io header
2024-01-06 14:26:57 +05:30
turboderp
024080186f
Util functions for rank-reduce experiment
2024-01-06 09:52:59 +01:00
turboderp
fc1629d209
Increase default VRAM reserve in autosplit slightly
2024-01-06 09:52:30 +01:00
awtrisk
797054b4da
Debug statements // to be removed
2024-01-06 14:16:59 +05:30
awtrisk
473efa42eb
Add dynatemp support to post_softmax_temperature in sampling.h
2024-01-06 14:14:43 +05:30
awtrisk
6583875f5c
add wip dynatemp functionality
2024-01-06 14:07:19 +05:30
turboderp
e1010218a7
Reduce chunk size to reduce likelihood of OoM during ppl test
2024-01-06 07:48:16 +01:00
turboderp
3b0f5230e9
Update model_diff.py to use new attn params
2024-01-06 05:31:34 +01:00
turboderp
26ffee3a20
Merge pull request #259 from bdashore3/qkv-removal
...
Loras: Remove qkv assertion
2024-01-05 03:17:48 +01:00
kingbri
b9f7f03412
Loras: Remove qkv assertion
...
QKV embeddings no longer exist in config, so this assertion will
always fire due to config having QKV as None.
Signed-off-by: kingbri <bdashore3@proton.me>
2024-01-04 14:37:25 -05:00
turboderp
41b15dd1c3
Refactor to consolidate attn params
2024-01-04 04:52:49 +01:00
turboderp
f2e7648d98
Fix sin/cos table precalc when alpha/scale == None
2024-01-04 04:51:47 +01:00
turboderp
83b0c2ac3f
Remove QKV embeddings
2024-01-02 03:07:12 +01:00
turboderp
45d0ddd402
Fix batch sample
2024-01-02 02:41:50 +01:00
turboderp
66d19b6aa9
CFG support in streaming gen
2024-01-01 23:48:24 +01:00
turboderp
13fe676ac2
Link to bartowski repos
2023-12-31 22:04:33 +01:00
turboderp
4ca4476007
Merge pull request #218 from SinanAkkoyun/safetensor-update
...
Fixed multi file and wildcard args
2023-12-31 21:28:30 +01:00
turboderp
addab083b6
Merge pull request #251 from eramax/patch-1
...
add openchat prompt format
2023-12-31 21:27:31 +01:00
turboderp
bdc57362a7
Add minimal chat example
2023-12-31 03:40:56 +01:00
turboderp
5ddf57f945
Fix regular ppl test
2023-12-30 22:14:43 +01:00
turboderp
4d5ef3b53d
Attempt to add standard ppl test (experimental)
2023-12-30 01:39:20 +01:00
turboderp
a52d410d4a
Attempt to add standard ppl test (experimental)
2023-12-30 01:39:03 +01:00
turboderp
e4d4713757
Allow max_seq_len < max_input_len in load_autosplit
2023-12-30 00:54:57 +01:00
turboderp
4e197f4220
Add ability to load Axolotl checkpoints
2023-12-29 23:20:54 +01:00
Ahmed Morsi
cf92bcb7ee
add openchat prompt format
2023-12-29 10:59:34 -08:00
turboderp
970af13551
Fix rope_scale display in convert.py
2023-12-29 00:40:47 +01:00
turboderp
47df040fce
Read RoPE linear scale from model config
2023-12-28 23:47:56 +01:00
turboderp
d36077cf92
Fix converter
2023-12-28 10:11:45 +01:00
turboderp
19ac52deab
Set default number of experts to None
2023-12-28 06:09:04 +01:00
turboderp
f4fe920a50
Reset snapshot interval
2023-12-27 17:23:58 +01:00
turboderp
02ce583318
Optimize VRAM usage a bit for quantizer
2023-12-26 00:00:37 +01:00
turboderp
7a21396854
Merge branch 'feat/frequency_presence_pen'
...
# Conflicts:
# exllamav2/generator/sampler.py
2023-12-25 18:05:02 +01:00
turboderp
dc474c9193
Combine rep_penalty, frequency_penalty and presence_penalty in one function
2023-12-25 18:02:40 +01:00
turboderp
1b02df3d2f
Fix default freq_pen and rep_pen
2023-12-25 17:50:02 +01:00
turboderp
f0c516d7c0
Slight tweaks to SD
2023-12-25 16:54:13 +01:00
turboderp
5135f32dfa
Merge pull request #167 from MilesQLi/master
...
Update base.py to remove useless statement
2023-12-24 17:00:06 +01:00
Ivan Sanchez
f8afef97f7
return token probabilities in generator made optional. Change generator examples back to default case
2023-12-24 11:17:29 +00:00