Colin
008a0bb777
Fix converting files with docker command
2024-02-20 19:41:57 -05:00
turboderp
c8e2bf4594
Fix small mistake in example
2024-02-19 14:20:17 +01:00
turboderp
229019d86e
Add lm-format-enforcer JSON example
2024-02-19 00:56:06 +01:00
turboderp
f194d9d7b0
Add filter_prefer_eos option
2024-02-19 00:14:14 +01:00
turboderp
daf7844d18
Add prefix filter
2024-02-18 23:58:25 +01:00
turboderp
26f4bf8997
Make sure first_token is always set when beginning stream (bugfix)
2024-02-18 22:16:01 +01:00
turboderp
8c3b30dc4b
Fix tokenizer decoding for Qwen
2024-02-16 22:53:24 +01:00
turboderp
7af6494afa
Drop device tensors for head layer during conversion
2024-02-16 17:31:19 +01:00
turboderp
5967a29eb4
Fix architecture detection
2024-02-16 01:52:26 +01:00
turboderp
cedeb616ce
Support Qwen2
2024-02-15 20:50:24 +01:00
turboderp
1bc7c85a27
Disambiguate sampling params
2024-02-15 20:04:47 +01:00
turboderp
702dd9740a
VRAM optimizations during quant
2024-02-15 20:03:47 +01:00
turboderp
75f969a6d3
Disable cudaMallocAsync for post2 release
0.0.13.post2
2024-02-15 00:07:47 +01:00
turboderp
0535783ad3
Bump to 0.0.13.post2
2024-02-14 23:54:59 +01:00
turboderp
3424e70cae
Only change allocator if Torch is not already imported
2024-02-14 23:54:48 +01:00
turboderp
c29f42626e
Use cudaMallocAsync allocator by default
2024-02-14 23:18:07 +01:00
turboderp
69bfbea7b1
Allow autosplit to work with cudaMallocAsync backend
2024-02-14 20:19:28 +01:00
turboderp
9c37d64d74
Remove TODO items
2024-02-14 20:00:10 +01:00
turboderp
b0dc588d9b
Remove return values from load_gen
2024-02-14 19:41:59 +01:00
turboderp
1c67f97f3d
New API for streaming generator
2024-02-11 20:31:58 +01:00
turboderp
944e523109
Merge pull request #324 from flying-x/master
2 minor changes
2024-02-11 10:09:55 +01:00
turboderp
d7eddbaee0
Optimize typical sampling
2024-02-10 20:13:09 +01:00
turboderp
8639c554ab
Option to return presampling probabilities
2024-02-10 16:18:24 +01:00
turboderp
80ed7a5222
Refactor extension
2024-02-10 12:05:34 +01:00
Min Xu
79402c5c7f
Merge remote-tracking branch 'upstream/master'
2024-02-05 13:56:58 -08:00
Min Xu
9b964c4b28
better gitignore
2024-02-05 13:55:23 -08:00
turboderp
825929af7d
Bump to 0.0.13.post1
0.0.13.post1
2024-02-04 21:46:43 +01:00
turboderp
312f400723
Fix vocab padding in generator
2024-02-04 21:30:37 +01:00
Min Xu
8e13598868
minor changes
1. added .so file to the ignored list
2. removed 2 unused imports from test_inference.py, which also
avoided a warning for me that was produced by importing
pandas
2024-02-02 20:19:11 -08:00
turboderp
7feed80cb7
fix typo
v0.0.13
2024-02-02 19:04:14 +01:00
turboderp
efacbc53ca
Use Torch 2.0.1 for cu117
2024-02-02 19:02:01 +01:00
turboderp
f40e8374e9
Bump version to 0.0.13
2024-02-02 18:45:34 +01:00
turboderp
3cd61203c5
Bump Torch to 2.2.0
2024-02-02 18:43:25 +01:00
turboderp
c0ddebaaaf
Update install instructions, remove V1 benchmark
2024-02-02 16:24:19 +01:00
turboderp
480e706342
Merge quad sampling into softmax
Combine temperature and smoothing factor (still separate args to sample_basic)
Allow arbitrary exponent
2024-02-02 15:45:11 +01:00
turboderp
700f5a8921
Merge PR #317
2024-02-02 15:40:09 +01:00
turboderp
b60c34770e
Merge quad sampling into softmax
Combine temperature and smoothing factor (still separate args to sample_basic)
Allow arbitrary exponent
2024-02-02 15:07:43 +01:00
turboderp
9f8951e63b
More typeable arg shortcut
2024-02-02 15:03:16 +01:00
Alexander Abushady
7c7312d116
Fixed softmax error on quad sampling
2024-02-01 21:35:33 -05:00
Alexander Abushady
3ea67828ea
Quadratic Sampling optimizations
2024-02-01 15:01:12 -05:00
turboderp
0e9d9c1010
Prevent tensors passed to save_file from sharing memory
2024-02-01 10:14:36 +01:00
turboderp
439854c1cf
Remove unused members, default value for embedding
2024-02-01 07:06:09 +01:00
Alexander Abushady
8461e6fa76
Kalomaze's Quadratic Sampling
Quadratic Sampling
2024-02-01 00:11:44 -05:00
turboderp
b77028a2eb
Merge PR #296
2024-02-01 06:04:25 +01:00
turboderp
9437f3b3e0
Fix indentation, refactor a bit
2024-02-01 06:02:18 +01:00
turboderp
887719f0fd
Merge remote-tracking branch 'silphendio/silphendio-patch-1'
2024-02-01 05:16:21 +01:00
turboderp
8a0cb9e01d
Add last saved checkpoint to status box
2024-02-01 04:56:33 +01:00
turboderp
4c93ce852f
Fix remaining time estimate
2024-02-01 04:56:00 +01:00
turboderp
735807e800
Use os.replace to swap checkpoint states in measure.py as well
2024-02-01 04:39:34 +01:00
turboderp
a1c8b790f1
Merge branch 'aiconvert'
2024-02-01 04:25:12 +01:00