Thomas
eaf8ad1041
Update chat.py, include multi-line input support and context clearing through input (#738)
* Update chat.py, include multi-line input support and context clearing
- Enable multi-line input (mli) support through the -mli argument. When using mli, end input with the EOF character (Return, then Ctrl+D on Unix; Return, Ctrl+Z, Return on Windows)
- Allow context clearing outside of amnesia by inputting "clear"
* Adding qwq chat mode, adding the ability to forget thinking context
2025-03-10 15:28:33 +01:00
turboderp
cf7fcd18d2
Fix chat example system prompt
2024-12-18 07:52:09 +01:00
turboderp
48e6306193
Update chat example, prompt formats
2024-11-30 13:31:35 +01:00
turboderp
d37cf7e764
Fix regressions
2024-11-10 13:38:21 +01:00
turboderp
b651f4abab
Add XTC and DRY options to chatbot example.
2024-10-02 00:01:49 +02:00
Sinan
7c7b1993b4
Added draft token count as parameter to chat.py (#635)
2024-09-24 11:16:30 +02:00
turboderp
4117daa546
Cleanup
2024-08-22 12:49:37 +02:00
turboderp
8477da8f8c
Chatbot: Load draft model first
2024-08-15 18:42:37 +02:00
turboderp
b30f796690
TP mode for attn layer, non-paged
2024-08-14 23:41:10 +02:00
turboderp
95e093a2b2
Chatbot: Ignore undefined special tokens
2024-07-03 22:34:34 +02:00
turboderp
8c2132453c
More debug output
2024-07-03 22:04:22 +02:00
turboderp
f3596fc0d9
Add Q6 cache mode
2024-06-09 01:23:50 +02:00
turboderp
f6abbba183
Add Q8 cache option to example chatbot
2024-06-08 22:40:12 +02:00
turboderp
a3d18564ff
Update examples
2024-05-22 22:22:06 +02:00
Thanasis Galianis
6d7d39b155
Update chat.py
2024-03-18 22:38:20 +02:00
Thanasis Galianis
4ca5ca35a6
Added --gpu_split explanation on examples/chat.py
2024-03-18 21:35:01 +02:00
turboderp
65ed844060
Option to limit scratch space for output layer
2024-03-16 13:35:21 +01:00
turboderp
dc6d196b29
ngram option for chat example
2024-03-13 06:11:38 +01:00
turboderp
fc10817a78
ngram option for chat example
2024-03-13 05:39:39 +01:00
turboderp
082a9fe9df
Fix Q4 cache in chat example
2024-03-06 19:13:21 +01:00
turboderp
eb8269726f
Update examples
2024-03-06 02:41:23 +01:00
turboderp
d09f97aedc
Add Q4 option to chat example
2024-03-05 00:29:12 +01:00
turboderp
1de4cdd70b
Add skew sampling
2024-02-25 15:53:31 +01:00
turboderp
69fba75225
Add Gemma prompt format to example chatbot
2024-02-22 14:43:42 +01:00
turboderp
9f8951e63b
More typeable arg shortcut
2024-02-02 15:03:16 +01:00
Alexander Abushady
8461e6fa76
Kalomaze's Quadratic Sampling
Quadratic Sampling
2024-02-01 00:11:44 -05:00
turboderp
8c9a3ecb49
Add dyn temp options to chat example
2024-01-30 17:51:59 +01:00
AlpinDale
a531dea6a0
Merge branch 'turboderp:master' into feat/frequency_presence_pen
2023-12-23 01:42:00 +00:00
AlpinDale
1384eb540a
add frequency and presence penalties
2023-12-21 17:19:47 +00:00
AlpinDale
5131099b5f
add top_a in a few more places
2023-12-21 15:28:34 +00:00
turboderp
5c974259bd
More sensible default sampling parameters
2023-12-03 22:09:41 +01:00
Sinan Akkoyun
81111ee911
Added draft model rope scale
2023-12-03 06:14:49 +00:00
turboderp
a9ebe04b0b
Add amnesia option to chatbot
2023-12-01 19:10:58 +01:00
turboderp
7a783b3824
Update examples (auto GPU split)
2023-10-22 19:32:26 +02:00
turboderp
fb350d76ed
Add 8-bit cache mode to chatbot
2023-10-15 23:16:21 +02:00
turboderp
c2efd2c00c
Apply alpha scaling to draft model when necessary
Collect some metrics on speculative decoding
2023-10-14 22:30:59 +02:00
turboderp
07170069e2
Add option to print timings to chatbot
2023-10-14 00:30:59 +02:00
turboderp
5db5cdfda7
Add draft model option (speculative decoding) to chat example
2023-10-13 23:34:17 +02:00
turboderp
f27ab60d1b
Rework code formatting in chat example
2023-10-08 01:22:31 +02:00
Sinan
fe047c405f
Merge branch 'turboderp:master' into code-chat
2023-10-05 00:19:37 +02:00
turboderp
5dec977006
Refactor chat example, split out prompt formats, add working option for TinyLlama-chat
2023-10-04 23:18:45 +02:00
turboderp
d09a3fa000
Add Orca prompt format to chat example
2023-10-04 01:44:57 +02:00
SinanAkkoyun
2c9b122c12
Fixed Mistral 7B codeblock delim chunking (` + )
2023-10-02 23:31:09 +02:00
turboderp
d3217f0e4c
Refactor code formatting, integrate in chatbot example
2023-10-01 12:51:20 +02:00
turboderp
ba5f6191c8
Add typical setting to chat example.
2023-09-26 19:50:44 +02:00
turboderp
19e164eea2
CodeLlama system prompt
2023-09-09 14:53:02 +02:00
turboderp
4b98d98a5c
Fix bug in 6-bit matrix preproc
2023-09-06 08:47:09 +02:00
turboderp
7964c73241
Add sampling settings as cmdline options to chat example
2023-09-05 14:32:02 +02:00
turboderp
e7b50fedcb
Fix chat example Llama mode (EOS was appended twice)
2023-09-05 14:24:53 +02:00
turboderp
fb0825207f
Fix chat example Llama mode (EOS was appended twice)
2023-09-05 14:22:34 +02:00