turboderp | 082a9fe9df | Fix Q4 cache in chat example | 2024-03-06 19:13:21 +01:00 |
turboderp | eb8269726f | Update examples | 2024-03-06 02:41:23 +01:00 |
turboderp | d09f97aedc | Add Q4 option to chat example | 2024-03-05 00:29:12 +01:00 |
turboderp | 1de4cdd70b | Add skew sampling | 2024-02-25 15:53:31 +01:00 |
turboderp | 69fba75225 | Add Gemma prompt format to example chatbot | 2024-02-22 14:43:42 +01:00 |
turboderp | 9f8951e63b | More typeable arg shortcut | 2024-02-02 15:03:16 +01:00 |
Alexander Abushady | 8461e6fa76 | Kalomaze's Quadratic Sampling; Quadratic Sampling | 2024-02-01 00:11:44 -05:00 |
turboderp | 8c9a3ecb49 | Add dyn temp options to chat example | 2024-01-30 17:51:59 +01:00 |
AlpinDale | a531dea6a0 | Merge branch 'turboderp:master' into feat/frequency_presence_pen | 2023-12-23 01:42:00 +00:00 |
AlpinDale | 1384eb540a | add frequency and presence penalties | 2023-12-21 17:19:47 +00:00 |
AlpinDale | 5131099b5f | add top_a in a few more places | 2023-12-21 15:28:34 +00:00 |
turboderp | 5c974259bd | More sensible defaults sampling parameters | 2023-12-03 22:09:41 +01:00 |
Sinan Akkoyun | 81111ee911 | Added draft model rope scale | 2023-12-03 06:14:49 +00:00 |
turboderp | a9ebe04b0b | Add amnesia option to chatbot | 2023-12-01 19:10:58 +01:00 |
turboderp | 7a783b3824 | Update examples (auto GPU split) | 2023-10-22 19:32:26 +02:00 |
turboderp | fb350d76ed | Add 8-bit cache mode to chatbot | 2023-10-15 23:16:21 +02:00 |
turboderp | c2efd2c00c | Apply alpha scaling to draft model when necessary; Collect some metrics on speculative decoding | 2023-10-14 22:30:59 +02:00 |
turboderp | 07170069e2 | Add option to print timings to chatbot | 2023-10-14 00:30:59 +02:00 |
turboderp | 5db5cdfda7 | Add draft model option (speculative decoding) to chat example | 2023-10-13 23:34:17 +02:00 |
turboderp | f27ab60d1b | Rework code formatting in chat example | 2023-10-08 01:22:31 +02:00 |
Sinan | fe047c405f | Merge branch 'turboderp:master' into code-chat | 2023-10-05 00:19:37 +02:00 |
turboderp | 5dec977006 | Refactor chat example, split out prompt formats, add working option for TinyLlama-chat | 2023-10-04 23:18:45 +02:00 |
turboderp | d09a3fa000 | Add Orca prompt format to chat example | 2023-10-04 01:44:57 +02:00 |
SinanAkkoyun | 2c9b122c12 | Fixed Mistral 7B codeblock delim chunking (` + ) | 2023-10-02 23:31:09 +02:00 |
turboderp | d3217f0e4c | Refactor code formatting, integrate in chatbot example | 2023-10-01 12:51:20 +02:00 |
turboderp | ba5f6191c8 | Add typical setting to chat example. | 2023-09-26 19:50:44 +02:00 |
turboderp | 19e164eea2 | CodeLlama system prompt | 2023-09-09 14:53:02 +02:00 |
turboderp | 4b98d98a5c | Fix bug in 6-bit matrix preproc | 2023-09-06 08:47:09 +02:00 |
turboderp | 7964c73241 | Add sampling settings as cmdline options to chat example | 2023-09-05 14:32:02 +02:00 |
turboderp | e7b50fedcb | Fix chat example Llama mode (EOS was appended twice) | 2023-09-05 14:24:53 +02:00 |
turboderp | fb0825207f | Fix chat example Llama mode (EOS was appended twice) | 2023-09-05 14:22:34 +02:00 |
turboderp | 3c80d41234 | Add 4-bit GPTQ support | 2023-09-05 14:03:51 +02:00 |
turboderp | bb83469574 | Initial commit | 2023-08-30 11:05:23 +02:00 |