Commit Graph

33 Commits

Author SHA1 Message Date
turboderp
082a9fe9df Fix Q4 cache in chat example 2024-03-06 19:13:21 +01:00
turboderp
eb8269726f Update examples 2024-03-06 02:41:23 +01:00
turboderp
d09f97aedc Add Q4 option to chat example 2024-03-05 00:29:12 +01:00
turboderp
1de4cdd70b Add skew sampling 2024-02-25 15:53:31 +01:00
turboderp
69fba75225 Add Gemma prompt format to example chatbot 2024-02-22 14:43:42 +01:00
turboderp
9f8951e63b More typeable arg shortcut 2024-02-02 15:03:16 +01:00
Alexander Abushady
8461e6fa76 Kalomaze's Quadratic Sampling (Quadratic Sampling) 2024-02-01 00:11:44 -05:00
turboderp
8c9a3ecb49 Add dyn temp options to chat example 2024-01-30 17:51:59 +01:00
AlpinDale
a531dea6a0 Merge branch 'turboderp:master' into feat/frequency_presence_pen 2023-12-23 01:42:00 +00:00
AlpinDale
1384eb540a add frequency and presence penalties 2023-12-21 17:19:47 +00:00
AlpinDale
5131099b5f add top_a in a few more places 2023-12-21 15:28:34 +00:00
turboderp
5c974259bd More sensible defaults sampling parameters 2023-12-03 22:09:41 +01:00
Sinan Akkoyun
81111ee911 Added draft model rope scale 2023-12-03 06:14:49 +00:00
turboderp
a9ebe04b0b Add amnesia option to chatbot 2023-12-01 19:10:58 +01:00
turboderp
7a783b3824 Update examples (auto GPU split) 2023-10-22 19:32:26 +02:00
turboderp
fb350d76ed Add 8-bit cache mode to chatbot 2023-10-15 23:16:21 +02:00
turboderp
c2efd2c00c Apply alpha scaling to draft model when necessary; collect some metrics on speculative decoding 2023-10-14 22:30:59 +02:00
turboderp
07170069e2 Add option to print timings to chatbot 2023-10-14 00:30:59 +02:00
turboderp
5db5cdfda7 Add draft model option (speculative decoding) to chat example 2023-10-13 23:34:17 +02:00
turboderp
f27ab60d1b Rework code formatting in chat example 2023-10-08 01:22:31 +02:00
Sinan
fe047c405f Merge branch 'turboderp:master' into code-chat 2023-10-05 00:19:37 +02:00
turboderp
5dec977006 Refactor chat example, split out prompt formats, add working option for TinyLlama-chat 2023-10-04 23:18:45 +02:00
turboderp
d09a3fa000 Add Orca prompt format to chat example 2023-10-04 01:44:57 +02:00
SinanAkkoyun
2c9b122c12 Fixed Mistral 7B codeblock delim chunking (` + ) 2023-10-02 23:31:09 +02:00
turboderp
d3217f0e4c Refactor code formatting, integrate in chatbot example 2023-10-01 12:51:20 +02:00
turboderp
ba5f6191c8 Add typical setting to chat example. 2023-09-26 19:50:44 +02:00
turboderp
19e164eea2 CodeLlama system prompt 2023-09-09 14:53:02 +02:00
turboderp
4b98d98a5c Fix bug in 6-bit matrix preproc 2023-09-06 08:47:09 +02:00
turboderp
7964c73241 Add sampling settings as cmdline options to chat example 2023-09-05 14:32:02 +02:00
turboderp
e7b50fedcb Fix chat example Llama mode (EOS was appended twice) 2023-09-05 14:24:53 +02:00
turboderp
fb0825207f Fix chat example Llama mode (EOS was appended twice) 2023-09-05 14:22:34 +02:00
turboderp
3c80d41234 Add 4-bit GPTQ support 2023-09-05 14:03:51 +02:00
turboderp
bb83469574 Initial commit 2023-08-30 11:05:23 +02:00