turboderp | 5dec977006 | Refactor chat example, split out prompt formats, add working option for TinyLlama-chat | 2023-10-04 23:18:45 +02:00
turboderp | d09a3fa000 | Add Orca prompt format to chat example | 2023-10-04 01:44:57 +02:00
turboderp | d3217f0e4c | Refactor code formatting, integrate in chatbot example | 2023-10-01 12:51:20 +02:00
turboderp | 51a0104bba | WebSocket server (WIP) | 2023-09-30 23:52:11 +02:00
turboderp | 0961876eb2 | Merge pull request #71 from SinanAkkoyun/code-chat: Code highlighting in chat CLI | 2023-09-29 23:31:40 +02:00
turboderp | c136b2284c | Add token healing | 2023-09-29 22:33:51 +02:00
Sinan Akkoyun | fa23466f68 | Really fixed the codeblock lang problem lol | 2023-09-29 16:25:38 +02:00
Sinan Akkoyun | 4f6f37c4a4 | Removed lang after ``` in output | 2023-09-29 16:17:16 +02:00
Sinan Akkoyun | 2a43d3069d | Added codeblock highlighting to chatcode.py | 2023-09-29 15:57:28 +02:00
turboderp | ba5f6191c8 | Add typical setting to chat example. | 2023-09-26 19:50:44 +02:00
Jeff Kerr | c221ec3630 | add comment on model.load() usage | 2023-09-13 11:25:49 -04:00
turboderp | c5c90a8b4b | Clean up imports | 2023-09-11 07:31:43 +02:00
turboderp | b4afc666dd | Clean up examples | 2023-09-10 14:16:42 +02:00
turboderp | 10899838ea | Add speculative generator and example | 2023-09-10 06:22:27 +02:00
turboderp | 19e164eea2 | CodeLlama system prompt | 2023-09-09 14:53:02 +02:00
turboderp | f79e16c5d0 | Optimization, wider loads in EXL2 kernel (int4) | 2023-09-07 10:56:43 +02:00
turboderp | f259fafda9 | Optimization, wider loads in GPTQ kernel (int2) | 2023-09-07 03:03:02 +02:00
turboderp | 4b98d98a5c | Fix bug in 6-bit matrix preproc | 2023-09-06 08:47:09 +02:00
turboderp | 7964c73241 | Add sampling settings as cmdline options to chat example | 2023-09-05 14:32:02 +02:00
turboderp | e7b50fedcb | Fix chat example Llama mode (EOS was appended twice) | 2023-09-05 14:24:53 +02:00
turboderp | fb0825207f | Fix chat example Llama mode (EOS was appended twice) | 2023-09-05 14:22:34 +02:00
turboderp | 3c80d41234 | Add 4-bit GPTQ support | 2023-09-05 14:03:51 +02:00
turboderp | 6d576b3e56 | Reworking attention, allow for batched inference with independent cache per sequence | 2023-09-03 15:56:38 +02:00
turboderp | 4570f6ee17 | Tidying up | 2023-09-02 16:40:57 +02:00
turboderp | bb83469574 | Initial commit | 2023-08-30 11:05:23 +02:00