| Author | Commit | Message | Date |
|---|---|---|---|
| turboderp | 157fcfb5b9 | Add license | 2023-09-12 06:52:09 +02:00 |
| turboderp | 8e3cd01889 | Update README.md | 2023-09-12 06:52:04 +02:00 |
| turboderp | 546c91482e | Add FUNDING.yml | 2023-09-12 06:42:55 +02:00 |
| turboderp | c240eb0b70 | Update README.md | 2023-09-12 06:41:36 +02:00 |
| turboderp | 7704a6877b | Fix VRAM usage estimate for linear layer spanning multiple shards | 2023-09-11 19:20:24 +02:00 |
| turboderp | c5c90a8b4b | Clean up imports | 2023-09-11 07:31:43 +02:00 |
| turboderp | 27bccf02d3 | Clean up Torch extension script | 2023-09-11 07:21:55 +02:00 |
| turboderp | 6e14f8802b | Unsharding utility | 2023-09-11 04:16:42 +02:00 |
| turboderp | 49c8d9e51d | Fix load quant tensors that span multiple shards | 2023-09-11 04:16:17 +02:00 |
| turboderp | 5dc32f0f8c | Fix padding for head layer when vocab is extended | 2023-09-10 20:12:15 +02:00 |
| turboderp | 8ca9a1896d | Add sharding utility | 2023-09-10 18:59:05 +02:00 |
| turboderp | 8f9617dd5d | Fix image link | 2023-09-10 18:58:54 +02:00 |
| turboderp | f3ef397656 | Add README.md | 2023-09-10 15:42:38 +02:00 |
| turboderp | ddaf503e98 | Add README.md | 2023-09-10 14:16:53 +02:00 |
| turboderp | b4afc666dd | Clean up examples | 2023-09-10 14:16:42 +02:00 |
| turboderp | c0ade31bfe | Fix typo | 2023-09-10 10:00:52 +02:00 |
| turboderp | b389b474eb | Add ninja requirement | 2023-09-10 10:00:27 +02:00 |
| turboderp | 2617b6c012 | Setuptools script | 2023-09-10 09:02:05 +02:00 |
| turboderp | 0ec776f53e | Add requirements.txt | 2023-09-10 08:05:45 +02:00 |
| turboderp | 10899838ea | Add speculative generator and example | 2023-09-10 06:22:27 +02:00 |
| turboderp | 48f0db78b2 | Improved VRAM predictions | 2023-09-10 06:17:44 +02:00 |
| turboderp | 918368b295 | 34B testing | 2023-09-10 06:15:33 +02:00 |
| turboderp | 6046dcf39a | Util functions for mem debugging | 2023-09-10 06:14:51 +02:00 |
| turboderp | 5d798a178a | Cleaning up converter | 2023-09-09 14:54:23 +02:00 |
| turboderp | 952c67c4ff | Update defaults for convert script | 2023-09-09 14:53:52 +02:00 |
| turboderp | 19e164eea2 | CodeLlama system prompt | 2023-09-09 14:53:02 +02:00 |
| turboderp | 18fe5d5a5a | Forward pass chunking adapted from V1 | 2023-09-08 08:15:14 +02:00 |
| turboderp | 0af5c8a413 | Forward pass chunking adapted from V1 | 2023-09-08 08:11:40 +02:00 |
| turboderp | d00c03ea69 | Optimization: write K/V directly into cache when possible | 2023-09-08 07:53:01 +02:00 |
| turboderp | f9dc978e01 | Fix and test fallback matmul mode | 2023-09-07 18:15:42 +02:00 |
| turboderp | f79e16c5d0 | Optimization, wider loads in EXL2 kernel (int4) | 2023-09-07 10:56:43 +02:00 |
| turboderp | 1075b7514f | Optimization, wider loads in GPTQ kernel (int4) | 2023-09-07 04:26:45 +02:00 |
| turboderp | c2f62e1f1f | Optimization, wider loads in GPTQ kernel (int2) working | 2023-09-07 04:07:13 +02:00 |
| turboderp | f259fafda9 | Optimization, wider loads in GPTQ kernel (int2) | 2023-09-07 03:03:02 +02:00 |
| turboderp | a0cb4355c3 | Fix regression in EXL2 convert | 2023-09-06 08:47:38 +02:00 |
| turboderp | 4b98d98a5c | Fix bug in 6-bit matrix preproc | 2023-09-06 08:47:09 +02:00 |
| turboderp | 7964c73241 | Add sampling settings as cmdline options to chat example | 2023-09-05 14:32:02 +02:00 |
| turboderp | e7b50fedcb | Fix chat example Llama mode (EOS was appended twice) | 2023-09-05 14:24:53 +02:00 |
| turboderp | fb0825207f | Fix chat example Llama mode (EOS was appended twice) | 2023-09-05 14:22:34 +02:00 |
| turboderp | 3c80d41234 | Add 4-bit GPTQ support | 2023-09-05 14:03:51 +02:00 |
| turboderp | 6d576b3e56 | Reworking attention, allow for batched inference with independent cache per sequence | 2023-09-03 15:56:38 +02:00 |
| turboderp | 4570f6ee17 | Tidying up | 2023-09-02 16:40:57 +02:00 |
| turboderp | 2a2cc16119 | More kernel optimizin | 2023-09-02 13:29:43 +02:00 |
| turboderp | 92ce76dec1 | Kernel optimizations WIP | 2023-09-02 05:37:00 +02:00 |
| turboderp | c5cf3956dc | Add speed test | 2023-09-01 12:03:00 +02:00 |
| turboderp | a386102ac6 | Improve prediction of VRAM usage when loading model | 2023-09-01 10:47:29 +02:00 |
| turboderp | 176dbc43ad | CodeLlama rope_theta_support | 2023-09-01 09:26:00 +02:00 |
| turboderp | bb83469574 | Initial commit | 2023-08-30 11:05:23 +02:00 |
| turboderp | 9cc802c11a | Test commit | 2023-08-30 11:03:34 +02:00 |
| turboderp | 03fb9db2e0 | Initial commit | 2023-08-30 10:54:23 +02:00 |