99b19ec5f1  2024-01-20 10:57:16 +01:00  turboderp  Cleanup examples a bit
41b15dd1c3  2024-01-04 04:52:49 +01:00  turboderp  Refactor to consolidate attn params
5131099b5f  2023-12-21 15:28:34 +00:00  AlpinDale  add top_a in a few more places
5c974259bd  2023-12-03 22:09:41 +01:00  turboderp  More sensible defaults sampling parameters
dfd0bcf888  2023-11-22 07:23:43 +01:00  turboderp  Revert example
5886047b15  2023-11-22 07:07:48 +01:00  turboderp  Don't update setuptools
7a783b3824  2023-10-22 19:32:26 +02:00  turboderp  Update examples (auto GPU split)
c136b2284c  2023-09-29 22:33:51 +02:00  turboderp  Add token healing
c221ec3630  2023-09-13 11:25:49 -04:00  Jeff Kerr  add comment on model.load() usage
b4afc666dd  2023-09-10 14:16:42 +02:00  turboderp  Clean up examples
f79e16c5d0  2023-09-07 10:56:43 +02:00  turboderp  Optimization, wider loads in EXL2 kernel (int4)
f259fafda9  2023-09-07 03:03:02 +02:00  turboderp  Optimization, wider loads in GPTQ kernel (int2)
6d576b3e56  2023-09-03 15:56:38 +02:00  turboderp  Reworking attention, allow for batched inference with independent cache per sequence
bb83469574  2023-08-30 11:05:23 +02:00  turboderp  Initial commit