Commit Graph

14 Commits

Author | SHA1 | Message | Date
turboderp | 99b19ec5f1 | Cleanup examples a bit | 2024-01-20 10:57:16 +01:00
turboderp | 41b15dd1c3 | Refactor to consolidate attn params | 2024-01-04 04:52:49 +01:00
AlpinDale | 5131099b5f | add top_a in a few more places | 2023-12-21 15:28:34 +00:00
turboderp | 5c974259bd | More sensible defaults sampling parameters | 2023-12-03 22:09:41 +01:00
turboderp | dfd0bcf888 | Revert example | 2023-11-22 07:23:43 +01:00
turboderp | 5886047b15 | Don't update setuptools | 2023-11-22 07:07:48 +01:00
turboderp | 7a783b3824 | Update examples (auto GPU split) | 2023-10-22 19:32:26 +02:00
turboderp | c136b2284c | Add token healing | 2023-09-29 22:33:51 +02:00
Jeff Kerr | c221ec3630 | add comment on model.load() usage | 2023-09-13 11:25:49 -04:00
turboderp | b4afc666dd | Clean up examples | 2023-09-10 14:16:42 +02:00
turboderp | f79e16c5d0 | Optimization, wider loads in EXL2 kernel (int4) | 2023-09-07 10:56:43 +02:00
turboderp | f259fafda9 | Optimization, wider loads in GPTQ kernel (int2) | 2023-09-07 03:03:02 +02:00
turboderp | 6d576b3e56 | Reworking attention, allow for batched inference with independent cache per sequence | 2023-09-03 15:56:38 +02:00
turboderp | bb83469574 | Initial commit | 2023-08-30 11:05:23 +02:00