turboderp | cedeb616ce | 2024-02-15 20:50:24 +01:00 | Support Qwen2
turboderp | 702dd9740a | 2024-02-15 20:03:47 +01:00 | VRAM optimizations during quant
turboderp | 305982de43 | 2024-01-30 20:22:44 +01:00 | Expand range for quantized parameter search
turboderp | 7d37b50d90 | 2024-01-09 07:12:38 +01:00 | Fix typos
turboderp | e089313afd | 2024-01-09 05:30:15 +01:00 | Reset norm
turboderp | 6e214f59c7 | 2024-01-08 03:40:40 +01:00 | Optimize conversion kernels
turboderp | 02ce583318 | 2023-12-26 00:00:37 +01:00 | Optimize VRAM usage a bit for quantizer
turboderp | 0d63d6479c | 2023-12-13 01:00:11 +01:00 | Rework quantization and optimization
turboderp | 644805adba | 2023-12-02 17:18:53 +01:00 | Reduce VRAM usage when quantizing
turboderp | 714a19ca8f | 2023-11-26 16:53:29 +01:00 | Allow irregular group sizes
turboderp | 02b4e65ba1 | 2023-10-22 17:57:37 +02:00 | Cleanup TODO items
turboderp | 4375e6b535 | 2023-09-20 10:01:38 +02:00 | Catch edge case where torch.cholesky_inverse returns NaN tensor instead of throwing; increase number of attempts at damping before failing; remove enforcement of symmetry (seems never to be relevant)
turboderp | 5c247f93aa | 2023-09-16 19:15:29 +02:00 | More memory tweaks, made swapping states to CPU the default to accommodate quanting 70B on 24GB GPUs
turboderp | 9e55e44bcb | 2023-09-16 15:08:55 +02:00 | Optimize memory usage when quantizing, increase damping factor
turboderp | aee7a28170 | 2023-09-16 05:59:42 +02:00 | Set default damping to .01 in line with GPTQ, increase damping if Hessian is not PD, disable removal of "dead" weights
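The damping recovery that commits 4375e6b535 and aee7a28170 describe can be sketched roughly as follows. This is a hypothetical NumPy illustration, not the converter's actual code (the real implementation works on GPU tensors via torch.cholesky_inverse): start from a GPTQ-style damping factor of 0.01, retry with a larger factor when the Hessian is not positive definite, and also treat a NaN result as a failed attempt rather than trusting it.

```python
import numpy as np

def damped_inverse(H, damp=0.01, max_attempts=10):
    # Hypothetical sketch of the converter's recovery loop:
    # add damp * mean(diag(H)) to the diagonal, try a Cholesky
    # factorization, and keep increasing the damping factor while
    # H is not positive definite or the inverse contains NaNs.
    diag_mean = np.mean(np.diag(H))
    I = np.eye(H.shape[0])
    for _ in range(max_attempts):
        try:
            L = np.linalg.cholesky(H + damp * diag_mean * I)
            Linv = np.linalg.inv(L)
            Hinv = Linv.T @ Linv          # (L L^T)^-1 = L^-T L^-1
            if not np.isnan(Hinv).any():
                return Hinv
        except np.linalg.LinAlgError:
            pass                          # not PD: fall through and retry
        damp *= 10.0                      # strengthen damping and retry
    raise RuntimeError("Hessian could not be inverted")
```

Note the explicit NaN check: the commit message points out that torch.cholesky_inverse can return a NaN tensor on near-singular inputs instead of raising, so an exception handler alone is not enough.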
turboderp | c5c90a8b4b | 2023-09-11 07:31:43 +02:00 | Clean up imports
turboderp | 5dc32f0f8c | 2023-09-10 20:12:15 +02:00 | Fix padding for head layer when vocab is extended
turboderp | 5d798a178a | 2023-09-09 14:54:23 +02:00 | Cleaning up converter
turboderp | bb83469574 | 2023-08-30 11:05:23 +02:00 | Initial commit