Commit Graph

13 Commits

Author SHA1 Message Date
layerdiffusion
96f264ec6a add a way to save models 2024-08-19 06:30:49 -07:00
layerdiffusion
d38e560e42 Implement some rethinking about LoRA system
1. Add an option to allow users to run the UNet in fp8/gguf but the LoRA in fp16.
2. FP16 LoRAs no longer need patching. Other LoRAs are patched again only when the LoRA weights change.
3. FP8 UNet + fp16 LoRA is now available in Forge (and, for now, more or less only in Forge). This also solves some “LoRA too subtle” problems.
4. Significantly speed up all gguf models (in Async mode) by using an independent thread (CUDA stream) to compute and dequantize at the same time, even when the low-bit weights are already on the GPU.
5. Treat “online LoRA” as a module similar to ControlLoRA, so that it is moved to the GPU together with the model when sampling, achieving a significant speedup and perfect low-VRAM management simultaneously.
2024-08-19 04:31:59 -07:00
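The difference between patching a LoRA into low-bit weights and applying it "online" in higher precision can be sketched in a few lines. This is a toy illustration, not Forge's implementation: a coarse rounding grid (`STEP`) stands in for fp8/gguf storage, and every name here (`quantize`, `lora_up`, `lora_down`, `scale`) is hypothetical. It shows one plausible reason for the "LoRA too subtle" symptom: a small delta merged into a low-bit weight can be rounded away, while the same delta computed separately at forward time survives.

```python
# Toy sketch only: a coarse rounding grid stands in for fp8/gguf storage.
STEP = 0.25  # hypothetical quantization step, not a real fp8 spacing


def quantize(w):
    """Round every entry to the nearest multiple of STEP."""
    return [[round(v / STEP) * STEP for v in row] for row in w]


def matmul(a, b):
    """Plain matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def matvec(w, x):
    """Matrix-vector product."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def add(a, b):
    """Elementwise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


base = quantize([[0.5, -0.25], [1.0, 0.75]])  # low-bit base weight
lora_up = [[0.1], [0.2]]                      # B, shape 2x1, kept in full precision
lora_down = [[0.3, 0.1]]                      # A, shape 1x2, kept in full precision
scale = 1.0
x = [1.0, 2.0]

# "Patched" path: merge the delta into the weight, then store it low-bit again.
# Every delta entry is smaller than STEP / 2, so rounding discards it.
merged = quantize(add(base, matmul(lora_up, lora_down)))
y_merged = matvec(merged, x)

# "Online" path: leave the base weight untouched and add B(Ax) at forward time.
lora_out = matvec(lora_up, matvec(lora_down, x))
y_online = [b + scale * d for b, d in zip(matvec(base, x), lora_out)]

print("merged:", y_merged)
print("online:", y_online)
```

In this toy setting the merged weights equal the base weights, so the LoRA has no effect at all, whereas the online path shifts the output by the full LoRA contribution. Real fp8 and gguf formats round differently, so the size of the effect in practice will vary.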
layerdiffusion
8a04293430 fix some gguf loras 2024-08-17 01:15:37 -07:00
layerdiffusion
f3e211d431 fix bnb lora 2024-08-16 21:09:14 -07:00
layerdiffusion
616b335fce move file 2024-08-15 05:45:55 -07:00
layerdiffusion
1bd6cf0e0c Support LoRAs for Q8/Q5/Q4 GGUF Models
what a crazy night of math
2024-08-15 05:34:46 -07:00
layerdiffusion
d8b83a9501 gguf preview 2024-08-15 00:03:32 -07:00
layerdiffusion
1a26e73deb revise 2024-08-14 17:25:32 -07:00
layerdiffusion
b09c24ef51 add fp16_fix 2024-08-14 17:10:03 -07:00
lllyasviel
14a759b5ca revise kernel 2024-08-07 13:28:12 -07:00
layerdiffusion
07b2d2ccac clipvision, ipadapter, and misc
backend is 75% finished
2024-08-03 14:18:16 -07:00
layerdiffusion
e722991752 control rework 2024-08-02 22:17:27 -07:00
layerdiffusion
d1b8a2676d rework lora and patching system
and dora etc - backend rework is 60% finished
I also removed the webui’s extremely annoying filter that hides LoRAs based on model version.
2024-08-02 13:45:26 -07:00