Commit Graph

12 Commits

Author SHA1 Message Date
layerdiffusion 2f1d04759f avoid some mysterious problems when using lots of Python local delegations 2024-08-19 09:47:04 -07:00
layerdiffusion d38e560e42 Implement some rethinking of the LoRA system 2024-08-19 04:31:59 -07:00
1. Add an option to let users run the UNet in fp8/gguf while keeping LoRAs in fp16.
2. FP16 LoRAs no longer need patching; others are re-patched only when the LoRA weight changes.
3. FP8 UNet + fp16 LoRA now works in Forge (and, for now, essentially only in Forge). This also resolves some “LoRA too subtle” problems.
4. Significantly speed up all gguf models (in async mode) by using an independent thread (CUDA stream) to compute and dequantize at the same time, even when the low-bit weights are already on the GPU.
5. Treat “online LoRA” as a module similar to ControlLoRA, so it moves to the GPU together with the model when sampling, achieving a significant speedup and clean low-VRAM management at the same time.
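The caching idea in points 2–3 can be sketched in plain Python. This is a minimal illustration, not Forge's actual implementation: all names here are hypothetical, the real code operates on torch tensors, and real fp8/gguf quantization is far more involved than the coarse rounding used below. The point is only the control flow: the low-bit base weight is never modified, and the fp16 LoRA delta is re-applied only when the LoRA strength changes.

```python
# Hypothetical sketch of "patch again only when lora weight changes":
# keep the quantized base weight untouched, cache the patched result,
# and recompute it only when the LoRA strength actually differs.

def quantize(w, scale=0.1):
    """Crude stand-in for fp8/gguf quantization: round onto a coarse grid."""
    return [round(x / scale) for x in w]

def dequantize(q, scale=0.1):
    return [x * scale for x in q]

class PatchedLinear:
    def __init__(self, quantized_weight):
        self.q_weight = quantized_weight  # low-bit base weight, never modified
        self._cache_key = None            # strength used for the cached patch
        self._cached = None               # dequantized base + LoRA delta

    def effective_weight(self, lora_delta, strength):
        # Re-patch only when the LoRA strength changed since last call.
        if self._cache_key != strength:
            base = dequantize(self.q_weight)
            self._cached = [b + strength * d for b, d in zip(base, lora_delta)]
            self._cache_key = strength
        return self._cached
```

Calling `effective_weight` twice with the same strength returns the same cached list; changing the strength triggers exactly one recomputation. In the real system the cached object would be the fp16 weight actually used for sampling.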
layerdiffusion 53cd00d125 revise 2024-08-17 23:03:50 -07:00
layerdiffusion db5a876d4c completely solve all LoRA OOMs 2024-08-17 22:43:20 -07:00
layerdiffusion ab4b0d5b58 fix some memory leaks 2024-08-17 00:19:43 -07:00
layerdiffusion 9973d5dc09 better prints 2024-08-16 21:13:09 -07:00
layerdiffusion f3e211d431 fix bnb LoRA 2024-08-16 21:09:14 -07:00
layerdiffusion 12369669cf only load LoRA once 2024-08-16 02:02:22 -07:00
layerdiffusion a0849953bd revise 2024-08-13 15:13:39 -07:00
layerdiffusion 00f1cd36bd multiple LoRA implementation sources 2024-08-13 07:13:32 -07:00
layerdiffusion be3f0a0039 remove legacy config 2024-08-03 17:04:56 -07:00
layerdiffusion d1b8a2676d rework the LoRA and patching system 2024-08-02 13:45:26 -07:00
Also covers DoRA etc.; the backend rework is about 60% finished.
This also removes the webui’s extremely annoying LoRA filter from model versions.