layerdiffusion
68bf7f85aa
speed up nf4 lora in offline patching mode
2024-08-22 10:35:11 -07:00
layerdiffusion
95d04e5c8f
fix
2024-08-22 10:08:21 -07:00
layerdiffusion
14eac6f2cf
add a way to empty cuda cache on the fly
2024-08-22 10:06:39 -07:00
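The commit above adds a way to free the CUDA allocator cache on the fly; in PyTorch this usually means calling `gc.collect()` followed by `torch.cuda.empty_cache()` between steps. The sketch below is purely illustrative and not Forge's actual API: the hook registry, `free_cuda_cache`, and the step loop are hypothetical stand-ins showing the pattern of running a cache-freeing callback between sampling steps.

```python
import gc

# Hypothetical hook registry: callbacks that run between sampling steps.
_between_step_hooks = []

def register_between_step_hook(fn):
    _between_step_hooks.append(fn)

def free_cuda_cache():
    """Stand-in for the real thing: gc.collect() + torch.cuda.empty_cache()."""
    gc.collect()
    # torch.cuda.empty_cache()  # would run here when torch/CUDA is available
    return "cache emptied"

register_between_step_hook(free_cuda_cache)

def run_sampling(steps):
    """Toy sampling loop that fires the registered hooks after each step."""
    results = []
    for _step in range(steps):
        # ... denoising work for this step would happen here ...
        for hook in _between_step_hooks:
            results.append(hook())
    return results
```

Emptying the cache returns cached allocator blocks to the driver, which trades some allocation speed for lower reported VRAM pressure.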
layerdiffusion
909ad6c734
fix prints
2024-08-21 22:24:54 -07:00
layerdiffusion
0d8eb4c5ba
fix #1375
2024-08-21 11:01:59 -07:00
layerdiffusion
4e3c78178a
[revised] change some dtype behaviors based on community feedback
...
only influences older devices like the GTX 1080/1070/1060/1050.
if you are on one of these devices and previously used many cmd flags to tune performance, please remove those flags
2024-08-21 10:23:38 -07:00
layerdiffusion
1419ef29aa
Revert "change some dtype behaviors based on community feedbacks"
...
This reverts commit 31bed671ac .
2024-08-21 10:10:49 -07:00
layerdiffusion
31bed671ac
change some dtype behaviors based on community feedbacks
...
only influences older devices like the GTX 1080/1070/1060/1050.
if you are on one of these devices and previously used many cmd flags to tune performance, please remove those flags
2024-08-21 08:46:52 -07:00
layerdiffusion
1096c708cc
revise swap module name
2024-08-20 21:18:53 -07:00
layerdiffusion
5452bc6ac3
All Forge Spaces Now Pass 4GB VRAM
...
and they all reproduce the authors' results 100%
2024-08-20 08:01:10 -07:00
layerdiffusion
6f411a4940
fix LoRAs on nf4 models when "loras in fp16" is activated
2024-08-20 01:29:52 -07:00
layerdiffusion
475524496d
revise
2024-08-19 18:54:54 -07:00
layerdiffusion
d7151b4dcd
add low vram warning
2024-08-19 11:08:01 -07:00
layerdiffusion
2f1d04759f
avoid some mysterious problems when using lots of Python local delegations
2024-08-19 09:47:04 -07:00
layerdiffusion
96f264ec6a
add a way to save models
2024-08-19 06:30:49 -07:00
layerdiffusion
d03fc5c2b1
speed up a bit
2024-08-19 05:06:46 -07:00
layerdiffusion
d38e560e42
Implement some rethinking about LoRA system
...
1. Add an option that lets users run the UNet in fp8/gguf but the LoRA in fp16.
2. FP16 LoRAs no longer need patching. Others are only patched again when LoRA weights change.
3. FP8 UNet + fp16 LoRA is now available in Forge (and, for now, somewhat only in Forge). This also solves some “LoRA too subtle” problems.
4. Significantly speed up all gguf models (in Async mode) by using an independent thread (CUDA stream) to compute and dequantize at the same time, even when low-bit weights are already on the GPU.
5. Treat “online lora” as a module similar to ControlLoRA so that it is moved to the GPU together with the model when sampling, achieving significant speedup and perfect low-VRAM management simultaneously.
2024-08-19 04:31:59 -07:00
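Point 4 above overlaps dequantization with computation by running them on an independent thread (CUDA stream). A rough pure-Python sketch of that pipelining idea, where a worker thread dequantizes the next layer while the main thread computes the current one; `dequant`, `compute`, and the layer values are illustrative stand-ins, not Forge's code:

```python
import queue
import threading

def dequant(layer):
    # Stand-in for dequantizing a low-bit (e.g. gguf) weight to compute dtype.
    return layer * 2

def compute(weight):
    # Stand-in for the forward pass that consumes the dequantized weight.
    return weight + 1

def run_pipelined(layers):
    """Dequantize layer i+1 on a worker thread while computing layer i,
    mimicking a separate CUDA stream that dequants ahead of compute."""
    q = queue.Queue(maxsize=2)  # small buffer bounds memory use

    def producer():
        for layer in layers:
            q.put(dequant(layer))   # runs concurrently with compute()
        q.put(None)                 # sentinel: no more layers

    threading.Thread(target=producer, daemon=True).start()
    outputs = []
    while (w := q.get()) is not None:
        outputs.append(compute(w))
    return outputs
```

With real CUDA streams the same shape applies, except the hand-off is a stream-ordered event rather than a queue, so the dequant kernels for the next weight run while the compute stream is still busy.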
layerdiffusion
e5f213c21e
upload some GGUF supports
2024-08-19 01:09:50 -07:00
layerdiffusion
53cd00d125
revise
2024-08-17 23:03:50 -07:00
layerdiffusion
db5a876d4c
completely solve all LoRA OOMs
2024-08-17 22:43:20 -07:00
layerdiffusion
8a04293430
fix some gguf loras
2024-08-17 01:15:37 -07:00
layerdiffusion
ab4b0d5b58
fix some mem leak
2024-08-17 00:19:43 -07:00
layerdiffusion
3da7de418a
fix layerdiffuse
2024-08-16 21:37:25 -07:00
layerdiffusion
9973d5dc09
better prints
2024-08-16 21:13:09 -07:00
layerdiffusion
f3e211d431
fix bnb lora
2024-08-16 21:09:14 -07:00
layerdiffusion
2f0555f7dc
GPU Shared Async Swap for all GGUF/BNB
2024-08-16 08:45:17 -07:00
layerdiffusion
04e7f05769
speed up swap/loading of all quant types
2024-08-16 08:30:11 -07:00
layerdiffusion
394da01959
simplify
2024-08-16 04:55:01 -07:00
layerdiffusion
e36487ffa5
tune
2024-08-16 04:49:25 -07:00
lllyasviel
6e6e5c2162
do some profile on 3090
2024-08-16 04:43:19 -07:00
layerdiffusion
7c0f78e424
reduce cast
2024-08-16 03:59:59 -07:00
layerdiffusion
12369669cf
only load lora one time
2024-08-16 02:02:22 -07:00
layerdiffusion
243952f364
wip qx_1 loras
2024-08-15 17:07:41 -07:00
layerdiffusion
f510f51303
speed up lora patching
2024-08-15 06:51:52 -07:00
layerdiffusion
141cf81c23
sometimes it is not a diffusion model
2024-08-15 06:36:59 -07:00
layerdiffusion
021428da26
fix nf4 lora gives pure noise on some devices
2024-08-15 06:35:15 -07:00
layerdiffusion
3d751eb69f
move file
2024-08-15 05:46:35 -07:00
layerdiffusion
616b335fce
move file
2024-08-15 05:45:55 -07:00
layerdiffusion
1bd6cf0e0c
Support LoRAs for Q8/Q5/Q4 GGUF Models
...
what a crazy night of math
2024-08-15 05:34:46 -07:00
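The math behind LoRA support for block-quantized GGUF weights has to reconcile low-bit storage with float deltas; one standard approach is dequantize → add delta → requantize per block. A simplified sketch with symmetric 8-bit quantization; the single-scale block layout here is illustrative and not the actual GGUF Q8_0/Q5/Q4 formats:

```python
def quantize_q8(values):
    """Symmetric 8-bit quantization of one block: int8 codes plus one scale."""
    scale = max(abs(v) for v in values) / 127 or 1.0  # avoid zero scale
    codes = [round(v / scale) for v in values]
    return codes, scale

def dequantize_q8(codes, scale):
    return [c * scale for c in codes]

def patch_lora_q8(codes, scale, delta):
    """Apply a LoRA delta to a quantized block: dequant, add, requantize."""
    patched = [w + d for w, d in zip(dequantize_q8(codes, scale), delta)]
    return quantize_q8(patched)
```

Requantizing reintroduces rounding error, which is presumably part of why later commits add the option to keep the LoRA itself in fp16 and apply it at compute time instead of baking it into the quantized weights.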
layerdiffusion
2690b654fd
reimplement q8/q5/q4, review, and match official gguf
2024-08-15 02:41:15 -07:00
layerdiffusion
ce16d34d03
disable xformers for t5
2024-08-15 00:55:49 -07:00
layerdiffusion
d336597fa5
add note to lora
...
but LoRAs for NF4 are done already!
2024-08-15 00:42:48 -07:00
layerdiffusion
7fcfb93090
lint
2024-08-15 00:39:12 -07:00
layerdiffusion
0524133714
lint
2024-08-15 00:33:21 -07:00
layerdiffusion
fb62214a32
rewrite some functions
2024-08-15 00:29:19 -07:00
layerdiffusion
c74f603ea2
remove super call
2024-08-15 00:23:31 -07:00
layerdiffusion
d0518b7249
make prints beautiful
2024-08-15 00:20:03 -07:00
layerdiffusion
d8b83a9501
gguf preview
2024-08-15 00:03:32 -07:00
layerdiffusion
59790f2cb4
simplify code
2024-08-14 20:48:39 -07:00
layerdiffusion
4b66cf1126
fix possible OOM again
2024-08-14 20:45:58 -07:00