Added info, config, etc for lora extractor and slider trainer

This commit is contained in:
Jaret Burkett
2023-07-23 13:13:45 -06:00
parent 9367089d48
commit 452f2a6da2
4 changed files with 86 additions and 35 deletions

README.md

@@ -1,60 +1,106 @@
 # AI Toolkit by Ostris
-WIP for now, but will be a collection of tools for AI tools as I need them.
+## IMPORTANT NOTE - READ THIS
+This is an active WIP repo that is not ready for others to use, and definitely not ready for non-developers to use.
+I am making major breaking changes and pushing straight to master until I have it in a planned state. I have big changes
+planned for config files and the general structure. I may change how training works entirely. You are welcome to use it,
+but keep that in mind. If more people start to use it, I will follow better branch checkout standards, but for now
+this is my personal active experiment.
+Report bugs as you find them, but not knowing how to train ML models, set up an environment, or use Python is not a bug.
+I will make all of this more user-friendly eventually.
+I will make a better readme later.
 ## Installation
-I will try to update this to be more beginner-friendly, but for now I am assuming
-a general understanding of python, pip, pytorch, and using virtual environments:
+Requirements:
+- python >3.10
+- Nvidia GPU with enough RAM to do what you need
+- python venv
+- git
 Linux:
 ```bash
+git clone https://github.com/ostris/ai-toolkit.git
+cd ai-toolkit
 git submodule update --init --recursive
-pythion3 -m venv venv
+python3 -m venv venv
 source venv/bin/activate
-pip install -r requirements.txt
-cd requirements/sd-scripts
-pip install --no-deps -e .
-cd ../..
+# or source venv/Scripts/activate on windows
+pip3 install -r requirements.txt
 ```
-Windows:
-```bash
-git submodule update --init --recursive
-pythion3 -m venv venv
-venv\Scripts\activate
-pip install -r requirements.txt
-cd requirements/sd-scripts
-pip install --no-deps -e .
-cd ../..
-```
+---
 ## Current Tools
-### LyCORIS extractor
-It is similar to the [LyCORIS](https://github.com/KohakuBlueleaf/LyCORIS) tool, but adding some QOL features.
-It all runs off a config file, which you can find an example of in `config/examples/locon_config.example.json`.
-Just copy that file, into the `config` folder, and rename it to `whatever_you_want.json`.
+I have so many hodgepodge scripts from my ML work that I am going to be moving over to this repo. But this is what is
+here so far.
+### LoRA (lierla), LoCON (LyCORIS) extractor
+It is based on the extractor in the [LyCORIS](https://github.com/KohakuBlueleaf/LyCORIS) tool, but adds some QOL features
+and LoRA (lierla) support. It can do multiple types of extractions in one run.
+It all runs off a config file, which you can find an example of in `config/examples/extract.example.yml`.
+Just copy that file into the `config` folder and rename it to `whatever_you_want.yml`.
 Then you can edit the file to your liking and call it like so:
 ```bash
-python3 run.py "whatever_you_want"
+python3 run.py config/whatever_you_want.yml
 ```
 You can also put a full path to a config file, if you want to keep it somewhere else.
 ```bash
-python3 run.py "/home/user/whatever_you_want.json"
+python3 run.py "/home/user/whatever_you_want.yml"
 ```
-File name is auto generated and dumped into the `output` folder. You can put whatever meta you want in the
-`meta` section of the config file, and it will be added to the metadata of the output file. I just have
-some recommended fields in the example file. The script will add some other useful metadata as well.
-process is an array or different processes to run on the conversion to test. You will normally just need one though.
-Will update this later.
+More notes on how it works are available in the example config file itself. LoRA and LoCON both support
+extraction modes of 'fixed', 'threshold', 'ratio', and 'quantile'. I'll update what these do and mean later.
+Most people use fixed, which is traditional fixed-dimension extraction.
+`process` is an array of different processes to run. You can add a few and mix and match: one LoRA, one LoCON, etc.
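As a rough illustration of that `process` array, here is a minimal sketch of a multi-process extract config. The `type`, `filename`, `mode`, and `linear` keys appear in the `config/examples/extract.example.yml` hunk further down this diff; the `conv` key, the top-level layout, and all values are assumptions for illustration, so treat the shipped example file as the authoritative reference.

```yaml
# hedged sketch, not the shipped example file
config:
  process:
    # a LoCON extraction that also captures conv layers
    - type: locon
      filename: "[name]_locon_64.safetensors"
      mode: fixed            # fixed = traditional fixed-dimension extraction
      linear: 64             # rank for linear layers
      conv: 32               # assumed key: rank for conv layers (LoCON only)
    # a traditional LoRA (lierla) extraction, linear layers only
    - type: lora
      filename: "[name]_4.safetensors"
      mode: fixed            # fixed, ratio, quantile supported for lora as well
      linear: 4              # lora dim or rank; no conv for lora
```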
+### LoRA Slider Trainer
+This is how I train most of the recent sliders I have on Civitai; you can check them out in my [Civitai profile](https://civitai.com/user/Ostris/models).
+It is based on the work of [p1atdev/LECO](https://github.com/p1atdev/LECO) and [rohitgandikota/erasing](https://github.com/rohitgandikota/erasing),
+but has been heavily modified to create sliders rather than erase concepts. I have a lot more plans for it, but it is
+very functional as is. It is also very easy to use: just copy the example config file in `config/examples/train_slider.example.yml`
+to the `config` folder and rename it to `whatever_you_want.yml`. Then you can edit the file to your liking and call it like so:
+```bash
+python3 run.py config/whatever_you_want.yml
+```
+There is a lot more information in that example file. You can even run the example as is, without any modifications, to see
+how it works. It will create a slider that turns all animals into dogs (neg) or cats (pos). Just run it like so:
+```bash
+python3 run.py config/examples/train_slider.example.yml
+```
+And you will be able to see how it works without configuring anything. No datasets are required for this method.
+I will post a better tutorial soon.
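The dog/cat example maps onto a handful of config fields. The abbreviated sketch below assembles only the fields that are visible in the `train_slider.example.yml` hunks later in this diff; the exact nesting, and any key not shown there, is an assumption, so check the example file itself.

```yaml
# abbreviated sketch assembled from the fields visible in this diff;
# see config/examples/train_slider.example.yml for the full file
config:
  steps: 500        # more is not always better; the author rarely goes over 1000
  lr: 1e-4          # good results reported with 4e-4 to 1e-4 at 500 steps
  train_unet: true  # recommended to leave this true
  save:
    dtype: float16  # precision to save; float16 recommended
    save_every: 50  # save a checkpoint every this many steps
  sample:
    prompts: # our example is an animal slider, neg: dog, pos: cat
      - "a golden retriever --m -5"   # strong negative weight, toward the dog pole
      - "calico cat --m 5"            # strong positive weight, toward the cat pole
    neg: "cartoon, fake, drawing, illustration, cgi, animated, anime, monochrome"
```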
+---
+## WIP Tools
+### VAE (Variational Auto Encoder) Trainer
+This works, but is not ready for others to use and therefore does not have an example config.
+I am still working on it and will update this when it is ready.
+I am adding a lot of features for criteria that I have used in my image-enlargement work: a critic (discriminator),
+content loss, style loss, and a few more. If you don't know, the VAEs
+for Stable Diffusion (yes, even the MSE one, and SDXL's) are horrible at smaller faces, and that holds SD back. I will fix this.
+I'll post more about this with better examples later, but here is a quick test of a run-through with various VAEs.
+The images just went in and out of the VAE. It is much worse on smaller faces than shown here.
+<img src="https://raw.githubusercontent.com/ostris/ai-toolkit/main/assets/VAE_test1.jpg" width="768" height="auto">
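No example config ships for this trainer yet, so purely as a hypothetical sketch of how the criteria listed above might be weighted (every key below is invented for illustration, not an actual option):

```yaml
# hypothetical, invented keys; no real example config exists for the VAE trainer yet
config:
  process:
    - type: vae_trainer          # invented process name
      losses:
        mse: 1.0                 # plain pixel reconstruction loss
        critic: 0.1              # adversarial loss from a discriminator
        content: 0.5             # feature-space (perceptual) content loss
        style: 0.05              # gram-matrix style loss
```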

assets/VAE_test1.jpg (new binary file, 1.4 MiB; binary file not shown)

config/examples/extract.example.yml

@@ -47,7 +47,8 @@ config:
     - type: lora # traditional lora extraction (lierla) with linear layers only
       filename: "[name]_4.safetensors"
       mode: fixed # fixed, ratio, quantile supported for lora as well
-      linear: 4
+      linear: 4 # lora dim or rank
+      # no conv for lora
     # process 5
     - type: lora

config/examples/train_slider.example.yml

@@ -33,7 +33,7 @@ config:
   # how many steps to train. More is not always better. I rarely go over 1000
   steps: 500
   # I have had good results with 4e-4 to 1e-4 at 500 steps
-  lr: 2e-4
+  lr: 1e-4
   # train the unet. I recommend leaving this true
   train_unet: true
   # train the text encoder. I don't recommend this unless you have a special use case

@@ -70,7 +70,7 @@ config:
   # saving config
   save:
     dtype: float16 # precision to save. I recommend float16
-    save_every: 100 # save every this many steps
+    save_every: 50 # save every this many steps
   # sampling config
   sample:
@@ -90,7 +90,7 @@ config:
   # --n [string] # negative prompt, will inherit sample.neg if not set
   # Only 75 tokens allowed currently
-  prompts:
+  prompts: # our example is an animal slider, neg: dog, pos: cat
   - "a golden retriever --m -5"
   - "a golden retriever --m -3"
   - "a golden retriever --m 3"

@@ -99,6 +99,10 @@ config:
   - "calico cat --m -3"
   - "calico cat --m 3"
   - "calico cat --m 5"
+  - "an elephant --m -5"
+  - "an elephant --m -3"
+  - "an elephant --m 3"
+  - "an elephant --m 5"
   # negative prompt used on all prompts above as default if they don't have one
   neg: "cartoon, fake, drawing, illustration, cgi, animated, anime, monochrome"
   # seed for sampling. 42 is the answer for everything
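The per-prompt flags combine freely. As a hedged sketch, assuming (per the comments above) that `--m` sets the slider multiplier and `--n` overrides the inherited `sample.neg`, a custom sample set could look like this:

```yaml
# sketch of a custom sample set; --m / --n semantics as described in the comments above
sample:
  prompts:
    - "a tabby cat --m -5"                    # strong pull toward the negative pole
    - "a tabby cat --m 3"                     # milder pull toward the positive pole
    - "a tabby cat --m 5 --n blurry, lowres"  # per-prompt negative overrides sample.neg
  neg: "cartoon, fake, drawing, illustration, cgi, animated, anime, monochrome"
```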