Mirror of https://github.com/ostris/ai-toolkit.git, synced 2026-01-26 16:39:47 +00:00

Commit: Added info, config, etc. for LoRA extractor and slider trainer

README.md (108 lines changed; the updated file follows):
# AI Toolkit by Ostris

## IMPORTANT NOTE - READ THIS

This is an active WIP repo that is not ready for others to use, and definitely not ready for non-developers to use.

I am making major breaking changes and pushing straight to master until it is in a planned state. I have big changes planned for the config files and the general structure, and I may change how training works entirely. You are welcome to use the repo, but keep that in mind. If more people start to use it, I will follow better branching standards, but for now this is my personal active experiment.

Report bugs as you find them, but not knowing how to train ML models, set up an environment, or use Python is not a bug. I will make all of this more user-friendly eventually.

I will write a better readme later.
## Installation

Requirements:

- python >3.10
- Nvidia GPU with enough VRAM to do what you need
- python venv
- git
Linux:

```bash
git clone https://github.com/ostris/ai-toolkit.git
cd ai-toolkit
git submodule update --init --recursive
python3 -m venv venv
source venv/bin/activate
# or source venv/Scripts/activate on Windows
pip3 install -r requirements.txt
```
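A quick sanity test after installing (a generic check, not something the repo itself documents) is to confirm that the PyTorch pulled in by `requirements.txt` can see the GPU:

```bash
source venv/bin/activate
python3 -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```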
---
## Current Tools

I have so many hodgepodge scripts from my ML work that I am going to be moving over to this repo, but this is what is here so far.

### LoRA (lierla), LoCON (LyCORIS) extractor

It is based on the extractor in the [LyCORIS](https://github.com/KohakuBlueleaf/LyCORIS) tool, but adds some QOL features and LoRA (lierla) support. It can do multiple types of extractions in one run.

It all runs off a config file; you can find an example in `config/examples/extract.example.yml`. Just copy that file into the `config` folder and rename it to `whatever_you_want.yml`. Then edit the file to your liking and call it like so:
```bash
python3 run.py config/whatever_you_want.yml
```
You can also put a full path to a config file, if you want to keep it somewhere else.
```bash
python3 run.py "/home/user/whatever_you_want.yml"
```
More notes on how it works are available in the example config file itself. LoRA and LoCON both support 'fixed', 'threshold', 'ratio', and 'quantile' extraction modes; I'll document what these do and mean later. Most people use 'fixed', which is traditional fixed-dimension extraction.

`process` is an array of the different processes to run. You can add a few and mix and match: one LoRA, one LoCON, etc.
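For illustration, a `process` list mixing a LoRA and a LoCON extraction might look like the sketch below. The `lora` entry mirrors the shipped example file (see the hunk further down in this diff); the `locon` entry, its `type` name, and the `conv` key are assumptions here, so check `config/examples/extract.example.yml` for the authoritative field names.

```yaml
process:
  # traditional LoRA extraction (lierla), linear layers only
  - type: lora
    filename: "[name]_4.safetensors"
    mode: fixed          # fixed, ratio, quantile supported for lora as well
    linear: 4            # lora dim or rank
    # no conv for lora
  # LoCON (LyCORIS) extraction: type name and conv key are assumptions
  - type: locon
    filename: "[name]_locon_64.safetensors"
    mode: fixed
    linear: 64           # linear dim
    conv: 32             # conv dim (assumed key)
```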
### LoRA Slider Trainer

This is how I train most of the recent sliders I have on Civitai; you can check them out on my [Civitai profile](https://civitai.com/user/Ostris/models). It is based on the work of [p1atdev/LECO](https://github.com/p1atdev/LECO) and [rohitgandikota/erasing](https://github.com/rohitgandikota/erasing), but has been heavily modified to create sliders rather than erase concepts. I have a lot more plans for it, but it is very functional as is, and very easy to use. Just copy the example config file `config/examples/train_slider.example.yml` to the `config` folder, rename it to `whatever_you_want.yml`, edit the file to your liking, and call it like so:
```bash
python3 run.py config/whatever_you_want.yml
```
There is a lot more information in that example file. You can even run the example as is, without any modifications, to see how it works: it will create a slider that turns all animals into dogs (neg) or cats (pos). Just run it like so:
```bash
python3 run.py config/examples/train_slider.example.yml
```
You will be able to see how it works without configuring anything, and no datasets are required for this method. I will post a better tutorial soon.
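To give a sense of the config's shape, here is a minimal sketch assembled from the `train_slider.example.yml` hunks shown at the end of this diff. The surrounding structure and any key not visible in those hunks (such as the `seed` key name) is an assumption; treat the shipped example file as the reference.

```yaml
config:
  # how many steps to train. More is not always better. I rarely go over 1000
  steps: 500
  # I have had good results with 4e-4 to 1e-4 at 500 steps
  lr: 1e-4
  train_unet: true        # recommended to leave true
  save:
    dtype: float16        # precision to save
    save_every: 50        # save every this many steps
  sample:
    # --m appears to set the slider multiplier used for each sample prompt
    prompts:
      - "a golden retriever --m -5"   # strong negative (dog) side
      - "calico cat --m 5"            # strong positive (cat) side
    neg: "cartoon, fake, drawing, illustration, cgi, animated, anime, monochrome"
    seed: 42              # seed for sampling (key name assumed)
```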
---

## WIP Tools
### VAE (Variational Auto Encoder) Trainer

This works, but it is not ready for others to use and therefore does not have an example config. I am still working on it and will update this when it is ready. I am adding a lot of the loss criteria I have used in my image-enlargement work: a critic (discriminator), content loss, style loss, and a few more. If you don't know, the VAEs for Stable Diffusion (yes, even the MSE one, and SDXL's) are horrible at smaller faces, and that holds SD back. I will fix this. I'll post more about it with better examples later, but here is a quick test of a run through various VAEs (images just went in and out). It is much worse on smaller faces than shown here.

<img src="https://raw.githubusercontent.com/ostris/ai-toolkit/main/assets/VAE_test1.jpg" width="768" height="auto">
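For readers unfamiliar with those criteria, here is a generic sketch of how such a combined VAE reconstruction objective is typically assembled. This is illustrative PyTorch, not this repo's code, and every name in it is hypothetical:

```python
import torch
import torch.nn.functional as F

def combined_vae_loss(recon, target, critic, features, weights):
    """Sketch of a weighted sum of the criteria named above (all names hypothetical).

    critic:   a trained discriminator scoring realism of `recon`
    features: a frozen feature extractor (e.g. VGG slices) for content/style
    weights:  loss weights, e.g. {"rec": 1.0, "adv": 0.1, "content": 0.5, "style": 0.5}
    """
    # pixel-space reconstruction term
    rec = F.mse_loss(recon, target)
    # adversarial term: push reconstructions toward higher critic scores
    adv = -critic(recon).mean()
    # content loss: L1 distance between deep feature maps
    f_r, f_t = features(recon), features(target)
    content = F.l1_loss(f_r, f_t)
    # style loss: L1 distance between Gram matrices of those features
    def gram(f):
        b, c, h, w = f.shape
        f = f.reshape(b, c, h * w)
        return f @ f.transpose(1, 2) / (c * h * w)
    style = F.l1_loss(gram(f_r), gram(f_t))
    return (weights["rec"] * rec + weights["adv"] * adv
            + weights["content"] * content + weights["style"] * style)
```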
New file: assets/VAE_test1.jpg (binary, 1.4 MiB, not shown)
config/examples/extract.example.yml:

```diff
@@ -47,7 +47,8 @@ config:
       - type: lora # traditional lora extraction (lierla) with linear layers only
         filename: "[name]_4.safetensors"
         mode: fixed # fixed, ratio, quantile supported for lora as well
-        linear: 4
+        linear: 4 # lora dim or rank
+        # no conv for lora

       # process 5
       - type: lora
```
config/examples/train_slider.example.yml:

```diff
@@ -33,7 +33,7 @@ config:
   # how many steps to train. More is not always better. I rarely go over 1000
   steps: 500
   # I have had good results with 4e-4 to 1e-4 at 500 steps
-  lr: 2e-4
+  lr: 1e-4
   # train the unet. I recommend leaving this true
   train_unet: true
   # train the text encoder. I don't recommend this unless you have a special use case
```
```diff
@@ -70,7 +70,7 @@ config:
   # saving config
   save:
     dtype: float16 # precision to save. I recommend float16
-    save_every: 100 # save every this many steps
+    save_every: 50 # save every this many steps

   # sampling config
   sample:
```
```diff
@@ -90,7 +90,7 @@ config:
     # --n [string] # negative prompt, will inherit sample.neg if not set

     # Only 75 tokens allowed currently
-    prompts:
+    prompts: # our example is an animal slider, neg: dog, pos: cat
       - "a golden retriever --m -5"
       - "a golden retriever --m -3"
       - "a golden retriever --m 3"
```
```diff
@@ -99,6 +99,10 @@ config:
       - "calico cat --m -3"
       - "calico cat --m 3"
       - "calico cat --m 5"
+      - "an elephant --m -5"
+      - "an elephant --m -3"
+      - "an elephant --m 3"
+      - "an elephant --m 5"
     # negative prompt used on all prompts above as default if they don't have one
     neg: "cartoon, fake, drawing, illustration, cgi, animated, anime, monochrome"
     # seed for sampling. 42 is the answer for everything
```