Modal cloud training support, fixed typo in toolkit/scheduler.py, Schnell training support for Colab, issue #92, issue #114 (#115)

* issue #76, load_checkpoint_and_dispatch() 'force_hooks'

https://github.com/ostris/ai-toolkit/issues/76

* RunPod cloud config

https://github.com/ostris/ai-toolkit/issues/90

* change 2x A40 to 1x A40 and price per hour

referring to https://github.com/ostris/ai-toolkit/issues/90#issuecomment-2294894929

* include FLUX.1-schnell setup guide missed in last commit

* huggingface-cli login required auth

* #92 peft, #114 colab, schnell training in colab

* modal cloud - run_modal.py and .yaml configs

* run_modal.py mount path example

* modal_examples renamed to modal

* Training in Modal README.md setup guide

* rename run command in title for consistency
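The notebook diff below configures the training job as a nested `OrderedDict` instead of a YAML file. A trimmed sketch of the `train` section with the hyperparameters this commit settles on (steps 2000, lr 1e-4) — key names and comments taken from the diff, surrounding nesting elided:

```python
# Sketch of the notebook's `train` config section (abbreviated; the full
# config wraps this in further 'job'/'config'/'process' nesting).
from collections import OrderedDict

train_config = OrderedDict([
    ('batch_size', 1),
    ('steps', 2000),                  # 500 - 4000 is a good range
    ('gradient_accumulation_steps', 1),
    ('train_unet', True),
    ('train_text_encoder', False),    # probably won't work with flux
    ('gradient_checkpointing', True), # needed unless you have a ton of vram
    ('noise_scheduler', 'flowmatch'), # for training only
    ('optimizer', 'adamw8bit'),
    ('lr', 1e-4),
])

print(train_config['steps'], train_config['lr'])
```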
This commit is contained in:
martintomov
2024-08-23 06:25:44 +03:00
committed by GitHub
parent 4d35a29c97
commit 34db804c76
8 changed files with 817 additions and 89 deletions


@@ -1,53 +1,45 @@
 {
- "nbformat": 4,
- "nbformat_minor": 0,
- "metadata": {
-  "colab": {
-   "provenance": [],
-   "machine_shape": "hm",
-   "gpuType": "A100"
-  },
-  "kernelspec": {
-   "name": "python3",
-   "display_name": "Python 3"
-  },
-  "language_info": {
-   "name": "python"
-  },
-  "accelerator": "GPU"
- },
  "cells": [
   {
    "cell_type": "markdown",
-   "source": [
-    "# AI Toolkit by Ostris\n",
-    "## FLUX.1 Training\n"
-   ],
    "metadata": {
     "collapsed": false,
     "id": "zl-S0m3pkQC5"
-   }
+   },
+   "source": [
+    "# AI Toolkit by Ostris\n",
+    "## FLUX.1-dev Training\n"
+   ]
   },
   {
    "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
    "source": [
-    "!git clone https://github.com/ostris/ai-toolkit\n",
-    "!mkdir -p /content/dataset"
-   ],
+    "!nvidia-smi"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
    "metadata": {
     "id": "BvAG0GKAh59G"
    },
-   "execution_count": null,
-   "outputs": []
+   "outputs": [],
+   "source": [
+    "!git clone https://github.com/ostris/ai-toolkit\n",
+    "!mkdir -p /content/dataset"
+   ]
   },
   {
    "cell_type": "markdown",
-   "source": [
-    "Put your image dataset in the `/content/dataset` folder"
-   ],
    "metadata": {
     "id": "UFUW4ZMmnp1V"
-   }
+   },
+   "source": [
+    "Put your image dataset in the `/content/dataset` folder"
+   ]
   },
   {
    "cell_type": "code",
@@ -62,6 +54,9 @@
   },
   {
    "cell_type": "markdown",
+   "metadata": {
+    "id": "OV0HnOI6o8V6"
+   },
    "source": [
     "## Model License\n",
     "Training currently only works with FLUX.1-dev. Which means anything you train will inherit the non-commercial license. It is also a gated model, so you need to accept the license on HF before using it. Otherwise, this will fail. Here are the required steps to set up a license.\n",
@@ -69,13 +64,15 @@
     "Sign into HF and accept the model access here [black-forest-labs/FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev)\n",
     "\n",
     "[Get a READ key from huggingface](https://huggingface.co/settings/tokens/new?) and place it in the next cell after running it."
-   ],
-   "metadata": {
-    "id": "OV0HnOI6o8V6"
-   }
+   ]
   },
   {
    "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "3yZZdhFRoj2m"
+   },
+   "outputs": [],
    "source": [
     "import getpass\n",
     "import os\n",
@@ -87,15 +84,15 @@
     "os.environ['HF_TOKEN'] = hf_token\n",
     "\n",
     "print(\"HF_TOKEN environment variable has been set.\")"
-   ],
-   "metadata": {
-    "id": "3yZZdhFRoj2m"
-   },
-   "execution_count": null,
-   "outputs": []
+   ]
   },
   {
    "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "9gO2EzQ1kQC8"
+   },
+   "outputs": [],
    "source": [
     "import os\n",
     "import sys\n",
@@ -105,26 +102,26 @@
     "from PIL import Image\n",
     "import os\n",
     "os.environ[\"HF_HUB_ENABLE_HF_TRANSFER\"] = \"1\""
-   ],
-   "metadata": {
-    "id": "9gO2EzQ1kQC8"
-   },
-   "outputs": [],
-   "execution_count": null
+   ]
   },
   {
    "cell_type": "markdown",
+   "metadata": {
+    "id": "N8UUFzVRigbC"
+   },
    "source": [
     "## Setup\n",
     "\n",
     "This is your config. It is documented pretty well. Normally you would do this as a yaml file, but for colab, this will work. This will run as is without modification, but feel free to edit as you want."
-   ],
-   "metadata": {
-    "id": "N8UUFzVRigbC"
-   }
+   ]
   },
   {
    "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "_t28QURYjRQO"
+   },
+   "outputs": [],
    "source": [
     "from collections import OrderedDict\n",
     "\n",
@@ -169,7 +166,7 @@
     " ]),\n",
     " ('train', OrderedDict([\n",
     " ('batch_size', 1),\n",
-    " ('steps', 4000), # total number of steps to train 500 - 4000 is a good range\n",
+    " ('steps', 2000), # total number of steps to train 500 - 4000 is a good range\n",
     " ('gradient_accumulation_steps', 1),\n",
     " ('train_unet', True),\n",
     " ('train_text_encoder', False), # probably won't work with flux\n",
@@ -177,9 +174,16 @@
     " ('gradient_checkpointing', True), # need this on unless you have a ton of vram\n",
     " ('noise_scheduler', 'flowmatch'), # for training only\n",
     " ('optimizer', 'adamw8bit'),\n",
-    " ('lr', 4e-4),\n",
+    " ('lr', 1e-4),\n",
     "\n",
     " # uncomment this to skip the pre training sample\n",
-    " #('skip_first_sample', True),\n",
+    " # ('skip_first_sample', True),\n",
+    "\n",
+    " # uncomment to completely disable sampling\n",
+    " # ('disable_sampling', True),\n",
+    "\n",
+    " # uncomment to use new bell curved weighting. Experimental but may produce better results\n",
+    " # ('linear_timesteps', True),\n",
+    "\n",
     " # ema will smooth out learning, but could slow it down. Recommended to leave on.\n",
     " ('ema_config', OrderedDict([\n",
@@ -231,45 +235,57 @@
     " ('version', '1.0')\n",
     " ]))\n",
     "])\n"
-   ],
-   "metadata": {
-    "id": "_t28QURYjRQO"
-   },
-   "execution_count": null,
-   "outputs": []
+   ]
   },
   {
    "cell_type": "markdown",
+   "metadata": {
+    "id": "h6F1FlM2Wb3l"
+   },
    "source": [
     "## Run it\n",
     "\n",
     "Below does all the magic. Check your folders to the left. Items will be in output/LoRA/your_name_v1. In the samples folder, there are periodic samples. This doesn't work great with colab. They will be in /content/output"
-   ],
-   "metadata": {
-    "id": "h6F1FlM2Wb3l"
-   }
+   ]
   },
   {
    "cell_type": "code",
-   "source": [
-    "run_job(job_to_run)\n"
-   ],
+   "execution_count": null,
    "metadata": {
     "id": "HkajwI8gteOh"
    },
-   "execution_count": null,
-   "outputs": []
+   "outputs": [],
+   "source": [
+    "run_job(job_to_run)\n"
+   ]
   },
   {
    "cell_type": "markdown",
+   "metadata": {
+    "id": "Hblgb5uwW5SD"
+   },
    "source": [
     "## Done\n",
     "\n",
     "Check your output dir and get your slider\n"
-   ],
-   "metadata": {
-    "id": "Hblgb5uwW5SD"
-   }
+   ]
   }
- ]
-}
+ ],
+ "metadata": {
+  "accelerator": "GPU",
+  "colab": {
+   "gpuType": "A100",
+   "machine_shape": "hm",
+   "provenance": []
+  },
+  "kernelspec": {
+   "display_name": "Python 3",
+   "name": "python3"
+  },
+  "language_info": {
+   "name": "python"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}