Commit Graph

  • 0b8a32def7 merged in lumina2 branch Jaret Burkett 2025-02-12 09:33:03 -07:00
  • 787bb37e76 Small fixes for DFE, polar guidance, and other things Jaret Burkett 2025-02-12 09:27:44 -07:00
  • 10aa7e9d5e Fixed some breaking changes with diffusers gradient checkpointing. Jaret Burkett 2025-02-10 09:35:31 -07:00
  • ed1deb71c4 Added examples for training lumina2 lumina2 Jaret Burkett 2025-02-08 16:13:18 -07:00
  • 4de6a825fa Update lumina requirements Jaret Burkett 2025-02-08 15:16:35 -07:00
  • 9a7266275d Work on lumina2 Jaret Burkett 2025-02-08 14:52:39 -07:00
  • d138f07365 Initial lumina3 support Jaret Burkett 2025-02-08 10:59:53 -07:00
  • c6d8eedb94 Added ability to use consistent noise for each image in a dataset by hashing the path and using that as a seed. Jaret Burkett 2025-02-08 07:13:48 -07:00
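The path-hashing trick described in the commit above can be sketched as follows (the function name and digest width here are hypothetical illustrations, not the toolkit's actual code):

```python
import hashlib

def seed_from_path(path: str) -> int:
    # Hash the dataset file path and fold the digest into a 32-bit seed,
    # so the same image always receives the same noise across runs.
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "little")
```

Because the seed depends only on the path, the pairing survives dataloader shuffling and restarts without storing any extra state.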
  • af5e760be1 Merge pull request #249 from ostris/accelerate-multi-gpu Jaret Burkett 2025-02-08 07:11:10 -07:00
  • ff3d54bb5b Make the mean of the mask multiplier be 1.0 for a more balanced loss. Jaret Burkett 2025-02-06 10:57:06 +00:00
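The mask rebalancing above amounts to rescaling the per-pixel loss multipliers so they average to 1.0; masked regions are still emphasized, but the overall loss magnitude stays comparable to unmasked training. A minimal sketch, shown on a flat Python list for brevity (real code would operate on tensors):

```python
def normalize_mask(mask):
    # Rescale loss multipliers so their mean is exactly 1.0, keeping the
    # relative emphasis between masked and unmasked pixels intact.
    mean = sum(mask) / len(mask)
    return [m / mean for m in mask]
```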
  • 0e75724b4d Lock version of diffusers Jaret Burkett 2025-02-05 18:14:59 +00:00
  • 376bb1bf6f Lock torch version due to breaking changes Jaret Burkett 2025-02-04 23:00:57 +00:00
  • 216ab164ce Experimental features and bug fixes Jaret Burkett 2025-02-04 13:36:34 -07:00
  • e6180d1e1d Bug fixes Jaret Burkett 2025-01-31 13:23:01 -07:00
  • 15a57bc89f Add new version of DFE. Kitchen sink Jaret Burkett 2025-01-31 11:42:27 -07:00
  • e5355bf8d5 Added train catch to the blank network Jaret Burkett 2025-01-30 16:15:45 +00:00
  • 34a1c6947a Added flux_shift as timestep type Jaret Burkett 2025-01-27 07:35:00 -07:00
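`flux_shift` presumably refers to the rectified-flow time shift used by Flux-style models, which remaps a uniform timestep toward higher noise levels. A sketch of one common form of that mapping (the function name and default shift value are assumptions, not taken from the toolkit):

```python
def shift_timestep(t: float, shift: float = 3.0) -> float:
    # Remap t in [0, 1] toward higher noise levels; shift = 1.0
    # leaves the schedule unchanged, larger values bias sampling
    # toward noisier timesteps.
    return shift * t / (1.0 + (shift - 1.0) * t)
```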
  • 2141c6e06c Merge remote-tracking branch 'origin/main' into accelerate-multi-gpu Jaret Burkett 2025-01-26 11:19:34 -07:00
  • 1188cf1e8a Adjust flux sample sampler to handle some new breaking changes in diffusers. Jaret Burkett 2025-01-26 18:09:21 +00:00
  • 5e663746b8 Working multi gpu training. Still need a lot of tweaks and testing. Jaret Burkett 2025-01-25 16:46:20 -07:00
  • 441474e81f Added a flag to lora extraction script to do a full transformer extraction. Jaret Burkett 2025-01-24 09:34:13 -07:00
  • a6a690f796 Update full fine tune example to only train transformer blocks. Jaret Burkett 2025-01-24 09:28:34 -07:00
  • 6191f19e55 Added script to convert diffusers model to ComfyUI variant Jaret Burkett 2025-01-23 21:30:23 -07:00
  • bbfba0c188 Added v2 of dfp Jaret Burkett 2025-01-22 16:32:13 -07:00
  • e1549ad54d Update dfe model arch Jaret Burkett 2025-01-22 10:37:23 -07:00
  • 04abe57c76 Added weighting to DFE Jaret Burkett 2025-01-22 08:50:57 -07:00
  • 89dd041b97 Added ability to pair samples with a closer noise with optimal_noise_pairing_samples Jaret Burkett 2025-01-21 18:30:10 -07:00
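Noise pairing as described above plausibly means drawing several candidate noise samples and keeping the one closest to each latent, reducing the variance of the noise/image pairing. A hedged sketch (the function name, distance metric, and candidate count are illustrative guesses):

```python
import random

def pick_closest_noise(latent, num_candidates=4, rng=None):
    # Draw several Gaussian noise candidates and keep the one with the
    # smallest squared distance to the latent vector.
    rng = rng or random.Random(0)
    candidates = [
        [rng.gauss(0.0, 1.0) for _ in latent]
        for _ in range(num_candidates)
    ]
    return min(
        candidates,
        key=lambda n: sum((a - b) ** 2 for a, b in zip(n, latent)),
    )
```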
  • 29122b1a54 Added code to handle diffusion feature extraction loss Jaret Burkett 2025-01-21 14:21:34 -07:00
  • 6a8e3d8610 Added a config file for full finetuning flex. Added a lora extraction script for flex Jaret Burkett 2025-01-20 10:09:01 -07:00
  • 4c8a9e1b88 Added example config to train Flex Jaret Burkett 2025-01-18 18:03:20 -07:00
  • fadb2f3a76 Allow quantizing the TE independently on flux. Added lognorm_blend timestep schedule Jaret Burkett 2025-01-18 18:02:31 -07:00
  • 4723f23c0d Added ability to split up flux across gpus (experimental). Changed the way timestep scheduling works to prep for more specific schedules. Jaret Burkett 2024-12-31 07:06:55 -07:00
  • 8ef07a9c36 Added training for an experimental decorator embedding. Allow for turning off guidance embedding on flux (for unreleased model). Various bug fixes and modifications Jaret Burkett 2024-12-15 08:59:27 -07:00
  • 92ce93140e Adjustments to defaults for automagic Jaret Burkett 2024-11-29 10:28:06 -07:00
  • f213996aa5 Fixed saving and displaying for automagic Jaret Burkett 2024-11-29 08:00:22 -07:00
  • cbe31eaf0a Initial work on an auto-adjusting optimizer Jaret Burkett 2024-11-29 04:48:58 -07:00
  • 67c2e44edb Added support for training flux redux adapters Jaret Burkett 2024-11-21 20:01:52 -07:00
  • 96d418bb95 Added support for full finetuning flux with randomized param activation. Examples coming soon Jaret Burkett 2024-11-21 13:05:32 -07:00
  • 894374b2e9 Various bug fixes and optimizations for quantized training. Added untested custom adam8bit optimizer. Did some work on LoRM (don't use) Jaret Burkett 2024-11-20 09:16:55 -07:00
  • 6509ba4484 Fix seed generation to make it deterministic so it is consistent from gpu to gpu Jaret Burkett 2024-11-15 12:11:13 -07:00
  • 025ee3dd3d Added ability for adafactor to fully fine tune quantized model. Jaret Burkett 2024-10-30 16:38:07 -06:00
  • 58f9d01c2b Added adafactor implementation that handles stochastic rounding of update and accumulation Jaret Burkett 2024-10-30 05:25:57 -06:00
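Stochastic rounding, mentioned in the adafactor commit above, rounds a value up or down with probability proportional to its fractional position so the rounding error is zero in expectation, which matters when repeatedly accumulating small updates into low-precision weights. A minimal sketch of the core idea (scalar form, not the toolkit's tensor implementation):

```python
import random

def stochastic_round(value: float, step: float, rng: random.Random) -> float:
    # Round to the grid of multiples of `step`, rounding up with
    # probability equal to the fractional remainder, so the result is
    # unbiased in expectation.
    lower = (value // step) * step
    frac = (value - lower) / step
    return lower + step if rng.random() < frac else lower
```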
  • e72b59a8e9 Added experimental 8bit version of prodigy with stochastic rounding and stochastic gradient accumulation. Still testing. Jaret Burkett 2024-10-29 14:28:28 -06:00
  • 4aa19b5c1d Only quantize flux T5 if also quantizing model. Load TE from original name and path if fine tuning. Jaret Burkett 2024-10-29 14:25:31 -06:00
  • 4747716867 Fixed issue with adapters not providing gradients with new grad activator Jaret Burkett 2024-10-29 14:22:10 -06:00
  • 22cd40d7b9 Improvements for full tuning flux. Added debugging launch config for vscode Jaret Burkett 2024-10-29 04:54:08 -06:00
  • 3400882a80 Added preliminary support for SD3.5-large lora training Jaret Burkett 2024-10-22 12:21:36 -06:00
  • 9f94c7b61e Added experimental param multiplier to the ema module Jaret Burkett 2024-10-22 09:25:52 -06:00
  • bedb8197a2 Fixed issue with sizes for some images being loaded sideways resulting in squished images. Jaret Burkett 2024-10-20 11:51:29 -06:00
  • e3ebd73610 Add a projection layer on vision direct when doing image embeds Jaret Burkett 2024-10-20 10:48:23 -06:00
  • dd931757cd Merge branch 'main' of github.com:ostris/ai-toolkit Jaret Burkett 2024-10-20 07:04:29 -06:00
  • 0640cdf569 Handle errors in loading size database Jaret Burkett 2024-10-20 07:04:19 -06:00
  • 0b048d0dde Locked version of quanto as it breaks in later versions Jaret Burkett 2024-10-16 22:41:04 +00:00
  • 473d455f44 Process empty clip image if there is not one for reg images when training a custom adapter Jaret Burkett 2024-10-15 08:28:04 -06:00
  • ce759ebd8c Normalize the image embeddings on vd adapter forward Jaret Burkett 2024-10-12 15:09:48 +00:00
  • 628a7923a3 Remove norm on image embeds on custom adapter Jaret Burkett 2024-10-12 00:43:18 +00:00
  • 3922981996 Added some additional experimental things to the vision direct encoder Jaret Burkett 2024-10-10 19:42:26 +00:00
  • ab22674980 Allow for a default caption file in the folder. Minor bug fixes. Jaret Burkett 2024-10-10 07:31:33 -06:00
  • 9452929300 Apply a mask to the embeds for SD if using T5 encoder Jaret Burkett 2024-10-04 10:55:20 -06:00
  • a800c9d19e Add a method to have an inference only lora Jaret Burkett 2024-10-04 10:06:53 -06:00
  • 28e6f00790 Fixed bug in returning clip image embed to actually return it Jaret Burkett 2024-10-03 10:49:09 -06:00
  • 67e0aca750 Added ability to load clip pairs randomly from folder. Other small bug fixes Jaret Burkett 2024-10-03 10:03:49 -06:00
  • f05224970f Added Vision Language Adapter usage for pixtral vd adapter Jaret Burkett 2024-09-29 19:39:56 -06:00
  • b4f64de4c2 Quick patch to scope xformer imports until a better solution Jaret Burkett 2024-09-28 15:36:42 -06:00
  • 2e5f6668dc Add xformers as a dependency Jaret Burkett 2024-09-28 15:30:14 -06:00
  • e4c82803e1 Handle random resizing for pixtral input on direct vision adapter Jaret Burkett 2024-09-28 14:53:38 -06:00
  • 69aa92bce5 Added support for AdEMAMix8bit Jaret Burkett 2024-09-28 14:33:51 -06:00
  • a508caad1d Change pixtral to crop based on number of pixels instead of largest dimension Jaret Burkett 2024-09-28 13:05:26 -06:00
  • 58537fc92b Added initial direct vision pixtral support Jaret Burkett 2024-09-28 10:47:51 -06:00
  • 86b5938cf3 Fixed the webp bug finally. Jaret Burkett 2024-09-25 13:56:00 -06:00
  • 6b4034122f Remove layers from direct vision resampler Jaret Burkett 2024-09-24 15:08:29 -06:00
  • 10817696fb Fixed issue where direct vision was not passing additional params from resampler when it is added Jaret Burkett 2024-09-24 10:34:11 -06:00
  • 037ce11740 Always return vision encoder in state dict Jaret Burkett 2024-09-24 07:43:17 -06:00
  • 04424fe2d6 Added config setting to set the timestep type Jaret Burkett 2024-09-24 06:53:59 -06:00
  • 40a8ff5731 Load local hugging face packages for assistant adapter Jaret Burkett 2024-09-23 10:37:12 -06:00
  • 2776221497 Added option to cache empty prompt or trigger and unload text encoders while training Jaret Burkett 2024-09-21 20:54:09 -06:00
  • f85ad452c6 Added initial support for pixtral vision as a vision encoder. Jaret Burkett 2024-09-21 15:21:14 -06:00
  • dd889086f4 Updates to the docker file for jupyterlab Jaret Burkett 2024-09-21 12:07:07 -06:00
  • bc693488eb fix diffusers codebase (#183) apolinário 2024-09-21 12:50:29 -05:00
  • d97c55cd96 Updated requirements to lock version of albucore, which had breaking changes. Jaret Burkett 2024-09-21 11:19:13 -06:00
  • 79b4e04b80 Feat: Wandb logging (#95) Plat 2024-09-20 11:01:01 +09:00
  • 951e223481 Added support to disable single transformers in vision direct adapter Jaret Burkett 2024-09-11 08:54:51 -06:00
  • fc34a69bec Ignore guidance embed when full tuning flux. Adjust block scaler to decay to 1.0. Add MLP resampler for reducing vision adapter tokens Jaret Burkett 2024-09-09 16:24:46 -06:00
  • 279ee65177 Remove block scaler Jaret Burkett 2024-09-06 08:28:17 -06:00
  • 3a1f464132 Added support for training vision direct weight adapters Jaret Burkett 2024-09-05 10:11:44 -06:00
  • 5c8fcc8a4e Fix bug with zeroing out gradients when accumulating Jaret Burkett 2024-09-03 08:29:15 -06:00
  • 121a760c19 Added proper grad accumulation Jaret Burkett 2024-09-03 07:24:18 -06:00
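"Proper" gradient accumulation typically means averaging the gradients of N micro-batches before each optimizer step so the result matches what one N-times-larger batch would have produced. A simplified scalar sketch of that equivalence (illustrative only; the real training loop operates on model parameters):

```python
def accumulated_updates(grads, accum_steps):
    # Average each group of `accum_steps` micro-batch gradients into a
    # single update, matching the gradient of one larger batch.
    updates = []
    for i in range(0, len(grads), accum_steps):
        group = grads[i:i + accum_steps]
        updates.append(sum(group) / len(group))
    return updates
```

The companion fix in the commit above (zeroing gradients at the right time) matters because accumulating into stale gradients from a previous step silently corrupts the average.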
  • e5fadddd45 Added ability to do prompt attn masking for flux Jaret Burkett 2024-09-02 17:29:36 -06:00
  • d44d4eb61a Added a new experimental linear weighting technique Jaret Burkett 2024-09-02 09:22:13 -06:00
  • 7d9ab22405 Rework ip adapter and vision direct adapters to apply to the single transformer blocks even though they are not cross attn. Jaret Burkett 2024-09-01 10:40:42 -06:00
  • 7ed8c51f20 Readme cleanup Jaret Burkett 2024-09-01 07:06:09 -06:00
  • 6df33156f0 Add information about specific weight targeting in the README Jaret Burkett 2024-09-01 06:59:47 -06:00
  • 40f5c59da0 Fixes for training ilora on flux Jaret Burkett 2024-08-31 16:55:26 -06:00
  • 3e71a99df0 Check for contains only against clean name for lora, not the adjusted one Jaret Burkett 2024-08-31 07:44:13 -06:00
  • 562405923f Update README.md for push_to_hub (#143) apolinário 2024-08-30 17:34:28 -05:00
  • f84bd6d7a6 Add Gradio UI for ai-toolkit (#141) apolinário 2024-08-30 07:29:51 -05:00
  • 4fa8fac5fd WIP multidevice training multi-gpu Jaret Burkett 2024-08-29 16:04:20 -06:00
  • a48c9aba8d Created a v2 trainer and moved all the training logic to single torch model so it can be run in parallel Jaret Burkett 2024-08-29 12:34:18 -06:00
  • 60232def91 Made preliminary arch for flux ip adapter training Jaret Burkett 2024-08-28 08:55:39 -06:00
  • 3843e0d148 Added support for vision direct adapter for flux Jaret Burkett 2024-08-26 16:27:28 -06:00