Commit Graph

508 Commits

Author SHA1 Message Date
Jaret Burkett
bd8d7dc081 Fixed various issues with llm attention masking. Added block training on the llm adapter. 2025-02-14 11:24:01 -07:00
Jaret Burkett
2be6926398 Added back system prompt for llm and removed those tokens from the embeddings 2025-02-14 07:23:37 -07:00
Jaret Burkett
87ac031859 Remove system prompt, shouldn't be necessary for how it works. 2025-02-13 08:42:48 -07:00
Jaret Burkett
7679105d52 Added llm text encoder adapter 2025-02-13 08:28:32 -07:00
Jaret Burkett
2622de1e01 DFE tweaks. Adding support for more llms as text encoders 2025-02-13 04:31:49 -07:00
Jaret Burkett
8450aca10e Fixed missed merge conflict and locked diffusers version 2025-02-12 09:40:02 -07:00
Jaret Burkett
0b8a32def7 Merged in lumina2 branch 2025-02-12 09:33:03 -07:00
Jaret Burkett
787bb37e76 Small fixes for DFE, polar guidance, and other things 2025-02-12 09:27:44 -07:00
Jaret Burkett
10aa7e9d5e Fixed some breaking changes with diffusers gradient checkpointing. 2025-02-10 09:35:31 -07:00
Jaret Burkett
ed1deb71c4 Added examples for training lumina2 2025-02-08 16:13:18 -07:00
Jaret Burkett
4de6a825fa Update lumina requirements 2025-02-08 15:16:35 -07:00
Jaret Burkett
9a7266275d Work on lumina2 2025-02-08 14:52:39 -07:00
Jaret Burkett
d138f07365 Initial lumina2 support 2025-02-08 10:59:53 -07:00
Jaret Burkett
c6d8eedb94 Added ability to use consistent noise for each image in a dataset by hashing the path and using that as a seed. 2025-02-08 07:13:48 -07:00
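
A minimal sketch of the idea in c6d8eedb94: hash the image path into a seed so the same image always gets the same noise across epochs. The function name and hash choice here are illustrative, not the repo's actual code.

```python
import hashlib
import torch

def noise_for_image(image_path: str, shape: tuple) -> torch.Tensor:
    # Hash the file path to a stable 32-bit seed; the same path always
    # yields the same seed, and therefore the same noise tensor.
    seed = int(hashlib.sha256(image_path.encode("utf-8")).hexdigest(), 16) % (2**32)
    generator = torch.Generator().manual_seed(seed)
    return torch.randn(shape, generator=generator)
```
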
Jaret Burkett
af5e760be1 Merge pull request #249 from ostris/accelerate-multi-gpu
Multi gpu support. Other goodies
2025-02-08 07:11:10 -07:00
Jaret Burkett
ff3d54bb5b Make the mean of the mask multiplier be 1.0 for a more balanced loss. 2025-02-06 10:57:06 +00:00
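
The normalization in ff3d54bb5b likely amounts to dividing the mask by its mean before weighting the loss, so the mask changes relative emphasis between regions without changing the overall loss scale. A sketch under that assumption:

```python
import torch
import torch.nn.functional as F

def masked_mse(pred: torch.Tensor, target: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    # Rescale so the mask multiplier has mean 1.0: regions are still
    # up/down-weighted relative to each other, but the expected loss
    # magnitude stays comparable to the unmasked case.
    mask = mask / mask.mean().clamp(min=1e-8)
    return (F.mse_loss(pred, target, reduction="none") * mask).mean()
```
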
Jaret Burkett
0e75724b4d Lock version of diffusers 2025-02-05 18:14:59 +00:00
Jaret Burkett
376bb1bf6f Lock torch version due to breaking changes 2025-02-04 23:00:57 +00:00
Jaret Burkett
216ab164ce Experimental features and bug fixes 2025-02-04 13:36:34 -07:00
Jaret Burkett
e6180d1e1d Bug fixes 2025-01-31 13:23:01 -07:00
Jaret Burkett
15a57bc89f Add new version of DFE. Kitchen sink 2025-01-31 11:42:27 -07:00
Jaret Burkett
e5355bf8d5 Added train catch to the blank network 2025-01-30 16:15:45 +00:00
Jaret Burkett
34a1c6947a Added flux_shift as timestep type 2025-01-27 07:35:00 -07:00
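
The flux_shift timestep type presumably maps uniformly sampled timesteps through the same shift function Flux applies to its sigmas at inference; that function, as published in the Flux reference code, is:

```python
import math

def time_shift(mu: float, sigma: float, t: float) -> float:
    # Shift a timestep t in (0, 1] toward the high-noise end. mu is
    # normally derived from the image token count; sigma is usually 1.0.
    return math.exp(mu) / (math.exp(mu) + (1 / t - 1) ** sigma)
```
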
Jaret Burkett
2141c6e06c Merge remote-tracking branch 'origin/main' into accelerate-multi-gpu 2025-01-26 11:19:34 -07:00
Jaret Burkett
1188cf1e8a Adjust flux sample sampler to handle some new breaking changes in diffusers. 2025-01-26 18:09:21 +00:00
Jaret Burkett
5e663746b8 Working multi gpu training. Still need a lot of tweaks and testing. 2025-01-25 16:46:20 -07:00
Jaret Burkett
441474e81f Added a flag to lora extraction script to do a full transformer extraction. 2025-01-24 09:34:13 -07:00
Jaret Burkett
a6a690f796 Update full fine tune example to only train transformer blocks. 2025-01-24 09:28:34 -07:00
Jaret Burkett
6191f19e55 Added script to convert diffusers model to ComfyUI variant 2025-01-23 21:30:23 -07:00
Jaret Burkett
bbfba0c188 Added v2 of dfp 2025-01-22 16:32:13 -07:00
Jaret Burkett
e1549ad54d Update dfe model arch 2025-01-22 10:37:23 -07:00
Jaret Burkett
04abe57c76 Added weighting to DFE 2025-01-22 08:50:57 -07:00
Jaret Burkett
89dd041b97 Added ability to pair samples with a closer noise via optimal_noise_pairing_samples 2025-01-21 18:30:10 -07:00
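
A plausible reading of optimal_noise_pairing_samples: draw several candidate noise tensors per latent and keep the nearest one, so training sees easier noise-to-image pairings. The selection metric below is an assumption, not confirmed by the commit.

```python
import torch

def pick_paired_noise(latent: torch.Tensor, num_samples: int = 4) -> torch.Tensor:
    # Draw num_samples candidate noise tensors and keep the one with the
    # smallest L2 distance to the latent.
    candidates = torch.randn(num_samples, *latent.shape,
                             device=latent.device, dtype=latent.dtype)
    dists = (candidates - latent.unsqueeze(0)).flatten(1).norm(dim=1)
    return candidates[dists.argmin()]
```
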
Jaret Burkett
29122b1a54 Added code to handle diffusion feature extraction loss 2025-01-21 14:21:34 -07:00
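
The commit message doesn't detail the loss, but a diffusion feature extraction loss is generally a feature-space comparison: run prediction and target through a frozen extractor and penalize the difference there instead of (or alongside) the latent-space loss. A generic sketch, with feat_net as an assumed frozen extractor:

```python
import torch
import torch.nn.functional as F

def feature_loss(feat_net: torch.nn.Module, pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    # Compare in the extractor's feature space; the target branch carries
    # no gradient, since only the prediction should be pulled toward it.
    with torch.no_grad():
        target_feats = feat_net(target)
    return F.mse_loss(feat_net(pred), target_feats)
```
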
Jaret Burkett
6a8e3d8610 Added a config file for full finetuning flex. Added a lora extraction script for flex 2025-01-20 10:09:01 -07:00
Jaret Burkett
4c8a9e1b88 Added example config to train Flex 2025-01-18 18:03:20 -07:00
Jaret Burkett
fadb2f3a76 Allow quantizing the te independently on flux. Added lognorm_blend timestep schedule 2025-01-18 18:02:31 -07:00
Jaret Burkett
4723f23c0d Added ability to split up flux across gpus (experimental). Changed the way timestep scheduling works to prep for more specific schedules. 2024-12-31 07:06:55 -07:00
Jaret Burkett
8ef07a9c36 Added training for an experimental decorator embedding. Allow for turning off guidance embedding on flux (for unreleased model). Various bug fixes and modifications 2024-12-15 08:59:27 -07:00
Jaret Burkett
92ce93140e Adjustments to defaults for automagic 2024-11-29 10:28:06 -07:00
Jaret Burkett
f213996aa5 Fixed saving and displaying for automagic 2024-11-29 08:00:22 -07:00
Jaret Burkett
cbe31eaf0a Initial work on an auto-adjusting optimizer 2024-11-29 04:48:58 -07:00
Jaret Burkett
67c2e44edb Added support for training flux redux adapters 2024-11-21 20:01:52 -07:00
Jaret Burkett
96d418bb95 Added support for full finetuning flux with randomized param activation. Examples coming soon 2024-11-21 13:05:32 -07:00
Jaret Burkett
894374b2e9 Various bug fixes and optimizations for quantized training. Added untested custom adam8bit optimizer. Did some work on LoRM (don't use) 2024-11-20 09:16:55 -07:00
Jaret Burkett
6509ba4484 Fix seed generation to make it deterministic so it is consistent from gpu to gpu 2024-11-15 12:11:13 -07:00
Jaret Burkett
025ee3dd3d Added ability for adafactor to fully fine tune quantized model. 2024-10-30 16:38:07 -06:00
Jaret Burkett
58f9d01c2b Added adafactor implementation that handles stochastic rounding of update and accumulation 2024-10-30 05:25:57 -06:00
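
Stochastic rounding itself is a standard trick: when writing a float32 update into a bfloat16 parameter, randomize the 16 bits that get dropped so the rounding error is zero in expectation. A common bit-level sketch of the general technique (not this repo's exact code):

```python
import torch

def copy_stochastic_(target: torch.Tensor, source: torch.Tensor) -> None:
    # target is bfloat16, source is float32. Add random noise to the low
    # 16 bits of the float32 representation, then zero them out: the value
    # rounds up with probability equal to the discarded fraction.
    bits = source.view(torch.int32) + torch.randint_like(source.view(torch.int32), 0, 1 << 16)
    bits &= -65536  # keep only the bits bfloat16 can represent
    target.copy_(bits.view(torch.float32))
```
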
Jaret Burkett
e72b59a8e9 Added experimental 8bit version of prodigy with stochastic rounding and stochastic gradient accumulation. Still testing. 2024-10-29 14:28:28 -06:00
Jaret Burkett
4aa19b5c1d Only quantize flux T5 if also quantizing model. Load TE from original name and path if fine tuning. 2024-10-29 14:25:31 -06:00