ai-toolkit

mirror of https://github.com/ostris/ai-toolkit.git synced 2026-01-26 16:39:47 +00:00

Author	SHA1	Message	Date
Jaret Burkett	193c1b2dfa	Add a watcher to constantly check for stop signal from the UI. This will force a stop within 2 seconds instead of having to wait on a long hung process.	2025-08-31 16:58:01 -06:00
Jaret Burkett	056711d4ed	Fix issue with wan22 14b that would load both transformers temporarily, resulting in OOM on 24GB.	2025-08-28 13:06:31 -06:00
Jaret Burkett	1f541bc5d8	Changes to handle a different DFE arch	2025-08-27 11:05:16 -06:00
Jaret Burkett	119653c3f2	Force width, height, and num frames to always be the proper sizes for Wan 2.2 models	2025-08-25 10:33:28 -06:00
Jaret Burkett	e1fd411665	Added support for Chroma1 official release. Will still use single file version instead of the diffusers version.	2025-08-23 09:06:28 -06:00
Jaret Burkett	aa99784b89	Add control to prompt encodings in the trainer when not cached	2025-08-21 16:52:13 -06:00
Jaret Burkett	bf2700f7be	Initial support for finetuning qwen image. Will only work with caching for now, need to add controls everywhere.	2025-08-21 16:41:17 -06:00
Jaret Burkett	83deaec417	Minor bug fixes	2025-08-21 08:05:34 -06:00
Jaret Burkett	d2bbe1872c	Add support for fine tuning Wan 2.2 I2V 14B	2025-08-18 11:43:32 -06:00
Jaret Burkett	b3e666daf4	Fix issue with wan22 14b where timesteps were generated not in the current boundary.	2025-08-16 21:16:48 -06:00
Jaret Burkett	6fffadfc0e	Fixed a bug that prevented training just one stage of Wan 2.2 14b	2025-08-16 18:07:21 -06:00
Jaret Burkett	8ea2cf00f6	Added training to the ui. Still testing, but everything seems to be working.	2025-08-16 05:51:37 -06:00
Jaret Burkett	1c96b95617	Fix issue where sometimes the transformer does not get loaded properly.	2025-08-14 14:24:41 -06:00
Jaret Burkett	3413fa537f	Wan22 14b training is working, still need tons of testing and some bug fixes	2025-08-14 13:03:27 -06:00
Jaret Burkett	259d68d440	Added a flush during sampling to prevent spikes on low vram qwen	2025-08-12 12:57:18 -06:00
Jaret Burkett	77b10d884d	Add support for training with an accuracy recovery adapter with qwen image	2025-08-12 08:21:36 -06:00
Jaret Burkett	4ad18f3d00	Clip max token embeddings to the max rope length for qwen image to solve for an error for super long captions > 1024	2025-08-10 08:44:41 -06:00
Jaret Burkett	f0105c33a7	Fixed issue that sometimes happens in qwen image where text seq length is wrong	2025-08-09 16:33:05 -06:00
Jaret Burkett	bb6db3d635	Added support for caching text embeddings. This is just initial support and will probably fail for some models. Still needs to be optimized	2025-08-07 10:27:55 -06:00
Jaret Burkett	4c4a10d439	Remove vision model from qwen text encoder as it is not needed for image generation currently	2025-08-06 11:40:02 -06:00
Jaret Burkett	14ccf2f3ce	Refactor qwen5b model code to be qwen 5b specific	2025-08-06 10:54:56 -06:00
Jaret Burkett	5d8922fca2	Add ability to designate a dataset as i2v or t2v for models that support it	2025-08-06 09:29:47 -06:00
Jaret Burkett	93202c7a2b	Training working for Qwen Image	2025-08-04 21:14:30 +00:00
Jaret Burkett	9da8b5408e	Initial but untested support for qwen_image	2025-08-04 13:29:37 -06:00
Jaret Burkett	a558d5b68f	Move transformer back to device on aggressive wan 2.2 pipeline after generation.	2025-07-29 09:13:47 -06:00
Jaret Burkett	1d1199b15b	Fix bug that prevented training wan 2.2 with batch size greater than 1	2025-07-29 09:06:25 -06:00
Jaret Burkett	ca7c5c950b	Add support for Wan2.2 5B	2025-07-29 05:31:54 -06:00
Jaret Burkett	cefa2ca5fe	Added initial support for Hidream E1 training	2025-07-27 15:12:56 -06:00
Daniel Verdu	a77ba5a089	fix: Guidance incorrect shape	2025-07-18 12:49:18 +02:00
Jaret Burkett	611969ec1f	Allow control image for omnigen training and sampling	2025-07-09 13:54:55 -06:00
Jaret Burkett	bbb57de6ec	Speed up omnigen TE loading	2025-07-05 09:32:00 -06:00
Jaret Burkett	5906a76666	Fixed issue with flux kontext forcing generation image sizes	2025-06-29 05:38:20 -06:00
Jaret Burkett	57a81bc0db	Update base model version for kontext meta	2025-06-28 14:48:36 -06:00
Jaret Burkett	01a3c8a9b1	Fix device issue	2025-06-26 19:14:25 -06:00
Jaret Burkett	4f91cb7148	Fix issue with gradient checkpointing and flux kontext	2025-06-26 19:03:12 -06:00
Jaret Burkett	446b0b6989	Remove revision for kontext	2025-06-26 16:46:58 -06:00
Jaret Burkett	60ef2f1df7	Added support for FLUX.1-Kontext-dev	2025-06-26 15:24:37 -06:00
Jaret Burkett	8d9c47316a	Work on mean flow. Minor bug fixes. Omnigen improvements	2025-06-26 13:46:20 -06:00
Jaret Burkett	84c6edca7e	Merge branch 'main' into dev	2025-06-25 14:10:25 -06:00
Jaret Burkett	19ea8ecc38	Added support for finetuning OmniGen2.	2025-06-25 13:58:16 -06:00
Jaret Burkett	18513ec866	Merged in from main	2025-06-24 10:56:54 -06:00
Jaret Burkett	f3eb1dff42	Add a config flag to trigger fast image size db builder. Add config flag to set unconditional prompt for guidance loss	2025-06-24 08:51:29 -06:00
Jaret Burkett	ba1274d99e	Added a guidance burning loss. Modified DFE to work with new model. Bug fixes	2025-06-23 08:38:27 -06:00
Jaret Burkett	8602470952	Updated diffusion feature extractor	2025-06-19 15:36:10 -06:00
Jaret Burkett	1cc663a664	Performance optimizations for pre processing the batch	2025-06-17 07:37:41 -06:00
Jaret Burkett	1c2b7298dd	More work on mean flow loss. Moved it to an adapter. Still not functioning properly though.	2025-06-16 07:17:35 -06:00
Jaret Burkett	c0314ba325	Fixed some issues with training mean flow algo. Still testing WIP	2025-06-16 07:14:59 -06:00
Jaret Burkett	cbf04b8d53	Fixed some issues with training mean flow algo. Still testing WIP	2025-06-14 12:24:00 -06:00
Jaret Burkett	fc83eb7691	WIP on mean flow loss. Still a WIP.	2025-06-12 08:00:51 -06:00
Jaret Burkett	eefa93f16e	Various code to support experiments.	2025-06-09 11:19:21 -06:00

257 Commits