ai-toolkit

mirror of https://github.com/ostris/ai-toolkit.git synced 2026-05-11 16:30:40 +00:00

Author	SHA1	Message	Date
Jaret Burkett	e9ab387dfd	Fixed issue with qwen image edit models when using multiple control images when not caching text embeddings.	2026-04-30 11:59:12 +00:00
Jaret Burkett	deb409085a	Made it possible to use the flux 2 small decoder VAE when setting the vae_path manually for flux2 models	2026-04-30 05:42:31 -06:00
Jaret Burkett	7ccec8ec2c	Add checkpointing and a proper decode for flux 2 VAEs so they can be used with DFE	2026-04-30 04:27:13 -06:00
Jaret Burkett	7c4f18ce51	Fix ernie unpatchify	2026-04-17 12:03:39 -06:00
Jaret Burkett	8cb9649382	Add decode latent to ernie pipe	2026-04-17 12:01:09 -06:00
Jaret Burkett	ab1ee4df34	Hotfix some issues with Wan models caused by diffusers and transformers updates	2026-04-16 20:53:50 +00:00
Jaret Burkett	afb62b1fa5	Add support for Nucleus-Image	2026-04-16 13:09:10 -06:00
Jaret Burkett	dd7074a21f	Fix issue with layer offloading on ernie	2026-04-14 19:19:02 -06:00
Jaret Burkett (Ostris)	3e0c904054	Add support for Baidu's ERNIE-Image (#793) * Add support for ERNIE Image * change float64 to float32 * Version bump * Update ERNIE defaults	2026-04-14 09:45:12 -06:00
Jaret Burkett	233e292256	Added some experimental low step things for zeta	2026-04-13 09:37:34 -06:00
Jaret Burkett	9ca58e9aa2	Fixed offload and quantize order of ltx 2.3 text encoder.	2026-04-07 15:11:50 -06:00
M. Hofer	f213e3b1e5	Fix FLUX2 Klein load-time VRAM spikes on low-memory GPUs. (#726) Keep the transformer and Qwen text encoder off CUDA during initial load/quantization in low-VRAM mode so model startup avoids full-model OOM before offloading and quantization can take effect. Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: Jaret Burkett <jaretburkett@gmail.com>	2026-04-01 09:36:55 -06:00
Jaret Burkett	affa411edc	Fixed an issue where Flux.2 model VAE can be left offloaded to CPU when encoding control images while caching latents	2026-03-29 09:49:10 -06:00
abionda-sc	4ef5cbe5bc	Fix bug where width and height are inverted for control image resizing (#707)	2026-03-28 13:00:32 -06:00
科林 KELIN	489b194231	Fix CPU/CUDA device mismatch in Klein edit control image encoding (#742) When training Klein models with a `control_path` (edit/kontext-style paired datasets), `encode_image_refs()` returns tensors that reside on the VAE's device (CPU, since the VAE weights are loaded via `load_file(..., device="cpu")` and are never explicitly moved to the training device). Concatenating those CPU tensors with the training latents (`packed_latents`) that live on CUDA raises `RuntimeError: Expected all tensors to be on the same device`. Fix: move `img_cond_seq` and `img_cond_seq_ids` to the same device (and dtype) as `img_input` / `img_input_ids` before concatenation. Co-authored-by: HuangYuChuh <HuangYuChuh@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-25 11:45:38 -06:00
Jaret Burkett	dfde30f231	Fix issue with ltx2 custom te repo path	2026-03-25 09:50:18 -06:00
Jaret Burkett	7f3309b291	Add support for auto frame count so datasets can have varying length videos. Various ltx 2.3 VAE optimizations such as removing tiling artifacts, and doing frame split encoding to reduce vram on encoding/decoding.	2026-03-24 12:20:09 -06:00
Rayane	99a4a5887b	Fix Qwen attention mask crash with diffusers >=0.37 (#748) * Fix Qwen Image mask handling * Fix Qwen attention mask crash with diffusers >=0.37. diffusers v0.37 (PR #12987) optimizes all-ones attention masks to None in encode_prompt() when there is no padding. This breaks ai-toolkit's Qwen extensions, which call .to() on the mask unconditionally. Fix: reconstruct the all-ones mask at the boundary (get_prompt_embeds) right after encode_prompt() returns. This keeps the rest of the code unchanged and works with both old and new diffusers versions. Also removes redundant duplicate mask assignments in qwen_image_edit and qwen_image_edit_plus. Fixes #740	2026-03-23 14:43:08 -06:00
Jaret Burkett	295094b4b5	Fixed new breaking change in diffusers with qwen image	2026-03-23 14:10:55 -06:00
Jaret Burkett	5642b656b9	Fix audio issues with ltx2 models. Silent codec failures are now raised. Auto convert surround sound audio to stereo. Invalidate old caches just to be safe so they recache now.	2026-03-23 20:08:33 +00:00
Jaret Burkett	561e6f201c	Fixed an issue with ltx 2.3 i2v training	2026-03-23 12:41:18 -06:00
Jaret Burkett	e91827f9be	Change gemma repo to lightricks one that is not gated	2026-03-23 11:00:32 -06:00
Jaret Burkett	253cb31362	Fix issue with video and images with no audio on ltx models	2026-03-22 22:09:23 -06:00
Jaret Burkett	4a3d317e2b	Fix issue with using the default text encoder with ltx 2.3	2026-03-22 18:53:59 -06:00
Jaret Burkett	859635e95b	Add support for training LTX 2.3 (#745) * Initial support for ltx 2.3. Still needs a lot of testing to make sure it is all right. * bump version * Handle lora renaming keys for new ltx 2.3 layers	2026-03-22 17:56:59 -06:00
Jaret Burkett	57d407cfd4	Add support for training lodestones/Zeta-Chroma	2026-03-01 12:52:29 -07:00
Jaret Burkett	e82cf6eec2	Fixed issue that prevented full fine-tuning of flux2 models when using gradient checkpointing	2026-02-06 16:18:43 -07:00
Jaret Burkett	1ce2428722	Shrink text embeds to max token length for LTX-2. Drastically reduces cached text embedding sizes	2026-01-28 12:54:49 -07:00
Jaret Burkett	a6da9e37ac	Add support for FLUX.2 klein base models	2026-01-17 17:46:25 -07:00
Jaret Burkett	0efed794b4	Fix issue where flux2 would ignore single control image on training	2026-01-17 20:26:35 +00:00
Jaret Burkett	e40d7ac605	Ignore i2v on ltx if training on images	2026-01-14 18:46:27 -07:00
Jaret Burkett	9848de7946	Fix issue with ltx cached latents if there is no audio.	2026-01-14 17:27:01 -07:00
Jaret Burkett	73dedbf662	Do caching of latents, first frame and audio when caching latents for LTX2	2026-01-14 11:05:23 -07:00
Jaret Burkett	64fe29b182	Support img 2 vid training for ltx-2	2026-01-13 19:04:56 -07:00
Jaret Burkett	5b5aadadb8	Add LTX-2 Support (#644) * WIP, adding support for LTX2 * Training on images working * Fix loading comfy models * Handle converting and deconverting lora so it matches original format * Reworked ui to handle ltx and proper dataset default overwriting. * Update the way lokr saves so it is more compatible with comfy * Audio loading and synchronization/resampling is working * Add audio to training. Does it work? Maybe, still testing. * Fixed fps default issue for sound * Have ui set fps for accurate audio mapping on ltx * Added audio processing options to the ui for ltx * Clean up requirements	2026-01-13 04:55:30 -07:00
Jaret Burkett	0d5c181843	Fixed issue where the control images would sometimes be ignored on qwen_image_edit_2511	2025-12-30 16:40:49 +00:00
Jaret Burkett	e6c5aead3b	Fix issue that prevented ramtorch layer offloading with z_image	2025-12-02 16:14:34 -07:00
Jaret Burkett	4e62c38df5	Add support for training Z-Image Turbo with a de-distill training adapter	2025-11-28 08:08:53 -07:00
Jaret Burkett	01cf480233	Add FLUX.2 official weights	2025-11-25 08:52:19 -07:00
Jaret Burkett	dadbeda197	Update test weights	2025-11-23 10:51:50 -07:00
Jaret Burkett	af8e9ea149	Add initial support for FLUX.2	2025-11-18 11:17:38 -07:00
Jaret Burkett	c984369294	Fixed resizing of control image resolution for Qwen Image Edit 2509 when using match_target_res	2025-10-30 06:30:01 -06:00
Jaret Burkett	ff14cd6343	Fix check for making sure vae is on the right device.	2025-10-21 14:49:20 -06:00
Jaret Burkett	76ce757e0c	Added initial support for layer offloading with Wan 2.2 14B models.	2025-10-20 14:54:30 -06:00
Jaret Burkett	b7f85928f3	Fix issue with chroma when not quantizing	2025-10-19 12:13:05 -06:00
Jaret Burkett	0c9e1c3deb	Fixed some fringe cases for qwen image edit.	2025-10-13 17:10:46 +00:00
Jaret Burkett	e9c4d94256	Allow for matching target resolution with control images for Qwen Image Edit 2509	2025-10-10 14:24:27 -06:00
Jaret Burkett	1bc6dee127	Change auto_memory to be layer_offloading and allow you to set the amount to unload	2025-10-10 13:12:32 -06:00
Jaret Burkett	8068755b0a	Fixed issue with wan 2.2 getting stuck on CPU	2025-10-09 17:24:25 +00:00
Jaret Burkett	4e5707854f	Initial support for RamTorch. Still a WIP	2025-10-05 13:03:26 -06:00

108 Commits
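
Commit 99a4a5887b describes rebuilding an all-ones attention mask when the encoder hands back None instead of a mask. A minimal sketch of that idea follows. It is not the code from the commit: the function name `restore_attention_mask` and its parameters are hypothetical, and plain nested lists stand in for tensors so the example runs without any dependencies.

```python
from typing import List, Optional

Mask = List[List[int]]


def restore_attention_mask(
    mask: Optional[Mask], batch_size: int, seq_len: int
) -> Mask:
    """Return the mask unchanged, or an all-ones mask if it is None.

    Hypothetical helper: a None mask is treated as "no padding",
    which is equivalent to attending to every token.
    """
    if mask is None:
        # Rebuild the mask that was optimized away upstream.
        return [[1] * seq_len for _ in range(batch_size)]
    return mask


# A None mask becomes an explicit all-ones mask of the requested shape.
restored = restore_attention_mask(None, batch_size=2, seq_len=3)

# An existing mask (with padding marked as 0) passes through untouched.
padded = [[1, 1, 0], [1, 0, 0]]
unchanged = restore_attention_mask(padded, batch_size=2, seq_len=3)
```

Normalizing the mask once, right where the encoder output enters the calling code, means downstream code that calls methods on the mask unconditionally needs no changes.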