108 Commits

Author SHA1 Message Date
Jaret Burkett
e9ab387dfd Fixed issue with qwen image edit models when using multiple control images when not caching text embeddings. 2026-04-30 11:59:12 +00:00
Jaret Burkett
deb409085a Made it possible to use the flux 2 small decoder VAE when setting the vae_path manually for flux2 models 2026-04-30 05:42:31 -06:00
Jaret Burkett
7ccec8ec2c Add checkpointing and a proper decode for flux 2 VAEs so they can be used with DFE 2026-04-30 04:27:13 -06:00
Jaret Burkett
7c4f18ce51 Fix ernie unpatchify 2026-04-17 12:03:39 -06:00
Jaret Burkett
8cb9649382 Add decode latent to ernie pipe 2026-04-17 12:01:09 -06:00
Jaret Burkett
ab1ee4df34 Hotfix some issues with Wan models caused by diffusers and transformers updates 2026-04-16 20:53:50 +00:00
Jaret Burkett
afb62b1fa5 Add support for Nucleus-Image 2026-04-16 13:09:10 -06:00
Jaret Burkett
dd7074a21f Fix issue with layer offloading on ernie 2026-04-14 19:19:02 -06:00
Jaret Burkett (Ostris)
3e0c904054 Add support for Baidu's ERNIE-Image (#793)
* Add support for ERNIE Image

* change float64 to float32

* Version bump

* Update ERNIE defaults
2026-04-14 09:45:12 -06:00
Jaret Burkett
233e292256 Added some experimental low step things for zeta 2026-04-13 09:37:34 -06:00
Jaret Burkett
9ca58e9aa2 Fixed offload and quantize order of ltx 2.3 text encoder. 2026-04-07 15:11:50 -06:00
M. Hofer
f213e3b1e5 Fix FLUX2 Klein load-time VRAM spikes on low-memory GPUs. (#726)
Keep the transformer and Qwen text encoder off CUDA during initial load/quantization in low-VRAM mode so model startup avoids full-model OOM before offloading and quantization can take effect.

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Jaret Burkett <jaretburkett@gmail.com>
2026-04-01 09:36:55 -06:00
Jaret Burkett
affa411edc Fixed an issue where Flux.2 model VAE can be left offloaded to CPU when encoding control images while caching latents 2026-03-29 09:49:10 -06:00
abionda-sc
4ef5cbe5bc Fixing bug where width and height are inverted for control image resizing (#707) 2026-03-28 13:00:32 -06:00
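Note on the fix above: this kind of inversion usually comes from mixing two axis conventions. PIL-style image sizes are `(width, height)`, while tensor shapes end in `(height, width)`. The sketch below illustrates the conversion only; `pil_size_for_tensor_shape` is a hypothetical helper name, not a function from this repository.

```python
def pil_size_for_tensor_shape(tensor_shape):
    """Return a PIL-style (width, height) tuple for a tensor shaped (..., H, W).

    Hypothetical helper for illustration; not part of the toolkit.
    """
    # Tensor layouts put height before width.
    height, width = tensor_shape[-2], tensor_shape[-1]
    # PIL's Image.resize() expects (width, height), so the order is swapped.
    return (width, height)


# A 512-high, 768-wide target in (batch, channels, H, W) layout.
target_shape = (1, 3, 512, 768)
control_size = pil_size_for_tensor_shape(target_shape)
```

Passing `control_size` to a PIL-style resize call would then produce a control image whose dimensions match the target, assuming the target tensor uses a trailing `(H, W)` layout.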
科林 KELIN
489b194231 Fix CPU/CUDA device mismatch in Klein edit control image encoding (#742)
When training Klein models with a `control_path` (edit/kontext-style
paired datasets), `encode_image_refs()` returns tensors that reside on
the VAE's device (CPU, since the VAE weights are loaded via
`load_file(..., device="cpu")` and are never explicitly moved to the
training device).  Concatenating those CPU tensors with the training
latents (`packed_latents`) that live on CUDA raises:

    RuntimeError: Expected all tensors to be on the same device

Fix: move `img_cond_seq` and `img_cond_seq_ids` to the same device
(and dtype) as `img_input` / `img_input_ids` before concatenation.

Co-authored-by: HuangYuChuh <HuangYuChuh@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-25 11:45:38 -06:00
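Note on the fix above: the pattern the commit message describes can be sketched as follows. The tensor names come from the commit message; the wrapper function `concat_with_refs` and the choice of `dim=1` as the sequence dimension are assumptions made for illustration, not the repository's actual code.

```python
import torch


def concat_with_refs(img_input, img_input_ids, img_cond_seq, img_cond_seq_ids):
    """Append reference-image tokens to the training tokens.

    Hypothetical wrapper; assumes the sequence dimension is dim=1.
    """
    # Align device and dtype with the training tensors before concatenating,
    # so CPU-resident VAE outputs do not collide with CUDA latents.
    img_cond_seq = img_cond_seq.to(device=img_input.device, dtype=img_input.dtype)
    img_cond_seq_ids = img_cond_seq_ids.to(
        device=img_input_ids.device, dtype=img_input_ids.dtype
    )
    img = torch.cat([img_input, img_cond_seq], dim=1)
    img_ids = torch.cat([img_input_ids, img_cond_seq_ids], dim=1)
    return img, img_ids
```

Calling `.to()` with a device and dtype the tensor already has returns the tensor unchanged, so the alignment costs nothing when the inputs already match.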
Jaret Burkett
dfde30f231 Fix issue with ltx2 custom te repo path 2026-03-25 09:50:18 -06:00
Jaret Burkett
7f3309b291 Add support for auto frame count so datasets can have varying length videos. Various ltx 2.3 VAE optimizations such as removing tiling artifacts, and doing frame split encoding to reduce vram on encoding/decoding. 2026-03-24 12:20:09 -06:00
Rayane
99a4a5887b Fix Qwen attention mask crash with diffusers >=0.37 (#748)
* Fix Qwen Image mask handling

* Fix Qwen attention mask crash with diffusers >=0.37

diffusers v0.37 (PR #12987) optimizes all-ones attention masks to None
in encode_prompt() when there is no padding. This breaks ai-toolkit's
Qwen extensions which call .to() on the mask unconditionally.

Fix: reconstruct the all-ones mask at the boundary (get_prompt_embeds)
right after encode_prompt() returns. This keeps the rest of the code
unchanged and works with both old and new diffusers versions.

Also removes redundant duplicate mask assignments in qwen_image_edit
and qwen_image_edit_plus.

Fixes #740
2026-03-23 14:43:08 -06:00
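Note on the fix above: rebuilding the all-ones mask at the boundary might look roughly like this. `ensure_attention_mask` is a hypothetical helper name, and the `(batch, seq_len, hidden)` embedding layout and integer mask dtype are assumptions; the repository's `get_prompt_embeds` may differ in its details.

```python
import torch


def ensure_attention_mask(prompt_embeds, attention_mask):
    """Return a usable mask even when the encoder returned None.

    Hypothetical helper; assumes prompt_embeds is (batch, seq_len, hidden).
    """
    if attention_mask is None:
        # No padding was present, so every token position is valid.
        attention_mask = torch.ones(
            prompt_embeds.shape[:2],
            dtype=torch.long,
            device=prompt_embeds.device,
        )
    return attention_mask
```

Because an existing mask is passed through untouched, the same call works whether or not the installed diffusers version drops all-ones masks.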
Jaret Burkett
295094b4b5 Fixed new breaking change in diffusers with qwen image 2026-03-23 14:10:55 -06:00
Jaret Burkett
5642b656b9 Fix audio issues with ltx2 models. Silent codec failures are now raised. Auto convert surround sound audio to stereo. Invalidate old caches just to be safe so they recache now. 2026-03-23 20:08:33 +00:00
Jaret Burkett
561e6f201c Fixed an issue with ltx 2.3 i2v training 2026-03-23 12:41:18 -06:00
Jaret Burkett
e91827f9be Change gemma repo to lightricks one that is not gated 2026-03-23 11:00:32 -06:00
Jaret Burkett
253cb31362 Fix issue with video and images with no audio on ltx models 2026-03-22 22:09:23 -06:00
Jaret Burkett
4a3d317e2b Fix issue with using the default text encoder with ltx 2.3 2026-03-22 18:53:59 -06:00
Jaret Burkett
859635e95b Add support for training LTX 2.3 (#745)
* Initial support for ltx 2.3. Still needs a lot of testing to make sure it is all right.

* bump version

* Handle lora renaming keys for new ltx 2.3 layers
2026-03-22 17:56:59 -06:00
Jaret Burkett
57d407cfd4 Add support for training lodestones/Zeta-Chroma 2026-03-01 12:52:29 -07:00
Jaret Burkett
e82cf6eec2 Fixed issue that prevented full fine-tuning of flux2 models when using gradient checkpointing 2026-02-06 16:18:43 -07:00
Jaret Burkett
1ce2428722 Shrink text embeds to max token length for LTX-2. Drastically reduces cached text embedding sizes 2026-01-28 12:54:49 -07:00
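Note on the change above: one plausible reading of it is that padded positions are trimmed from the embeddings before they are cached. The sketch below shows that idea under stated assumptions: right-side padding, a `(batch, seq_len, hidden)` layout, and a hypothetical helper name `trim_to_max_tokens`. It is not the repository's implementation.

```python
import torch


def trim_to_max_tokens(prompt_embeds, attention_mask):
    """Drop trailing padding shared by every sample in the batch.

    Hypothetical helper; assumes right-padded sequences.
    """
    # Longest real (unpadded) sequence in the batch.
    max_len = int(attention_mask.sum(dim=1).max().item())
    # Slice both tensors down to that length along the sequence dimension.
    return prompt_embeds[:, :max_len], attention_mask[:, :max_len]
```

With short captions and a long encoder context window, most positions are padding, which is why trimming can shrink cached files substantially.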
Jaret Burkett
a6da9e37ac Add support for FLUX.2 klein base models 2026-01-17 17:46:25 -07:00
Jaret Burkett
0efed794b4 Fix issue where flux2 would ignore single control image on training 2026-01-17 20:26:35 +00:00
Jaret Burkett
e40d7ac605 Ignore i2v on ltx if training on images 2026-01-14 18:46:27 -07:00
Jaret Burkett
9848de7946 Fix issue with ltx cached latents if there is no audio. 2026-01-14 17:27:01 -07:00
Jaret Burkett
73dedbf662 Do caching of latents, first frame and audio when caching latents for LTX2 2026-01-14 11:05:23 -07:00
Jaret Burkett
64fe29b182 Support img 2 vid training for ltx-2 2026-01-13 19:04:56 -07:00
Jaret Burkett
5b5aadadb8 Add LTX-2 Support (#644)
* WIP, adding support for LTX2

* Training on images working

* Fix loading comfy models

* Handle converting and deconverting lora so it matches original format

* Reworked ui to handle ltx and proper dataset default overwriting.

* Update the way lokr saves so it is more compatible with comfy

* Audio loading and synchronization/resampling is working

* Add audio to training. Does it work? Maybe, still testing.

* Fixed fps default issue for sound

* Have ui set fps for accurate audio mapping on ltx

* Added audio processing options to the ui for ltx

* Clean up requirements
2026-01-13 04:55:30 -07:00
Jaret Burkett
0d5c181843 Fixed issue where the control images would sometimes be ignored on qwen_image_edit_2511 2025-12-30 16:40:49 +00:00
Jaret Burkett
e6c5aead3b Fix issue that prevented ramtorch layer offloading with z_image 2025-12-02 16:14:34 -07:00
Jaret Burkett
4e62c38df5 Add support for training Z-Image Turbo with a de-distill training adapter 2025-11-28 08:08:53 -07:00
Jaret Burkett
01cf480233 Add FLUX.2 official weights 2025-11-25 08:52:19 -07:00
Jaret Burkett
dadbeda197 Update test weights 2025-11-23 10:51:50 -07:00
Jaret Burkett
af8e9ea149 Add initial support for FLUX.2 2025-11-18 11:17:38 -07:00
Jaret Burkett
c984369294 Fixed resizing of control image resolution for Qwen Image Edit 2509 when using match_target_res 2025-10-30 06:30:01 -06:00
Jaret Burkett
ff14cd6343 Fix check for making sure vae is on the right device. 2025-10-21 14:49:20 -06:00
Jaret Burkett
76ce757e0c Added initial support for layer offloading with Wan 2.2 14B models. 2025-10-20 14:54:30 -06:00
Jaret Burkett
b7f85928f3 Fix issue with chroma when not quantizing 2025-10-19 12:13:05 -06:00
Jaret Burkett
0c9e1c3deb Fixed some fringe cases for qwen image edit. 2025-10-13 17:10:46 +00:00
Jaret Burkett
e9c4d94256 Allow for matching target resolution with control images for Qwen Image Edit 2509 2025-10-10 14:24:27 -06:00
Jaret Burkett
1bc6dee127 Change auto_memory to be layer_offloading and allow you to set the amount to unload 2025-10-10 13:12:32 -06:00
Jaret Burkett
8068755b0a Fixed issue with wan 2.2 getting stuck on CPU 2025-10-09 17:24:25 +00:00
Jaret Burkett
4e5707854f Initial support for RamTorch. Still a WIP 2025-10-05 13:03:26 -06:00