Commit Graph

96 Commits

Author SHA1 Message Date
Jaret Burkett
affa411edc Fixed an issue where Flux.2 model VAE can be left offloaded to CPU when encoding control images while caching latents 2026-03-29 09:49:10 -06:00
abionda-sc
4ef5cbe5bc Fixing bug where width and height are inverted for control image resizing (#707) 2026-03-28 13:00:32 -06:00
科林 KELIN
489b194231 Fix CPU/CUDA device mismatch in Klein edit control image encoding (#742)
When training Klein models with a `control_path` (edit/kontext-style
paired datasets), `encode_image_refs()` returns tensors that reside on
the VAE's device (CPU, since the VAE weights are loaded via
`load_file(..., device="cpu")` and are never explicitly moved to the
training device).  Concatenating those CPU tensors with the training
latents (`packed_latents`) that live on CUDA raises:

    RuntimeError: Expected all tensors to be on the same device

Fix: move `img_cond_seq` and `img_cond_seq_ids` to the same device
(and dtype) as `img_input` / `img_input_ids` before concatenation.

Co-authored-by: HuangYuChuh <HuangYuChuh@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-25 11:45:38 -06:00
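The fix described in 489b194231 can be illustrated with a minimal sketch. This is not the repository's code: the helper name `align_and_concat`, the concatenation axis, and the tensor shapes are hypothetical, and only the variable names `img_input` and `img_cond_seq` come from the commit message. It assumes PyTorch is installed and runs on CPU only, so a dtype mismatch stands in for the CPU/CUDA mismatch.

```python
import torch


def align_and_concat(img_input, img_cond_seq, dim=1):
    """Move the conditioning tensor onto img_input's device and dtype, then concatenate.

    Hypothetical helper: the name and the default `dim` are assumptions for illustration.
    """
    # A single .to() call handles both the device and the dtype.
    img_cond_seq = img_cond_seq.to(device=img_input.device, dtype=img_input.dtype)
    return torch.cat((img_input, img_cond_seq), dim=dim)


# Stand-in tensors; in training, img_input would live on the CUDA device.
img_input = torch.zeros(1, 4, 8, dtype=torch.bfloat16)
img_cond_seq = torch.ones(1, 2, 8, dtype=torch.float32)

out = align_and_concat(img_input, img_cond_seq)
print(out.shape, out.dtype, out.device)
```

Aligning to the target tensor, rather than hard-coding a device string, keeps the same code path valid whether the VAE is on CPU or already on the training device.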
Jaret Burkett
dfde30f231 Fix issue with ltx2 custom te repo path 2026-03-25 09:50:18 -06:00
Jaret Burkett
7f3309b291 Add support for auto frame count so datasets can have varying length videos. Various ltx 2.3 VAE optimizations such as removing tiling artifacts, and doing frame split encoding to reduce vram on encoding/decoding. 2026-03-24 12:20:09 -06:00
Rayane
99a4a5887b Fix Qwen attention mask crash with diffusers >=0.37 (#748)
* Fix Qwen Image mask handling

* Fix Qwen attention mask crash with diffusers >=0.37

diffusers v0.37 (PR #12987) optimizes all-ones attention masks to None
in encode_prompt() when there is no padding. This breaks ai-toolkit's
Qwen extensions which call .to() on the mask unconditionally.

Fix: reconstruct the all-ones mask at the boundary (get_prompt_embeds)
right after encode_prompt() returns. This keeps the rest of the code
unchanged and works with both old and new diffusers versions.

Also removes redundant duplicate mask assignments in qwen_image_edit
and qwen_image_edit_plus.

Fixes #740
2026-03-23 14:43:08 -06:00
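The approach described in 99a4a5887b, rebuilding an all-ones mask when the encoder returns None, might look roughly like the sketch below. The helper name `ensure_attention_mask`, the mask dtype, and the assumed `(batch, seq_len, hidden)` embedding layout are illustrative assumptions, not the actual ai-toolkit or diffusers code. It assumes PyTorch is installed.

```python
import torch


def ensure_attention_mask(prompt_embeds, mask):
    """Return `mask` unchanged, or an all-ones mask if it was optimized away to None.

    Hypothetical helper: assumes prompt_embeds has shape (batch, seq_len, hidden).
    """
    if mask is None:
        # No padding means every token is attended to, so all ones is equivalent.
        mask = torch.ones(
            prompt_embeds.shape[:2],
            dtype=torch.long,  # assumed dtype, not confirmed by the commit
            device=prompt_embeds.device,
        )
    return mask


embeds = torch.zeros(2, 5, 16)
rebuilt = ensure_attention_mask(embeds, None)  # the newer behaviour: mask is None
existing = torch.tensor([[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]])
passed_through = ensure_attention_mask(embeds, existing)  # the older behaviour
print(rebuilt.shape, passed_through is existing)
```

Normalizing at the boundary means downstream code that calls `.to()` on the mask keeps working without a None check at every call site.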
Jaret Burkett
295094b4b5 Fixed new breaking change in diffusers with qwen image 2026-03-23 14:10:55 -06:00
Jaret Burkett
5642b656b9 Fix audio issues with ltx2 models. Silent codec failures are now raised. Auto convert surround sound audio to stereo. Invalidate old caches just to be safe so they recache now. 2026-03-23 20:08:33 +00:00
Jaret Burkett
561e6f201c Fixed an issue with ltx 2.3 i2v training 2026-03-23 12:41:18 -06:00
Jaret Burkett
e91827f9be Change gemma repo to lightricks one that is not gated 2026-03-23 11:00:32 -06:00
Jaret Burkett
253cb31362 Fix issue with video and images with no audio on ltx models 2026-03-22 22:09:23 -06:00
Jaret Burkett
4a3d317e2b Fix issue with using the default text encoder with ltx 2.3 2026-03-22 18:53:59 -06:00
Jaret Burkett
859635e95b Add support for training LTX 2.3 (#745)
* Initial support for ltx 2.3. Still needs a lot of testing to make sure it is all right.

* bump version

* Handle lora renaming keys for new ltx 2.3 layers
2026-03-22 17:56:59 -06:00
Jaret Burkett
57d407cfd4 Add support for training lodestones/Zeta-Chroma 2026-03-01 12:52:29 -07:00
Jaret Burkett
e82cf6eec2 Fixed issue that prevented full fine-tuning of flux2 models when using gradient checkpointing 2026-02-06 16:18:43 -07:00
Jaret Burkett
1ce2428722 Shrink text embeds to max token length for LTX-2. Drastically reduces cached text embedding sizes 2026-01-28 12:54:49 -07:00
Jaret Burkett
a6da9e37ac Add support for FLUX.2 klein base models 2026-01-17 17:46:25 -07:00
Jaret Burkett
0efed794b4 Fix issue where flux2 would ignore single control image on training 2026-01-17 20:26:35 +00:00
Jaret Burkett
e40d7ac605 Ignore i2v on ltx if training on images 2026-01-14 18:46:27 -07:00
Jaret Burkett
9848de7946 Fix issue with ltx cached latents if there is no audio. 2026-01-14 17:27:01 -07:00
Jaret Burkett
73dedbf662 Do caching of latents, first frame and audio when caching latents for LTX2 2026-01-14 11:05:23 -07:00
Jaret Burkett
64fe29b182 Support img 2 vid training for ltx-2 2026-01-13 19:04:56 -07:00
Jaret Burkett
5b5aadadb8 Add LTX-2 Support (#644)
* WIP, adding support for LTX2

* Training on images working

* Fix loading comfy models

* Handle converting and deconverting lora so it matches original format

* Reworked ui to handle ltx and proper dataset default overwriting.

* Update the way lokr saves so it is more compatible with comfy

* Audio loading and synchronization/resampling is working

* Add audio to training. Does it work? Maybe, still testing.

* Fixed fps default issue for sound

* Have ui set fps for accurate audio mapping on ltx

* Added audio processing options to the ui for ltx

* Clean up requirements
2026-01-13 04:55:30 -07:00
Jaret Burkett
0d5c181843 Fixed issue where the control images would sometimes be ignored on qwen_image_edit_2511 2025-12-30 16:40:49 +00:00
Jaret Burkett
e6c5aead3b Fix issue that prevented ramtorch layer offloading with z_image 2025-12-02 16:14:34 -07:00
Jaret Burkett
4e62c38df5 Add support for training Z-Image Turbo with a de-distill training adapter 2025-11-28 08:08:53 -07:00
Jaret Burkett
01cf480233 Add FLUX.2 official weights 2025-11-25 08:52:19 -07:00
Jaret Burkett
dadbeda197 Update test weights 2025-11-23 10:51:50 -07:00
Jaret Burkett
af8e9ea149 Add initial support for FLUX.2 2025-11-18 11:17:38 -07:00
Jaret Burkett
c984369294 Fixed resizing of control image resolution for Qwen Image Edit 2509 when using match_target_res 2025-10-30 06:30:01 -06:00
Jaret Burkett
ff14cd6343 Fix check for making sure vae is on the right device. 2025-10-21 14:49:20 -06:00
Jaret Burkett
76ce757e0c Added initial support for layer offloading with Wan 2.2 14B models. 2025-10-20 14:54:30 -06:00
Jaret Burkett
b7f85928f3 Fix issue with chroma when not quantizing 2025-10-19 12:13:05 -06:00
Jaret Burkett
0c9e1c3deb Fixed some fringe cases for qwen image edit. 2025-10-13 17:10:46 +00:00
Jaret Burkett
e9c4d94256 Allow for matching target resolution with control images for Qwen Image Edit 2509 2025-10-10 14:24:27 -06:00
Jaret Burkett
1bc6dee127 Change auto_memory to be layer_offloading and allow you to set the amount to unload 2025-10-10 13:12:32 -06:00
Jaret Burkett
8068755b0a Fixed issue with wan 2.2 getting stuck on CPU 2025-10-09 17:24:25 +00:00
Jaret Burkett
4e5707854f Initial support for RamTorch. Still a WIP 2025-10-05 13:03:26 -06:00
Jaret Burkett
b7c04efb44 A commit with the edits properly named: improvements to the qwen image edit plus workflow. Fixed a bug. Don't norm the cfg 2025-10-01 14:13:15 -06:00
Jaret Burkett
3086a58e5b git status 2025-10-01 14:12:17 -06:00
Jaret Burkett
67ed563e03 fix issue with multi batch size on qwen-image-edit-plus 2025-09-30 09:04:56 -06:00
Jaret Burkett
6da417261c Add extra detachments just to be sure on qiep 2025-09-27 08:53:59 -06:00
Jaret Burkett
e04f55c553 Fixed scaling issue with control images 2025-09-26 11:49:53 -06:00
Jaret Burkett
454be0958a Initial support for qwen image edit plus 2025-09-24 11:39:10 -06:00
Jaret Burkett
20dfe1b4d5 Small double tap of detach on qwen just for good measure 2025-09-18 08:22:04 -06:00
Jaret Burkett
b95c17dc17 Add initial support for chroma radiance 2025-09-10 08:41:05 -06:00
Jaret Burkett
056711d4ed Fix issue with wan22 14b that would load both transformers temporarily, resulting in oom on 24GB. 2025-08-28 13:06:31 -06:00
Jaret Burkett
119653c3f2 Force width, height, and num frames to always be the proper sizes for Wan 2.2 models 2025-08-25 10:33:28 -06:00
Jaret Burkett
e1fd411665 Added support for Chroma1 official release. Will still use single file version instead of the diffusers version. 2025-08-23 09:06:28 -06:00
Jaret Burkett
bf2700f7be Initial support for finetuning qwen image. Will only work with caching for now, need to add controls everywhere. 2025-08-21 16:41:17 -06:00