504 Commits

Author SHA1 Message Date
comfyanonymous
18927538a1 Implement NAG on all the models based on the Flux code. (#12500)
Use the Normalized Attention Guidance node.

Flux, Flux2, Klein, Chroma, Chroma radiance, Hunyuan Video, etc.
2026-02-16 23:30:34 -05:00
comfyanonymous
88e6370527 Remove workaround for old pytorch. (#12480) 2026-02-15 20:43:53 -05:00
krigeta
dc9822b7df Add working Qwen 2512 ControlNet (Fun ControlNet) support (#12359) 2026-02-13 22:23:52 -05:00
comfyanonymous
726af73867 Fix some custom nodes. (#12455) 2026-02-13 20:21:10 -05:00
comfyanonymous
e1add563f9 Use torch RMSNorm for flux models and refactor hunyuan video code. (#12432) 2026-02-13 15:35:13 -05:00
comfyanonymous
76a7fa96db Make built in lora training work on anima. (#12402) 2026-02-10 22:04:32 -05:00
Kohaku-Blueleaf
cdcf4119b3 [Trainer] training with proper offloading (#12189)
* Fix bypass dtype/device moving

* Force offloading mode for training

* training context var

* offloading implementation in training node

* fix wrong input type

* Support bypass load lora model, correct adapter/offloading handling
2026-02-10 21:45:19 -05:00
comfyanonymous
039955c527 Some fixes to previous pr. (#12339) 2026-02-06 20:14:52 -05:00
tdrussell
6a26328842 Support fp16 for Cosmos-Predict2 and Anima (#12249) 2026-02-06 20:12:15 -05:00
comfyanonymous
eba6c940fd Make ace step 1.5 base model work properly with default workflow. (#12337) 2026-02-06 19:14:56 -05:00
comfyanonymous
458292fef0 Fix some lowvram stuff with ace step 1.5 (#12312) 2026-02-05 19:15:04 -05:00
comfyanonymous
6555dc65b8 Make ace step 1.5 work without the llm. (#12311) 2026-02-05 16:43:45 -05:00
comfyanonymous
a50c32d63f Disable sage attention on ace step 1.5 (#12297) 2026-02-04 22:15:30 -05:00
comfyanonymous
6125b80979 Add llm sampling options and make reference audio work on ace step 1.5 (#12295) 2026-02-04 21:29:22 -05:00
comfyanonymous
3c1a1a2df8 Basic support for the ace step 1.5 model. (#12237) 2026-02-03 00:06:18 -05:00
rattus
f8acd9c402 Reduce RAM usage, fix VRAM OOMs, and fix Windows shared memory spilling with adaptive model loading (#11845) 2026-02-01 01:01:11 -05:00
comfyanonymous
b8f848bfe3 Fix model not working with any res. (#12186) 2026-01-31 00:12:48 -05:00
rattus
6516ab335d wan-vae: Switch off feature cache for single frame (#12090)
The code throughout is None-safe and simply skips the feature cache saving
step if the cache is None. Set it to None in single-frame use so qwen doesn't
burn VRAM on the unused cache.
2026-01-26 19:40:19 -05:00
comfyanonymous
635406e283 Only enable fp16 on z image models that actually support it. (#12065) 2026-01-24 22:32:28 -05:00
rattus
4e6a1b66a9 speed up and reduce VRAM of QWEN VAE and WAN (less so) (#12036)
* ops: introduce autopad for conv3d

This works around pytorch missing ability to causal pad as part of the
kernel and avoids massive weight duplications for padding.

* wan-vae: rework causal padding

This currently uses F.pad which takes a full deep copy and is liable to
be the VRAM peak. Instead, kick spatial padding back to the op and
consolidate the temporal padding with the cat for the cache.

* wan-vae: implement zero pad fast path

The WAN VAE is also QWEN where it is used single-image. These
convolutions are however zero padded 3d convolutions, which means the
VAE is actually just 2D down the last element of the conv weight in
the temporal dimension. Fast path this, to avoid adding zeros that
then just evaporate in convolution math but cost computation.
2026-01-23 19:56:14 -05:00
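The zero pad fast path described in the commit above can be illustrated in one dimension. This is a minimal sketch, not the ComfyUI code: it assumes a causal temporal convolution with kernel size k and k-1 zero frames padded in front, uses scalar frames as stand-ins for 2D feature maps, and all function names are hypothetical.

```python
def causal_conv_time(frames, weight):
    """Causal temporal convolution: pad k-1 zero frames in front, then slide."""
    k = len(weight)
    padded = [0.0] * (k - 1) + list(frames)
    return [
        sum(weight[j] * padded[t + j] for j in range(k))
        for t in range(len(frames))
    ]


def single_frame_fast_path(frame, weight):
    """With one frame, every tap except the last multiplies a padded zero,
    so only the last temporal element of the weight contributes."""
    return [weight[-1] * frame]


weight = [0.25, 0.5, 2.0]  # hypothetical temporal kernel, k = 3
frame = 3.0                # a single input frame (scalar stand-in)

slow = causal_conv_time([frame], weight)      # pads zeros, then convolves
fast = single_frame_fast_path(frame, weight)  # skips the padded zeros
```

Both paths give the same result for a single frame, but the fast path never materializes the zero padding or multiplies by it.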
Jukka Seppänen
55bd606e92 LTX2: Refactor forward function for better VRAM efficiency and fix spatial inpainting (#12046)
* Disable timestep embed compression when inpainting

Spatial inpainting not compatible with the compression

* Reduce crossattn peak VRAM

* LTX2: Refactor forward function for better VRAM efficiency
2026-01-23 15:26:38 -05:00
Omri Marom
d7f3241bf6 qwen_image: propagate attention mask. (#11966) 2026-01-22 20:02:31 -05:00
rattus
0fd1b78736 Reduce LTX2 VAE VRAM consumption (#12028)
* causal_video_ae: Remove attention ResNet

This attention_head_dim argument does not exist on this constructor so
this is dead code. Remove as generic attention mid VAE conflicts with
temporal roll.

* ltx-vae: consolidate causal/non-causal code paths

* ltx-vae: add cache rolling adder

* ltx-vae: use cached adder for resnet

* ltx-vae: Implement rolling VAE

Implement a temporal rolling VAE for the LTX2 VAE.

Usually when doing temporal rolling VAEs you can just chunk on time relying
on causality and cache behind you as you go. The LTX VAE is however
non-causal.

So go whole hog and implement per layer run ahead and backpressure between
the decoder layers using recursive state between the layers.

Operations are amended with temporal_cache_state{} which they can use to
hold any state they need for partial execution. Convolutions cache up to
N-1 frames of their inputs behind them, and skip connections need to cache
the mismatch between convolution input and output that happens due to
missing future (non-causal) input.

Each call to run_up() processes a layer across a range of input that
may or may not be complete. It goes depth first to process as much as
possible to try and digest frames to the final output ASAP. If layers run
out of input due to convolution losses, they simply return without action,
effectively applying back-pressure to the earlier layers. As the earlier
layers do more work and call deeper, the partial states are reconciled
and output continues to digest depth first as much as possible.

Chunking is done using a size quota rather than a fixed frame length.
Any layer can initiate chunking, and multiple layers can chunk at different
granularities. This removes the old limitation of always having to process
1 latent frame to entirety and having to hold 8 full decoded frames as
the VRAM peak.
2026-01-22 16:54:18 -05:00
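The input caching and back-pressure idea from the commit above can be sketched with a toy streaming convolution. This is only an illustration under simplifying assumptions: 1D "valid" convolutions over scalar frames, no skip connections, no size quota. The names StreamingConv and run_chunked are hypothetical and do not come from the LTX2 VAE code.

```python
class StreamingConv:
    """Temporal 'valid' convolution that can be fed input in chunks."""

    def __init__(self, weight):
        self.weight = list(weight)
        self.cache = []  # up to k-1 trailing inputs kept from earlier chunks

    def push(self, chunk):
        k = len(self.weight)
        buf = self.cache + list(chunk)
        n_out = len(buf) - k + 1
        if n_out <= 0:
            # Not enough input yet: keep everything and emit nothing.
            self.cache = buf
            return []
        out = [
            sum(self.weight[j] * buf[t + j] for j in range(k))
            for t in range(n_out)
        ]
        self.cache = buf[n_out:]  # exactly the last k-1 inputs
        return out


def run_chunked(layers, chunks):
    """Push each chunk as deep as it will go through the layer stack."""
    out = []
    for chunk in chunks:
        data = chunk
        for layer in layers:
            data = layer.push(data)
            if not data:
                break  # back-pressure: deeper layers wait for more input
        else:
            out.extend(data)
    return out


frames = list(range(10))
full = run_chunked(
    [StreamingConv([1, 2, 1]), StreamingConv([1, -1])], [frames]
)
chunked = run_chunked(
    [StreamingConv([1, 2, 1]), StreamingConv([1, -1])],
    [frames[0:1], frames[1:4], frames[4:6], frames[6:10]],
)
```

Feeding uneven chunks produces the same output as a single full pass, while each layer only ever holds its current chunk plus at most k-1 cached frames.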
Jukka Seppänen
16b9aabd52 Support Multi/InfiniteTalk (#10179)
* re-init

* Update model_multitalk.py

* whitespace...

* Update model_multitalk.py

* remove print

* this is redundant

* remove import

* Restore preview functionality

* Move block_idx to transformer_options

* Remove LoopingSamplerCustomAdvanced

* Remove looping functionality, keep extension functionality

* Update model_multitalk.py

* Handle ref_attn_mask with separate patch to avoid having to always return q and k from self_attn

* Chunk attention map calculation for multiple speakers to reduce peak VRAM usage

* Update model_multitalk.py

* Add ModelPatch type back

* Fix for latest upstream

* Use DynamicCombo for cleaner node

Basically just so that single_speaker mode hides mask inputs and 2nd audio input

* Update nodes_wan.py
2026-01-21 23:09:48 -05:00
comfyanonymous
abe2ec26a6 Support the Anima model. (#12012) 2026-01-21 19:44:28 -05:00
Ivan Zorin
965d0ed509 fix: remove normalization of audio in LTX Mel spectrogram creation (#11990)
For LTX Audio VAE, remove normalization of audio during MEL spectrogram creation.
This aligns inference with training and prevents loud audio from being attenuated.
2026-01-20 18:44:28 -05:00
comfyanonymous
8ccc0c94fa Make omni stuff work on regular z image for easier testing. (#11985) 2026-01-20 00:32:00 -05:00
comfyanonymous
2108167f9f Support zimage omni base model. (#11979) 2026-01-19 23:17:38 -05:00
rkfg
0da5a0fe58 Convert mono audio to fake stereo for LTXV VAE encoding (#11965) 2026-01-19 22:12:02 -05:00
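The commit above does not say how the conversion is done. One plausible reading of "fake stereo" is duplicating the single channel, sketched below as an assumption. The function name and the list-of-channels layout are hypothetical and not taken from the ComfyUI code.

```python
def mono_to_fake_stereo(channels):
    """channels: list of per-channel sample lists.

    Assumption: a lone mono channel is copied to both left and right;
    input that already has more than one channel is returned unchanged.
    """
    if len(channels) == 1:
        return [list(channels[0]), list(channels[0])]
    return channels


mono = [[0.1, -0.2, 0.3]]
stereo = mono_to_fake_stereo(mono)
```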
Jukka Seppänen
fd5c0755af Reduce LTX2 VRAM use by more efficient timestep embed handling (#11829) 2026-01-12 17:28:59 -05:00
comfyanonymous
1a20656448 Fix import issue. (#11746) 2026-01-08 17:23:59 -05:00
comfyanonymous
023cf13721 Fix lowvram issue with ltxv2 text encoder. (#11675) 2026-01-06 17:33:03 -05:00
comfyanonymous
c3c3e93c5b Use rope functions from comfy kitchen. (#11674) 2026-01-06 16:57:50 -05:00
comfyanonymous
1618002411 Revert "Use rope functions from comfy kitchen. (#11647)" (#11648)
This reverts commit 6ef85c4915.
2026-01-05 23:07:39 -05:00
comfyanonymous
6ef85c4915 Use rope functions from comfy kitchen. (#11647) 2026-01-05 22:50:35 -05:00
comfyanonymous
f2b002372b Support the LTXV 2 model. (#11632) 2026-01-05 01:58:59 -05:00
comfyanonymous
65cfcf5b1b New Year ruff cleanup. (#11595) 2026-01-01 22:06:14 -05:00
mengqin
0357ed7ec4 Add support for sage attention 3 in comfyui, enable via new cli arg (#11026)
* Add support for sage attention 3 in comfyui, enable via new cli arg
--use-sage-attiention3

* Fix some bugs found in PR review. The N dimension at which Sage
Attention 3 takes effect is reduced to 1024 (although the improvement is
not significant at this scale).

* Remove the Sage Attention3 switch, but retain the attention function
registration.

* Fix a ruff check issue in attention.py
2025-12-30 22:53:52 -05:00
comfyanonymous
8fd07170f1 Comment out unused norm_final in lumina/z image model. (#11545) 2025-12-28 22:07:25 -05:00
comfyanonymous
31e961736a Fix issue with batches and newbie. (#11435) 2025-12-20 00:23:51 -05:00
comfyanonymous
28eaab608b Diffusion model part of Qwen Image Layered. (#11408)
Only thing missing after this is some nodes to make using it easier.
2025-12-18 20:21:14 -05:00
comfyanonymous
e4fb3a3572 Support loading Wan/Qwen VAEs with different in/out channels. (#11405) 2025-12-18 17:45:33 -05:00
comfyanonymous
ffdd53b327 Check state dict key to auto enable the index_timestep_zero ref method. (#11362) 2025-12-16 17:03:17 -05:00
comfyanonymous
bc606d7d64 Add a way to set the default ref method in the qwen image code. (#11349) 2025-12-16 01:26:55 -05:00
Haoming
ea2c117bc3 [BlockInfo] Wan (#10845)
* block info

* animate

* tensor

* device

* revert
2025-12-15 17:59:16 -08:00
Haoming
fc4af86068 [BlockInfo] Lumina (#11227)
* block info

* device

* Make tensor int again

---------

Co-authored-by: Jedrzej Kosinski <kosinkadink1@gmail.com>
2025-12-15 17:57:28 -08:00
comfyanonymous
70541d4e77 Support the new qwen edit 2511 reference method. (#11340)
index_timestep_zero can be selected in the
FluxKontextMultiReferenceLatentMethod now with the display name set to the
more generic "Edit Model Reference Method" node.
2025-12-15 19:20:34 -05:00
comfyanonymous
da2bfb5b0a Basic implementation of z image fun control union 2.0 (#11304)
The inpaint part is currently missing and will be implemented later.

I think they messed up this model pretty bad. They added some
control_noise_refiner blocks but don't actually use them. There is a typo
in their code so instead of doing control_noise_refiner -> control_layers
it runs the whole control_layers twice.

Unfortunately they trained with this typo so the model works but is kind
of slow and would probably perform a lot better if they corrected their
code and trained it again.
2025-12-13 01:39:11 -05:00
Jukka Seppänen
e2a800e7ef Fix for HunyuanVideo1.5 meanflow distil (#11212) 2025-12-09 16:59:16 -05:00
Lodestone
b9fb542703 add chroma-radiance-x0 mode (#11197) 2025-12-08 23:33:29 -05:00