Compare commits


53 Commits

Author SHA1 Message Date
bymyself
13a4008735 refactor: simplify redundant code in video encoding
- Remove redundant pix_fmt assignment (set once instead of in both branches)
- Remove unnecessary VideoContainer.AUTO check (format already normalized)
2026-01-20 12:55:34 -08:00
bymyself
bb7c582b58 fix: add fail-fast validation for unsupported codecs
Replace silent fallback to libx264 with explicit validation that raises
ValueError for unknown codec types. Prevents silent codec substitution.
2026-01-20 12:55:17 -08:00
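A minimal sketch of the fail-fast validation this commit describes; the codec table and helper name below are illustrative stand-ins, not the actual SaveVideo code.

# Hypothetical helper: raise for unknown codecs instead of silently using libx264.
CODEC_TO_ENCODER = {
    "h264": "libx264",
    "vp9": "libvpx-vp9",
}

def resolve_encoder(codec: str) -> str:
    if codec not in CODEC_TO_ENCODER:
        raise ValueError(f"Unsupported video codec: {codec!r}")
    return CODEC_TO_ENCODER[codec]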
bymyself
dcbe072e8a fix: add error handling for invalid speed preset values
Wrap VideoSpeedPreset conversion in try/except to gracefully handle
invalid speed_str values instead of raising unhandled ValueError.
Logs warning and falls back to None (default) on failure.
2026-01-19 21:17:02 -08:00
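A rough sketch of the fallback this commit describes, using a stand-in enum; the real VideoSpeedPreset is defined in the node's types and has more members.

import logging
from enum import Enum

class VideoSpeedPreset(str, Enum):  # stand-in for illustration only
    DEFAULT = "default"
    FAST = "fast"
    SLOW = "slow"

def parse_speed_preset(speed_str: str):
    # Invalid strings no longer raise out of the node; log and fall back to the default.
    try:
        return VideoSpeedPreset(speed_str)
    except ValueError:
        logging.warning("Invalid speed preset %r, falling back to default", speed_str)
        return None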
bymyself
86317e32b8 feat: enhance SaveVideo with DynamicCombo for codec-specific options
Major changes:
- SaveVideo now uses DynamicCombo to show codec-specific encoding options
- When selecting h264 or vp9, quality and speed inputs appear dynamically
- Each codec has unique ADVANCED options (shown in 'Show Advanced' section):
  - h264: Profile (baseline/main/high), Tune (film/animation/grain/etc.)
  - vp9: Row Multi-threading, Tile Columns
- Quality maps to CRF internally with codec-appropriate ranges
- Speed presets map to FFmpeg presets

New types:
- VideoSpeedPreset enum with user-friendly names
- quality_to_crf() helper function

This demonstrates both DynamicCombo (codec-specific inputs) and advanced
widgets (profile/tune/row_mt/tile_columns hidden by default).

Amp-Thread-ID: https://ampcode.com/threads/T-019bce44-e743-7349-8bad-d3027d456fb1
Co-authored-by: Amp <amp@ampcode.com>
2026-01-17 17:09:19 -08:00
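A hedged sketch of the quality-to-CRF mapping this commit mentions; the exact ranges and interpolation used by the real quality_to_crf() may differ (x264 CRF spans 0-51, VP9 0-63).

def quality_to_crf(quality: int, codec: str) -> int:
    """Map a 1-100 quality slider to a codec-appropriate CRF (lower = better)."""
    worst, best = {"h264": (51, 0), "vp9": (63, 0)}[codec]
    # Linear interpolation: quality 1 -> worst CRF, quality 100 -> best CRF.
    return round(worst + (best - worst) * (quality - 1) / 99)

print(quality_to_crf(80, "h264"))  # 10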
bymyself
36b9673a33 feat: add advanced parameter to Input classes for advanced widgets support
Add 'advanced' boolean parameter to Input and WidgetInput base classes
and propagate to all typed Input subclasses (Boolean, Int, Float, String,
Combo, MultiCombo, Webcam, MultiType, MatchType, ImageCompare).

When set to True, the frontend will hide these inputs by default in a
collapsible 'Advanced Inputs' section in the right side panel, reducing
visual clutter for power-user options.

This enables nodes to expose advanced configuration options (like encoding
parameters, quality settings, etc.) without overwhelming typical users.

Frontend support: ComfyUI_frontend PR #7812
2026-01-17 16:13:16 -08:00
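A toy sketch of the new flag, using a stand-in dataclass rather than ComfyUI's actual Input classes; only the advanced field matters here.

from dataclasses import dataclass, field

@dataclass
class Input:  # stand-in for the real Input/WidgetInput base classes
    name: str
    advanced: bool = False  # True: frontend hides it under "Advanced Inputs"
    options: dict = field(default_factory=dict)

node_inputs = [
    Input("quality"),
    Input("profile", advanced=True, options={"choices": ["baseline", "main", "high"]}),
    Input("tune", advanced=True, options={"choices": ["film", "animation", "grain"]}),
]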
comfyanonymous
7ac999bf30 Add image sizes to clip vision outputs. (#11923) 2026-01-16 23:02:28 -05:00
ComfyUI Wiki
0c6b36c6ac chore: update workflow templates to v0.8.11 (#11918) 2026-01-16 17:22:50 -05:00
Alexander Piskun
9125613b53 feat(api-nodes): extend ByteDance nodes with seedance-1-5-pro model (#11871) 2026-01-15 22:09:07 -08:00
Jedrzej Kosinski
732b707397 Added try-except around seed_assets call in get_object_info with a logging statement (#11901) 2026-01-15 23:15:15 -05:00
comfyanonymous
4c816d5c69 Adjust memory usage factor calculation for flux2 klein. (#11900) 2026-01-15 20:06:40 -05:00
ComfyUI Wiki
6125b3a5e7 Update workflow templates to v0.8.10 (#11899)
* chore: update workflow templates to v0.8.9

* Update requirements.txt
2026-01-15 13:12:13 -08:00
ComfyUI Wiki
12918a5f78 chore: update workflow templates to v0.8.7 (#11896) 2026-01-15 11:08:21 -08:00
comfyanonymous
8f40b43e02 ComfyUI v0.9.2 2026-01-15 10:57:35 -05:00
comfyanonymous
3b832231bb Flux2 Klein support. (#11890) 2026-01-15 10:33:15 -05:00
Jukka Seppänen
be518db5a7 Remove extraneous clip missing warnings when loading LTX2 embeddings_connector weights (#11874) 2026-01-14 17:54:04 -05:00
rattus
80441eb15e utils: fix lanczos grayscale upscaling (#11873) 2026-01-14 17:53:16 -05:00
Alexander Piskun
07f2462eae feat(api-nodes): add Meshy 3D nodes (#11843)
* feat(api-nodes): add Meshy 3D nodes

* rebased, added JSONata price badges
2026-01-14 11:25:38 -08:00
comfyanonymous
d150440466 Fix VAELoader (#11880) 2026-01-14 10:54:50 -08:00
comfyanonymous
6165c38cb5 Optimize nvfp4 lora applying. (#11866)
This changes results a bit but it also speeds up things a lot.
2026-01-14 00:49:38 -05:00
Silver
712cca36a1 feat: throttle ProgressBar updates to reduce WebSocket flooding (#11504) 2026-01-13 22:41:44 -05:00
Johnpaul Chiwetelu
ac4d8ea9b3 feat: add CI container version bump automation (#11692)
* feat: add CI container version bump automation

Adds a workflow that triggers on releases to create PRs in the
comfyui-ci-container repo, updating the ComfyUI version in the Dockerfile.

Supports both release events and manual workflow dispatch for testing.

* feat: add CI container version bump automation

Adds a workflow that triggers on releases to create PRs in the
comfyui-ci-container repo, updating the ComfyUI version in the Dockerfile.

Supports both release events and manual workflow dispatch for testing.

* ci: update CI container repository owner

* refactor: rename `update-ci-container.yaml` workflow to `update-ci-container.yml`

* Remove post-merge instructions from the CI container update workflow.
2026-01-13 22:39:22 -05:00
nomadoor
c9196f355e Fix scale_shorter_dimension portrait check (#11862) 2026-01-13 18:25:09 -08:00
Christian Byrne
7eb959ce93 fix: update ComfyUI repo reference to Comfy-Org/ComfyUI (#11858) 2026-01-13 21:03:16 -05:00
nomadoor
469dd9c16a Adds crop to multiple mode to ResizeImageMaskNode. (#11838)
* Add crop-to-multiple resize mode

* Make scale-to-multiple shape handling explicit
2026-01-13 16:48:10 -08:00
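A generic illustration of crop-to-multiple, not the node's exact implementation: each spatial dimension is trimmed down to the nearest multiple of the requested value.

def crop_to_multiple(h: int, w: int, multiple: int = 8) -> tuple[int, int]:
    return (h // multiple) * multiple, (w // multiple) * multiple

print(crop_to_multiple(1080, 1920, 64))  # (1024, 1920)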
comfyanonymous
eff2b9d412 Optimize nvfp4 lora applying. (#11856) 2026-01-13 19:37:19 -05:00
comfyanonymous
15b312de7a Optimize nvfp4 lora applying. (#11854) 2026-01-13 19:23:58 -05:00
Alexander Piskun
1419047fdb [Api Nodes]: Improve Price Badge Declarations (#11582)
* api nodes: price badges moved to nodes code

* added price badges for 4 more node-packs

* added price badges for 10 more node-packs

* added new price badges for Omni STD mode

* add support for autogrow groups

* use full names for "widgets", "inputs" and "groups"

* add strict typing for JSONata rules

* add price badge for WanReferenceVideoApi node

* add support for DynamicCombo

* sync price badges changes (https://github.com/Comfy-Org/ComfyUI_frontend/pull/7900)

* sync badges for Vidu2 nodes

* fixed incorrect price for RecraftCrispUpscaleNode

* fixed incorrect price badges for LTXV nodes

* fixed price badge for MinimaxHailuoVideoNode

* fixed price badges for PixVerse nodes
2026-01-13 16:18:28 -08:00
ric-yu
79f6bb5e4f add blueprints dir for built-in blueprints (#11853) 2026-01-13 16:14:40 -08:00
Jukka Seppänen
e4b4fb3479 Load metadata on VAELoader (#11846)
Needed to load the proper LTX2 VAE if separated from checkpoint
2026-01-13 17:37:21 -05:00
Acly
d9dc02a7d6 Support "lite" version of alibaba-pai Z-Image Controlnet (#11849)
* reduced number of control layers (3) compared to full model
2026-01-13 15:03:53 -05:00
Alexander Piskun
c543ad81c3 fix(api-nodes-gemini): raise exception when no candidates due to safety block (#11848) 2026-01-13 08:30:13 -08:00
comfyanonymous
5ac1372533 ComfyUI v0.9.1 2026-01-13 01:44:06 -05:00
comfyanonymous
1dcbd9efaf Bump ltxav mem estimation a bit. (#11842) 2026-01-13 01:42:07 -05:00
comfyanonymous
db9e6edfa1 ComfyUI v0.9.0 2026-01-13 01:23:31 -05:00
Christian Byrne
8af13b439b Update requirements.txt (#11841) 2026-01-13 01:22:25 -05:00
Jedrzej Kosinski
acd0e53653 Make bulk_ops not use .returning to be compatible with python 3.10 and 3.11 sqlalchemy (#11839) 2026-01-13 00:15:24 -05:00
comfyanonymous
117e7a5853 Refactor to try to lower mem usage. (#11840) 2026-01-12 21:01:52 -08:00
comfyanonymous
b3c0e4de57 Make loras work on nvfp4 models. (#11837)
The initial applying is a bit slow but will probably be sped up in the
future.
2026-01-12 22:33:54 -05:00
ComfyUI Wiki
ecaeeb990d chore: update workflow templates to v0.8.4 (#11835) 2026-01-12 19:18:01 -08:00
ComfyUI Wiki
c2b65e2fce Update workflow templates to v0.8.0 (#11828) 2026-01-12 17:29:25 -05:00
Jukka Seppänen
fd5c0755af Reduce LTX2 VRAM use by more efficient timestep embed handling (#11829) 2026-01-12 17:28:59 -05:00
comfyanonymous
c881a1d689 Support the siglip 2 naflex model as a clip vision model. (#11831)
Not useful yet.
2026-01-12 17:05:54 -05:00
kelseyee
a3b5d4996a Support ModelScope-Trainer DiffSynth lora for Z Image. (#11805) 2026-01-12 15:38:46 -05:00
comfyanonymous
c6238047ee Put more details about portable in readme. (#11816) 2026-01-11 21:11:53 -05:00
Alexander Piskun
5cd1113236 fix(api-nodes): use a unique name for uploading audio files (#11778) 2026-01-11 03:07:11 -08:00
comfyanonymous
2f642d5d9b Fix chroma fp8 te being treated as fp16. (#11795) 2026-01-10 14:40:42 -08:00
comfyanonymous
cd912963f1 Fix issue with t5 text encoder in fp4. (#11794) 2026-01-10 17:31:31 -05:00
DELUXA
6e4b1f9d00 pythorch_attn_by_def_on_gfx1200 (#11793) 2026-01-10 16:51:05 -05:00
comfyanonymous
dc202a2e51 Properly save mixed ops. (#11772) 2026-01-10 02:03:57 -05:00
ComfyUI Wiki
153bc524bf chore: update embedded docs to v0.4.0 (#11776) 2026-01-10 01:29:30 -05:00
Alexander Piskun
393d2880dd feat(api-nodes): added nodes for Vidu2 (#11760) 2026-01-09 12:59:38 -08:00
Alexander Piskun
4484b93d61 fix(api-nodes): do not downscale the input image for Topaz Enhance (#11768) 2026-01-09 12:25:56 -08:00
comfyanonymous
bd0e6825e8 Be less strict when loading mixed ops weights. (#11769) 2026-01-09 14:21:06 -05:00
67 changed files with 3961 additions and 379 deletions

View File

@@ -13,7 +13,7 @@ jobs:
- name: Checkout ComfyUI
uses: actions/checkout@v4
with:
repository: "comfyanonymous/ComfyUI"
repository: "Comfy-Org/ComfyUI"
path: "ComfyUI"
- uses: actions/setup-python@v4
with:

View File

@@ -0,0 +1,59 @@
name: "CI: Update CI Container"
on:
release:
types: [published]
workflow_dispatch:
inputs:
version:
description: 'ComfyUI version (e.g., v0.7.0)'
required: true
type: string
jobs:
update-ci-container:
runs-on: ubuntu-latest
# Skip pre-releases unless manually triggered
if: github.event_name == 'workflow_dispatch' || !github.event.release.prerelease
steps:
- name: Get version
id: version
run: |
if [ "${{ github.event_name }}" = "release" ]; then
VERSION="${{ github.event.release.tag_name }}"
else
VERSION="${{ inputs.version }}"
fi
echo "version=$VERSION" >> $GITHUB_OUTPUT
- name: Checkout comfyui-ci-container
uses: actions/checkout@v4
with:
repository: comfy-org/comfyui-ci-container
token: ${{ secrets.CI_CONTAINER_PAT }}
- name: Check current version
id: current
run: |
CURRENT=$(grep -oP 'ARG COMFYUI_VERSION=\K.*' Dockerfile || echo "unknown")
echo "current_version=$CURRENT" >> $GITHUB_OUTPUT
- name: Update Dockerfile
run: |
VERSION="${{ steps.version.outputs.version }}"
sed -i "s/^ARG COMFYUI_VERSION=.*/ARG COMFYUI_VERSION=${VERSION}/" Dockerfile
- name: Create Pull Request
id: create-pr
uses: peter-evans/create-pull-request@v7
with:
token: ${{ secrets.CI_CONTAINER_PAT }}
branch: automation/comfyui-${{ steps.version.outputs.version }}
title: "chore: bump ComfyUI to ${{ steps.version.outputs.version }}"
body: |
Updates ComfyUI version from `${{ steps.current.outputs.current_version }}` to `${{ steps.version.outputs.version }}`
**Triggered by:** ${{ github.event_name == 'release' && format('[Release {0}]({1})', github.event.release.tag_name, github.event.release.html_url) || 'Manual workflow dispatch' }}
labels: automation
commit-message: "chore: bump ComfyUI to ${{ steps.version.outputs.version }}"

View File

@@ -183,7 +183,7 @@ Simply download, extract with [7-Zip](https://7-zip.org) or with the windows exp
If you have trouble extracting it, right click the file -> properties -> unblock
Update your Nvidia drivers if it doesn't start.
The portable above currently comes with python 3.13 and pytorch cuda 13.0. Update your Nvidia drivers if it doesn't start.
#### Alternative Downloads:
@@ -212,7 +212,7 @@ Python 3.14 works but you may encounter issues with the torch compile node. The
Python 3.13 is very well supported. If you have trouble with some custom node dependencies on 3.13 you can try 3.12
torch 2.4 and above is supported but some features might only work on newer versions. We generally recommend using the latest major version of pytorch unless it is less than 2 weeks old.
torch 2.4 and above is supported but some features might only work on newer versions. We generally recommend using the latest major version of pytorch with the latest cuda version unless it is less than 2 weeks old.
### Instructions:

View File

@@ -92,14 +92,23 @@ def seed_from_paths_batch(
session.execute(ins_asset, chunk)
# try to claim AssetCacheState (file_path)
winners_by_path: set[str] = set()
# Insert with ON CONFLICT DO NOTHING, then query to find which paths were actually inserted
ins_state = (
sqlite.insert(AssetCacheState)
.on_conflict_do_nothing(index_elements=[AssetCacheState.file_path])
.returning(AssetCacheState.file_path)
)
for chunk in _iter_chunks(state_rows, _rows_per_stmt(3)):
winners_by_path.update((session.execute(ins_state, chunk)).scalars().all())
session.execute(ins_state, chunk)
# Query to find which of our paths won (were actually inserted)
winners_by_path: set[str] = set()
for chunk in _iter_chunks(path_list, MAX_BIND_PARAMS):
result = session.execute(
sqlalchemy.select(AssetCacheState.file_path)
.where(AssetCacheState.file_path.in_(chunk))
.where(AssetCacheState.asset_id.in_([path_to_asset[p] for p in chunk]))
)
winners_by_path.update(result.scalars().all())
all_paths_set = set(path_list)
losers_by_path = all_paths_set - winners_by_path
@@ -112,16 +121,23 @@ def seed_from_paths_batch(
return {"inserted_infos": 0, "won_states": 0, "lost_states": len(losers_by_path)}
# insert AssetInfo only for winners
# Insert with ON CONFLICT DO NOTHING, then query to find which were actually inserted
winner_info_rows = [asset_to_info[path_to_asset[p]] for p in winners_by_path]
ins_info = (
sqlite.insert(AssetInfo)
.on_conflict_do_nothing(index_elements=[AssetInfo.asset_id, AssetInfo.owner_id, AssetInfo.name])
.returning(AssetInfo.id)
)
inserted_info_ids: set[str] = set()
for chunk in _iter_chunks(winner_info_rows, _rows_per_stmt(9)):
inserted_info_ids.update((session.execute(ins_info, chunk)).scalars().all())
session.execute(ins_info, chunk)
# Query to find which info rows were actually inserted (by matching our generated IDs)
all_info_ids = [row["id"] for row in winner_info_rows]
inserted_info_ids: set[str] = set()
for chunk in _iter_chunks(all_info_ids, MAX_BIND_PARAMS):
result = session.execute(
sqlalchemy.select(AssetInfo.id).where(AssetInfo.id.in_(chunk))
)
inserted_info_ids.update(result.scalars().all())
# build and insert tag + meta rows for the AssetInfo
tag_rows: list[dict] = []
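The hunk above replaces SQLAlchemy's .returning() (dropped for compatibility with the Python 3.10/3.11 setups named in commit acd0e53653) with an ON CONFLICT DO NOTHING insert followed by a SELECT of the surviving rows. A self-contained sketch of that pattern on a toy table, not ComfyUI's actual models:

from sqlalchemy import Column, String, create_engine, select
from sqlalchemy.dialects import sqlite
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class CacheState(Base):
    __tablename__ = "cache_state"
    file_path = Column(String, primary_key=True)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

paths = ["a.png", "b.png"]
with Session(engine) as session:
    ins = sqlite.insert(CacheState).on_conflict_do_nothing(
        index_elements=[CacheState.file_path]
    )
    session.execute(ins, [{"file_path": p} for p in paths])
    # Instead of .returning(), query back which of our paths now exist.
    winners = set(
        session.execute(
            select(CacheState.file_path).where(CacheState.file_path.in_(paths))
        ).scalars()
    )
    session.commit()

The real code additionally filters by asset_id, so rows claimed by a concurrent writer are not counted as wins.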

View File

@@ -10,6 +10,7 @@ import hashlib
class Source:
custom_node = "custom_node"
templates = "templates"
class SubgraphEntry(TypedDict):
source: str
@@ -38,6 +39,18 @@ class CustomNodeSubgraphEntryInfo(TypedDict):
class SubgraphManager:
def __init__(self):
self.cached_custom_node_subgraphs: dict[SubgraphEntry] | None = None
self.cached_blueprint_subgraphs: dict[SubgraphEntry] | None = None
def _create_entry(self, file: str, source: str, node_pack: str) -> tuple[str, SubgraphEntry]:
"""Create a subgraph entry from a file path. Expects normalized path (forward slashes)."""
entry_id = hashlib.sha256(f"{source}{file}".encode()).hexdigest()
entry: SubgraphEntry = {
"source": source,
"name": os.path.splitext(os.path.basename(file))[0],
"path": file,
"info": {"node_pack": node_pack},
}
return entry_id, entry
async def load_entry_data(self, entry: SubgraphEntry):
with open(entry['path'], 'r') as f:
@@ -60,53 +73,60 @@ class SubgraphManager:
return entries
async def get_custom_node_subgraphs(self, loadedModules, force_reload=False):
# if not forced to reload and cached, return cache
"""Load subgraphs from custom nodes."""
if not force_reload and self.cached_custom_node_subgraphs is not None:
return self.cached_custom_node_subgraphs
# Load subgraphs from custom nodes
subfolder = "subgraphs"
subgraphs_dict: dict[SubgraphEntry] = {}
subgraphs_dict: dict[SubgraphEntry] = {}
for folder in folder_paths.get_folder_paths("custom_nodes"):
pattern = os.path.join(folder, f"*/{subfolder}/*.json")
matched_files = glob.glob(pattern)
for file in matched_files:
# replace backslashes with forward slashes
pattern = os.path.join(folder, "*/subgraphs/*.json")
for file in glob.glob(pattern):
file = file.replace('\\', '/')
info: CustomNodeSubgraphEntryInfo = {
"node_pack": "custom_nodes." + file.split('/')[-3]
}
source = Source.custom_node
# hash source + path to make sure id will be as unique as possible, but
# reproducible across backend reloads
id = hashlib.sha256(f"{source}{file}".encode()).hexdigest()
entry: SubgraphEntry = {
"source": Source.custom_node,
"name": os.path.splitext(os.path.basename(file))[0],
"path": file,
"info": info,
}
subgraphs_dict[id] = entry
node_pack = "custom_nodes." + file.split('/')[-3]
entry_id, entry = self._create_entry(file, Source.custom_node, node_pack)
subgraphs_dict[entry_id] = entry
self.cached_custom_node_subgraphs = subgraphs_dict
return subgraphs_dict
async def get_custom_node_subgraph(self, id: str, loadedModules):
subgraphs = await self.get_custom_node_subgraphs(loadedModules)
entry: SubgraphEntry = subgraphs.get(id, None)
if entry is not None and entry.get('data', None) is None:
async def get_blueprint_subgraphs(self, force_reload=False):
"""Load subgraphs from the blueprints directory."""
if not force_reload and self.cached_blueprint_subgraphs is not None:
return self.cached_blueprint_subgraphs
subgraphs_dict: dict[SubgraphEntry] = {}
blueprints_dir = os.path.join(os.path.dirname(os.path.dirname(__file__)), 'blueprints')
if os.path.exists(blueprints_dir):
for file in glob.glob(os.path.join(blueprints_dir, "*.json")):
file = file.replace('\\', '/')
entry_id, entry = self._create_entry(file, Source.templates, "comfyui")
subgraphs_dict[entry_id] = entry
self.cached_blueprint_subgraphs = subgraphs_dict
return subgraphs_dict
async def get_all_subgraphs(self, loadedModules, force_reload=False):
"""Get all subgraphs from all sources (custom nodes and blueprints)."""
custom_node_subgraphs = await self.get_custom_node_subgraphs(loadedModules, force_reload)
blueprint_subgraphs = await self.get_blueprint_subgraphs(force_reload)
return {**custom_node_subgraphs, **blueprint_subgraphs}
async def get_subgraph(self, id: str, loadedModules):
"""Get a specific subgraph by ID from any source."""
entry = (await self.get_all_subgraphs(loadedModules)).get(id)
if entry is not None and entry.get('data') is None:
await self.load_entry_data(entry)
return entry
def add_routes(self, routes, loadedModules):
@routes.get("/global_subgraphs")
async def get_global_subgraphs(request):
subgraphs_dict = await self.get_custom_node_subgraphs(loadedModules)
# NOTE: we may want to include other sources of global subgraphs such as templates in the future;
# that's the reasoning for the current implementation
subgraphs_dict = await self.get_all_subgraphs(loadedModules)
return web.json_response(await self.sanitize_entries(subgraphs_dict, remove_data=True))
@routes.get("/global_subgraphs/{id}")
async def get_global_subgraph(request):
id = request.match_info.get("id", None)
subgraph = await self.get_custom_node_subgraph(id, loadedModules)
subgraph = await self.get_subgraph(id, loadedModules)
return web.json_response(await self.sanitize_entry(subgraph))

View File

View File

@@ -1,6 +1,7 @@
import torch
from comfy.ldm.modules.attention import optimized_attention_for_device
import comfy.ops
import math
def clip_preprocess(image, size=224, mean=[0.48145466, 0.4578275, 0.40821073], std=[0.26862954, 0.26130258, 0.27577711], crop=True):
image = image[:, :, :, :3] if image.shape[3] > 3 else image
@@ -21,6 +22,39 @@ def clip_preprocess(image, size=224, mean=[0.48145466, 0.4578275, 0.40821073], s
image = torch.clip((255. * image), 0, 255).round() / 255.0
return (image - mean.view([3,1,1])) / std.view([3,1,1])
def siglip2_flex_calc_resolution(oh, ow, patch_size, max_num_patches, eps=1e-5):
def scale_dim(size, scale):
scaled = math.ceil(size * scale / patch_size) * patch_size
return max(patch_size, int(scaled))
# Binary search for optimal scale
lo, hi = eps / 10, 100.0
while hi - lo >= eps:
mid = (lo + hi) / 2
h, w = scale_dim(oh, mid), scale_dim(ow, mid)
if (h // patch_size) * (w // patch_size) <= max_num_patches:
lo = mid
else:
hi = mid
return scale_dim(oh, lo), scale_dim(ow, lo)
def siglip2_preprocess(image, size, patch_size, num_patches, mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5], crop=True):
if size > 0:
return clip_preprocess(image, size=size, mean=mean, std=std, crop=crop)
image = image[:, :, :, :3] if image.shape[3] > 3 else image
mean = torch.tensor(mean, device=image.device, dtype=image.dtype)
std = torch.tensor(std, device=image.device, dtype=image.dtype)
image = image.movedim(-1, 1)
b, c, h, w = image.shape
h, w = siglip2_flex_calc_resolution(h, w, patch_size, num_patches)
image = torch.nn.functional.interpolate(image, size=(h, w), mode="bilinear", antialias=True)
image = torch.clip((255. * image), 0, 255).round() / 255.0
return (image - mean.view([3, 1, 1])) / std.view([3, 1, 1])
class CLIPAttention(torch.nn.Module):
def __init__(self, embed_dim, heads, dtype, device, operations):
super().__init__()
@@ -175,6 +209,27 @@ class CLIPTextModel(torch.nn.Module):
out = self.text_projection(x[2])
return (x[0], x[1], out, x[2])
def siglip2_pos_embed(embed_weight, embeds, orig_shape):
embed_weight_len = round(embed_weight.shape[0] ** 0.5)
embed_weight = comfy.ops.cast_to_input(embed_weight, embeds).movedim(1, 0).reshape(1, -1, embed_weight_len, embed_weight_len)
embed_weight = torch.nn.functional.interpolate(embed_weight, size=orig_shape, mode="bilinear", align_corners=False, antialias=True)
embed_weight = embed_weight.reshape(-1, embed_weight.shape[-2] * embed_weight.shape[-1]).movedim(0, 1)
return embeds + embed_weight
class Siglip2Embeddings(torch.nn.Module):
def __init__(self, embed_dim, num_channels=3, patch_size=14, image_size=224, model_type="", num_patches=None, dtype=None, device=None, operations=None):
super().__init__()
self.patch_embedding = operations.Linear(num_channels * patch_size * patch_size, embed_dim, dtype=dtype, device=device)
self.position_embedding = operations.Embedding(num_patches, embed_dim, dtype=dtype, device=device)
self.patch_size = patch_size
def forward(self, pixel_values):
b, c, h, w = pixel_values.shape
img = pixel_values.movedim(1, -1).reshape(b, h // self.patch_size, self.patch_size, w // self.patch_size, self.patch_size, c)
img = img.permute(0, 1, 3, 2, 4, 5)
img = img.reshape(b, img.shape[1] * img.shape[2], -1)
img = self.patch_embedding(img)
return siglip2_pos_embed(self.position_embedding.weight, img, (h // self.patch_size, w // self.patch_size))
class CLIPVisionEmbeddings(torch.nn.Module):
def __init__(self, embed_dim, num_channels=3, patch_size=14, image_size=224, model_type="", dtype=None, device=None, operations=None):
@@ -218,8 +273,11 @@ class CLIPVision(torch.nn.Module):
intermediate_activation = config_dict["hidden_act"]
model_type = config_dict["model_type"]
self.embeddings = CLIPVisionEmbeddings(embed_dim, config_dict["num_channels"], config_dict["patch_size"], config_dict["image_size"], model_type=model_type, dtype=dtype, device=device, operations=operations)
if model_type == "siglip_vision_model":
if model_type in ["siglip2_vision_model"]:
self.embeddings = Siglip2Embeddings(embed_dim, config_dict["num_channels"], config_dict["patch_size"], config_dict["image_size"], model_type=model_type, num_patches=config_dict.get("num_patches", None), dtype=dtype, device=device, operations=operations)
else:
self.embeddings = CLIPVisionEmbeddings(embed_dim, config_dict["num_channels"], config_dict["patch_size"], config_dict["image_size"], model_type=model_type, dtype=dtype, device=device, operations=operations)
if model_type in ["siglip_vision_model", "siglip2_vision_model"]:
self.pre_layrnorm = lambda a: a
self.output_layernorm = True
else:
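For experimentation, a standalone restatement of the resolution search added above: it binary-searches a scale factor so the resulting patch grid stays within max_num_patches. The defaults mirror the clip_vision_siglip2_base_naflex.json config in this compare, but this is not a drop-in replacement for siglip2_flex_calc_resolution.

import math

def flex_resolution(oh, ow, patch_size=16, max_num_patches=256, eps=1e-5):
    def scale_dim(size, scale):
        return max(patch_size, int(math.ceil(size * scale / patch_size) * patch_size))

    lo, hi = eps / 10, 100.0
    while hi - lo >= eps:
        mid = (lo + hi) / 2
        h, w = scale_dim(oh, mid), scale_dim(ow, mid)
        if (h // patch_size) * (w // patch_size) <= max_num_patches:
            lo = mid  # still within the patch budget, try a larger scale
        else:
            hi = mid
    return scale_dim(oh, lo), scale_dim(ow, lo)

h, w = flex_resolution(720, 1280)
assert (h // 16) * (w // 16) <= 256  # the chosen grid never exceeds the budget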

View File

@@ -21,6 +21,7 @@ clip_preprocess = comfy.clip_model.clip_preprocess # Prevent some stuff from br
IMAGE_ENCODERS = {
"clip_vision_model": comfy.clip_model.CLIPVisionModelProjection,
"siglip_vision_model": comfy.clip_model.CLIPVisionModelProjection,
"siglip2_vision_model": comfy.clip_model.CLIPVisionModelProjection,
"dinov2": comfy.image_encoders.dino2.Dinov2Model,
}
@@ -32,9 +33,10 @@ class ClipVisionModel():
self.image_size = config.get("image_size", 224)
self.image_mean = config.get("image_mean", [0.48145466, 0.4578275, 0.40821073])
self.image_std = config.get("image_std", [0.26862954, 0.26130258, 0.27577711])
model_type = config.get("model_type", "clip_vision_model")
model_class = IMAGE_ENCODERS.get(model_type)
if model_type == "siglip_vision_model":
self.model_type = config.get("model_type", "clip_vision_model")
self.config = config.copy()
model_class = IMAGE_ENCODERS.get(self.model_type)
if self.model_type == "siglip_vision_model":
self.return_all_hidden_states = True
else:
self.return_all_hidden_states = False
@@ -55,12 +57,16 @@ class ClipVisionModel():
def encode_image(self, image, crop=True):
comfy.model_management.load_model_gpu(self.patcher)
pixel_values = comfy.clip_model.clip_preprocess(image.to(self.load_device), size=self.image_size, mean=self.image_mean, std=self.image_std, crop=crop).float()
if self.model_type == "siglip2_vision_model":
pixel_values = comfy.clip_model.siglip2_preprocess(image.to(self.load_device), size=self.image_size, patch_size=self.config.get("patch_size", 16), num_patches=self.config.get("num_patches", 256), mean=self.image_mean, std=self.image_std, crop=crop).float()
else:
pixel_values = comfy.clip_model.clip_preprocess(image.to(self.load_device), size=self.image_size, mean=self.image_mean, std=self.image_std, crop=crop).float()
out = self.model(pixel_values=pixel_values, intermediate_output='all' if self.return_all_hidden_states else -2)
outputs = Output()
outputs["last_hidden_state"] = out[0].to(comfy.model_management.intermediate_device())
outputs["image_embeds"] = out[2].to(comfy.model_management.intermediate_device())
outputs["image_sizes"] = [pixel_values.shape[1:]] * pixel_values.shape[0]
if self.return_all_hidden_states:
all_hs = out[1].to(comfy.model_management.intermediate_device())
outputs["penultimate_hidden_states"] = all_hs[:, -2]
@@ -107,10 +113,14 @@ def load_clipvision_from_sd(sd, prefix="", convert_keys=False):
elif "vision_model.encoder.layers.22.layer_norm1.weight" in sd:
embed_shape = sd["vision_model.embeddings.position_embedding.weight"].shape[0]
if sd["vision_model.encoder.layers.0.layer_norm1.weight"].shape[0] == 1152:
if embed_shape == 729:
json_config = os.path.join(os.path.dirname(os.path.realpath(__file__)), "clip_vision_siglip_384.json")
elif embed_shape == 1024:
json_config = os.path.join(os.path.dirname(os.path.realpath(__file__)), "clip_vision_siglip_512.json")
patch_embedding_shape = sd["vision_model.embeddings.patch_embedding.weight"].shape
if len(patch_embedding_shape) == 2:
json_config = os.path.join(os.path.dirname(os.path.realpath(__file__)), "clip_vision_siglip2_base_naflex.json")
else:
if embed_shape == 729:
json_config = os.path.join(os.path.dirname(os.path.realpath(__file__)), "clip_vision_siglip_384.json")
elif embed_shape == 1024:
json_config = os.path.join(os.path.dirname(os.path.realpath(__file__)), "clip_vision_siglip_512.json")
elif embed_shape == 577:
if "multi_modal_projector.linear_1.bias" in sd:
json_config = os.path.join(os.path.dirname(os.path.realpath(__file__)), "clip_vision_config_vitl_336_llava.json")

View File

@@ -0,0 +1,14 @@
{
"num_channels": 3,
"hidden_act": "gelu_pytorch_tanh",
"hidden_size": 1152,
"image_size": -1,
"intermediate_size": 4304,
"model_type": "siglip2_vision_model",
"num_attention_heads": 16,
"num_hidden_layers": 27,
"patch_size": 16,
"num_patches": 256,
"image_mean": [0.5, 0.5, 0.5],
"image_std": [0.5, 0.5, 0.5]
}

View File

@@ -65,3 +65,147 @@ def stochastic_rounding(value, dtype, seed=0):
return output
return value.to(dtype=dtype)
# TODO: improve this?
def stochastic_float_to_fp4_e2m1(x, generator):
orig_shape = x.shape
sign = torch.signbit(x).to(torch.uint8)
exp = torch.floor(torch.log2(x.abs()) + 1.0).clamp(0, 3)
x += (torch.rand(x.size(), dtype=x.dtype, layout=x.layout, device=x.device, generator=generator) - 0.5) * (2 ** (exp - 2.0)) * 1.25
x = x.abs()
exp = torch.floor(torch.log2(x) + 1.1925).clamp(0, 3)
mantissa = torch.where(
exp > 0,
(x / (2.0 ** (exp - 1)) - 1.0) * 2.0,
(x * 2.0),
out=x
).round().to(torch.uint8)
del x
exp = exp.to(torch.uint8)
fp4 = (sign << 3) | (exp << 1) | mantissa
del sign, exp, mantissa
fp4_flat = fp4.view(-1)
packed = (fp4_flat[0::2] << 4) | fp4_flat[1::2]
return packed.reshape(list(orig_shape)[:-1] + [-1])
def to_blocked(input_matrix, flatten: bool = True) -> torch.Tensor:
"""
Rearrange a large matrix by breaking it into blocks and applying the rearrangement pattern.
See:
https://docs.nvidia.com/cuda/cublas/index.html#d-block-scaling-factors-layout
Args:
input_matrix: Input tensor of shape (H, W)
Returns:
Rearranged tensor of shape (32*ceil_div(H,128), 16*ceil_div(W,4))
"""
def ceil_div(a, b):
return (a + b - 1) // b
rows, cols = input_matrix.shape
n_row_blocks = ceil_div(rows, 128)
n_col_blocks = ceil_div(cols, 4)
# Calculate the padded shape
padded_rows = n_row_blocks * 128
padded_cols = n_col_blocks * 4
padded = input_matrix
if (rows, cols) != (padded_rows, padded_cols):
padded = torch.zeros(
(padded_rows, padded_cols),
device=input_matrix.device,
dtype=input_matrix.dtype,
)
padded[:rows, :cols] = input_matrix
# Rearrange the blocks
blocks = padded.view(n_row_blocks, 128, n_col_blocks, 4).permute(0, 2, 1, 3)
rearranged = blocks.reshape(-1, 4, 32, 4).transpose(1, 2).reshape(-1, 32, 16)
if flatten:
return rearranged.flatten()
return rearranged.reshape(padded_rows, padded_cols)
def stochastic_round_quantize_nvfp4_block(x, per_tensor_scale, generator):
F4_E2M1_MAX = 6.0
F8_E4M3_MAX = 448.0
orig_shape = x.shape
block_size = 16
x = x.reshape(orig_shape[0], -1, block_size)
scaled_block_scales_fp8 = torch.clamp(((torch.amax(torch.abs(x), dim=-1)) / F4_E2M1_MAX) / per_tensor_scale.to(x.dtype), max=F8_E4M3_MAX).to(torch.float8_e4m3fn)
x = x / (per_tensor_scale.to(x.dtype) * scaled_block_scales_fp8.to(x.dtype)).unsqueeze(-1)
x = x.view(orig_shape).nan_to_num()
data_lp = stochastic_float_to_fp4_e2m1(x, generator=generator)
return data_lp, scaled_block_scales_fp8
def stochastic_round_quantize_nvfp4(x, per_tensor_scale, pad_16x, seed=0):
def roundup(x: int, multiple: int) -> int:
"""Round up x to the nearest multiple."""
return ((x + multiple - 1) // multiple) * multiple
generator = torch.Generator(device=x.device)
generator.manual_seed(seed)
# Handle padding
if pad_16x:
rows, cols = x.shape
padded_rows = roundup(rows, 16)
padded_cols = roundup(cols, 16)
if padded_rows != rows or padded_cols != cols:
x = torch.nn.functional.pad(x, (0, padded_cols - cols, 0, padded_rows - rows))
x, blocked_scaled = stochastic_round_quantize_nvfp4_block(x, per_tensor_scale, generator)
return x, to_blocked(blocked_scaled, flatten=False)
def stochastic_round_quantize_nvfp4_by_block(x, per_tensor_scale, pad_16x, seed=0, block_size=4096 * 4096):
def roundup(x: int, multiple: int) -> int:
"""Round up x to the nearest multiple."""
return ((x + multiple - 1) // multiple) * multiple
orig_shape = x.shape
# Handle padding
if pad_16x:
rows, cols = x.shape
padded_rows = roundup(rows, 16)
padded_cols = roundup(cols, 16)
if padded_rows != rows or padded_cols != cols:
x = torch.nn.functional.pad(x, (0, padded_cols - cols, 0, padded_rows - rows))
# Note: We update orig_shape because the output tensor logic below assumes x.shape matches
# what we want to produce. If we pad here, we want the padded output.
orig_shape = x.shape
orig_shape = list(orig_shape)
output_fp4 = torch.empty(orig_shape[:-1] + [orig_shape[-1] // 2], dtype=torch.uint8, device=x.device)
output_block = torch.empty(orig_shape[:-1] + [orig_shape[-1] // 16], dtype=torch.float8_e4m3fn, device=x.device)
generator = torch.Generator(device=x.device)
generator.manual_seed(seed)
num_slices = max(1, (x.numel() / block_size))
slice_size = max(1, (round(x.shape[0] / num_slices)))
for i in range(0, x.shape[0], slice_size):
fp4, block = stochastic_round_quantize_nvfp4_block(x[i: i + slice_size], per_tensor_scale, generator=generator)
output_fp4[i:i + slice_size].copy_(fp4)
output_block[i:i + slice_size].copy_(block)
return output_fp4, to_blocked(output_block, flatten=False)

View File

@@ -11,6 +11,69 @@ from comfy.ldm.lightricks.model import (
from comfy.ldm.lightricks.symmetric_patchifier import AudioPatchifier
import comfy.ldm.common_dit
class CompressedTimestep:
"""Store video timestep embeddings in compressed form using per-frame indexing."""
__slots__ = ('data', 'batch_size', 'num_frames', 'patches_per_frame', 'feature_dim')
def __init__(self, tensor: torch.Tensor, patches_per_frame: int):
"""
tensor: [batch_size, num_tokens, feature_dim] tensor where num_tokens = num_frames * patches_per_frame
patches_per_frame: Number of spatial patches per frame (height * width in latent space)
"""
self.batch_size, num_tokens, self.feature_dim = tensor.shape
# Check if compression is valid (num_tokens must be divisible by patches_per_frame)
if num_tokens % patches_per_frame == 0 and num_tokens >= patches_per_frame:
self.patches_per_frame = patches_per_frame
self.num_frames = num_tokens // patches_per_frame
# Reshape to [batch, frames, patches_per_frame, feature_dim] and store one value per frame
# All patches in a frame are identical, so we only keep the first one
reshaped = tensor.view(self.batch_size, self.num_frames, patches_per_frame, self.feature_dim)
self.data = reshaped[:, :, 0, :].contiguous() # [batch, frames, feature_dim]
else:
# Not divisible or too small - store directly without compression
self.patches_per_frame = 1
self.num_frames = num_tokens
self.data = tensor
def expand(self):
"""Expand back to original tensor."""
if self.patches_per_frame == 1:
return self.data
# [batch, frames, feature_dim] -> [batch, frames, patches_per_frame, feature_dim] -> [batch, tokens, feature_dim]
expanded = self.data.unsqueeze(2).expand(self.batch_size, self.num_frames, self.patches_per_frame, self.feature_dim)
return expanded.reshape(self.batch_size, -1, self.feature_dim)
def expand_for_computation(self, scale_shift_table: torch.Tensor, batch_size: int, indices: slice = slice(None, None)):
"""Compute ada values on compressed per-frame data, then expand spatially."""
num_ada_params = scale_shift_table.shape[0]
# No compression - compute directly
if self.patches_per_frame == 1:
num_tokens = self.data.shape[1]
dim_per_param = self.feature_dim // num_ada_params
reshaped = self.data.reshape(batch_size, num_tokens, num_ada_params, dim_per_param)[:, :, indices, :]
table_values = scale_shift_table[indices].unsqueeze(0).unsqueeze(0).to(device=self.data.device, dtype=self.data.dtype)
ada_values = (table_values + reshaped).unbind(dim=2)
return ada_values
# Compressed: compute on per-frame data then expand spatially
# Reshape: [batch, frames, feature_dim] -> [batch, frames, num_ada_params, dim_per_param]
frame_reshaped = self.data.reshape(batch_size, self.num_frames, num_ada_params, -1)[:, :, indices, :]
table_values = scale_shift_table[indices].unsqueeze(0).unsqueeze(0).to(
device=self.data.device, dtype=self.data.dtype
)
frame_ada = (table_values + frame_reshaped).unbind(dim=2)
# Expand each ada parameter spatially: [batch, frames, dim] -> [batch, frames, patches, dim] -> [batch, tokens, dim]
return tuple(
frame_val.unsqueeze(2).expand(batch_size, self.num_frames, self.patches_per_frame, -1)
.reshape(batch_size, -1, frame_val.shape[-1])
for frame_val in frame_ada
)
class BasicAVTransformerBlock(nn.Module):
def __init__(
self,
@@ -119,6 +182,9 @@ class BasicAVTransformerBlock(nn.Module):
def get_ada_values(
self, scale_shift_table: torch.Tensor, batch_size: int, timestep: torch.Tensor, indices: slice = slice(None, None)
):
if isinstance(timestep, CompressedTimestep):
return timestep.expand_for_computation(scale_shift_table, batch_size, indices)
num_ada_params = scale_shift_table.shape[0]
ada_values = (
@@ -146,10 +212,7 @@ class BasicAVTransformerBlock(nn.Module):
gate_timestep,
)
scale_shift_chunks = [t.squeeze(2) for t in scale_shift_ada_values]
gate_ada_values = [t.squeeze(2) for t in gate_ada_values]
return (*scale_shift_chunks, *gate_ada_values)
return (*scale_shift_ada_values, *gate_ada_values)
def forward(
self,
@@ -543,72 +606,80 @@ class LTXAVModel(LTXVModel):
if grid_mask is not None:
timestep = timestep[:, grid_mask]
timestep = timestep * self.timestep_scale_multiplier
timestep_scaled = timestep * self.timestep_scale_multiplier
v_timestep, v_embedded_timestep = self.adaln_single(
timestep.flatten(),
timestep_scaled.flatten(),
{"resolution": None, "aspect_ratio": None},
batch_size=batch_size,
hidden_dtype=hidden_dtype,
)
# Second dimension is 1 or number of tokens (if timestep_per_token)
v_timestep = v_timestep.view(batch_size, -1, v_timestep.shape[-1])
v_embedded_timestep = v_embedded_timestep.view(
batch_size, -1, v_embedded_timestep.shape[-1]
)
# Calculate patches_per_frame from orig_shape: [batch, channels, frames, height, width]
# Video tokens are arranged as (frames * height * width), so patches_per_frame = height * width
orig_shape = kwargs.get("orig_shape")
v_patches_per_frame = None
if orig_shape is not None and len(orig_shape) == 5:
# orig_shape[3] = height, orig_shape[4] = width (in latent space)
v_patches_per_frame = orig_shape[3] * orig_shape[4]
# Reshape to [batch_size, num_tokens, dim] and compress for storage
v_timestep = CompressedTimestep(v_timestep.view(batch_size, -1, v_timestep.shape[-1]), v_patches_per_frame)
v_embedded_timestep = CompressedTimestep(v_embedded_timestep.view(batch_size, -1, v_embedded_timestep.shape[-1]), v_patches_per_frame)
# Prepare audio timestep
a_timestep = kwargs.get("a_timestep")
if a_timestep is not None:
a_timestep = a_timestep * self.timestep_scale_multiplier
a_timestep_scaled = a_timestep * self.timestep_scale_multiplier
a_timestep_flat = a_timestep_scaled.flatten()
timestep_flat = timestep_scaled.flatten()
av_ca_factor = self.av_ca_timestep_scale_multiplier / self.timestep_scale_multiplier
# Cross-attention timesteps - compress these too
av_ca_audio_scale_shift_timestep, _ = self.av_ca_audio_scale_shift_adaln_single(
a_timestep.flatten(),
a_timestep_flat,
{"resolution": None, "aspect_ratio": None},
batch_size=batch_size,
hidden_dtype=hidden_dtype,
)
av_ca_video_scale_shift_timestep, _ = self.av_ca_video_scale_shift_adaln_single(
timestep.flatten(),
timestep_flat,
{"resolution": None, "aspect_ratio": None},
batch_size=batch_size,
hidden_dtype=hidden_dtype,
)
av_ca_a2v_gate_noise_timestep, _ = self.av_ca_a2v_gate_adaln_single(
timestep.flatten() * av_ca_factor,
timestep_flat * av_ca_factor,
{"resolution": None, "aspect_ratio": None},
batch_size=batch_size,
hidden_dtype=hidden_dtype,
)
av_ca_v2a_gate_noise_timestep, _ = self.av_ca_v2a_gate_adaln_single(
a_timestep.flatten() * av_ca_factor,
a_timestep_flat * av_ca_factor,
{"resolution": None, "aspect_ratio": None},
batch_size=batch_size,
hidden_dtype=hidden_dtype,
)
# Compress cross-attention timesteps (only video side, audio is too small to benefit)
cross_av_timestep_ss = [
av_ca_audio_scale_shift_timestep.view(batch_size, -1, av_ca_audio_scale_shift_timestep.shape[-1]),
CompressedTimestep(av_ca_video_scale_shift_timestep.view(batch_size, -1, av_ca_video_scale_shift_timestep.shape[-1]), v_patches_per_frame), # video - compressed
CompressedTimestep(av_ca_a2v_gate_noise_timestep.view(batch_size, -1, av_ca_a2v_gate_noise_timestep.shape[-1]), v_patches_per_frame), # video - compressed
av_ca_v2a_gate_noise_timestep.view(batch_size, -1, av_ca_v2a_gate_noise_timestep.shape[-1]),
]
a_timestep, a_embedded_timestep = self.audio_adaln_single(
a_timestep.flatten(),
a_timestep_flat,
{"resolution": None, "aspect_ratio": None},
batch_size=batch_size,
hidden_dtype=hidden_dtype,
)
# Audio timesteps
a_timestep = a_timestep.view(batch_size, -1, a_timestep.shape[-1])
a_embedded_timestep = a_embedded_timestep.view(
batch_size, -1, a_embedded_timestep.shape[-1]
)
cross_av_timestep_ss = [
av_ca_audio_scale_shift_timestep,
av_ca_video_scale_shift_timestep,
av_ca_a2v_gate_noise_timestep,
av_ca_v2a_gate_noise_timestep,
]
cross_av_timestep_ss = list(
[t.view(batch_size, -1, t.shape[-1]) for t in cross_av_timestep_ss]
)
a_embedded_timestep = a_embedded_timestep.view(batch_size, -1, a_embedded_timestep.shape[-1])
else:
a_timestep = timestep
a_timestep = timestep_scaled
a_embedded_timestep = kwargs.get("embedded_timestep")
cross_av_timestep_ss = []
@@ -767,6 +838,11 @@ class LTXAVModel(LTXVModel):
ax = x[1]
v_embedded_timestep = embedded_timestep[0]
a_embedded_timestep = embedded_timestep[1]
# Expand compressed video timestep if needed
if isinstance(v_embedded_timestep, CompressedTimestep):
v_embedded_timestep = v_embedded_timestep.expand()
vx = super()._process_output(vx, v_embedded_timestep, keyframe_idxs, **kwargs)
# Process audio output
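Rough arithmetic for the saving from CompressedTimestep above: every patch in a frame shares the same timestep embedding, so storing one row per frame instead of one per token shrinks the buffer by a factor of patches_per_frame. The shapes below are illustrative, not LTX2's real latent sizes.

frames, patches_per_frame, dim = 8, 44 * 80, 4096
per_token_floats = frames * patches_per_frame * dim  # full per-token embedding
per_frame_floats = frames * dim                      # what CompressedTimestep keeps
print(per_token_floats // per_frame_floats)          # 3520, i.e. patches_per_frame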

View File

@@ -322,6 +322,7 @@ def model_lora_keys_unet(model, key_map={}):
key_map["diffusion_model.{}".format(key_lora)] = to
key_map["transformer.{}".format(key_lora)] = to
key_map["lycoris_{}".format(key_lora.replace(".", "_"))] = to
key_map[key_lora] = to
if isinstance(model, comfy.model_base.Kandinsky5):
for k in sdk:

View File

@@ -237,6 +237,8 @@ def detect_unet_config(state_dict, key_prefix, metadata=None):
else:
dit_config["vec_in_dim"] = None
dit_config["num_heads"] = dit_config["hidden_size"] // sum(dit_config["axes_dim"])
dit_config["depth"] = count_blocks(state_dict_keys, '{}double_blocks.'.format(key_prefix) + '{}.')
dit_config["depth_single_blocks"] = count_blocks(state_dict_keys, '{}single_blocks.'.format(key_prefix) + '{}.')
if '{}distilled_guidance_layer.0.norms.0.scale'.format(key_prefix) in state_dict_keys or '{}distilled_guidance_layer.norms.0.scale'.format(key_prefix) in state_dict_keys: #Chroma

View File

@@ -368,7 +368,7 @@ try:
if any((a in arch) for a in ["gfx90a", "gfx942", "gfx1100", "gfx1101", "gfx1151"]): # TODO: more arches, TODO: gfx950
ENABLE_PYTORCH_ATTENTION = True
if rocm_version >= (7, 0):
if any((a in arch) for a in ["gfx1201"]):
if any((a in arch) for a in ["gfx1200", "gfx1201"]):
ENABLE_PYTORCH_ATTENTION = True
if torch_version_numeric >= (2, 7) and rocm_version >= (6, 4):
if any((a in arch) for a in ["gfx1200", "gfx1201", "gfx950"]): # TODO: more arches, "gfx942" gives error on pytorch nightly 2.10 1013 rocm7.0

View File

@@ -546,7 +546,8 @@ def mixed_precision_ops(quant_config={}, compute_dtype=torch.bfloat16, full_prec
weight_key = f"{prefix}weight"
weight = state_dict.pop(weight_key, None)
if weight is None:
raise ValueError(f"Missing weight for layer {layer_name}")
logging.warning(f"Missing weight for layer {layer_name}")
return
manually_loaded_keys = [weight_key]
@@ -624,21 +625,29 @@ def mixed_precision_ops(quant_config={}, compute_dtype=torch.bfloat16, full_prec
missing_keys.remove(key)
def state_dict(self, *args, destination=None, prefix="", **kwargs):
sd = super().state_dict(*args, destination=destination, prefix=prefix, **kwargs)
if isinstance(self.weight, QuantizedTensor):
layout_cls = self.weight._layout_cls
if destination is not None:
sd = destination
else:
sd = {}
# Check if it's any FP8 variant (E4M3 or E5M2)
if layout_cls in ("TensorCoreFP8E4M3Layout", "TensorCoreFP8E5M2Layout", "TensorCoreFP8Layout"):
sd["{}weight_scale".format(prefix)] = self.weight._params.scale
elif layout_cls == "TensorCoreNVFP4Layout":
sd["{}weight_scale_2".format(prefix)] = self.weight._params.scale
sd["{}weight_scale".format(prefix)] = self.weight._params.block_scale
if self.bias is not None:
sd["{}bias".format(prefix)] = self.bias
if isinstance(self.weight, QuantizedTensor):
sd_out = self.weight.state_dict("{}weight".format(prefix))
for k in sd_out:
sd[k] = sd_out[k]
quant_conf = {"format": self.quant_format}
if self._full_precision_mm_config:
quant_conf["full_precision_matrix_mult"] = True
sd["{}comfy_quant".format(prefix)] = torch.tensor(list(json.dumps(quant_conf).encode('utf-8')), dtype=torch.uint8)
input_scale = getattr(self, 'input_scale', None)
if input_scale is not None:
sd["{}input_scale".format(prefix)] = input_scale
else:
sd["{}weight".format(prefix)] = self.weight
return sd
def _forward(self, input, weight, bias):
@@ -690,7 +699,7 @@ def mixed_precision_ops(quant_config={}, compute_dtype=torch.bfloat16, full_prec
def set_weight(self, weight, inplace_update=False, seed=None, return_weight=False, **kwargs):
if getattr(self, 'layout_type', None) is not None:
# dtype is now implicit in the layout class
weight = QuantizedTensor.from_float(weight, self.layout_type, scale="recalculate", stochastic_rounding=seed, inplace_ops=True)
weight = QuantizedTensor.from_float(weight, self.layout_type, scale="recalculate", stochastic_rounding=seed, inplace_ops=True).to(self.weight.dtype)
else:
weight = weight.to(self.weight.dtype)
if return_weight:

View File

@@ -7,7 +7,7 @@ try:
QuantizedTensor,
QuantizedLayout,
TensorCoreFP8Layout as _CKFp8Layout,
TensorCoreNVFP4Layout, # Direct import, no wrapper needed
TensorCoreNVFP4Layout as _CKNvfp4Layout,
register_layout_op,
register_layout_class,
get_layout_class,
@@ -34,7 +34,7 @@ except ImportError as e:
class _CKFp8Layout:
pass
class TensorCoreNVFP4Layout:
class _CKNvfp4Layout:
pass
def register_layout_class(name, cls):
@@ -84,6 +84,39 @@ class _TensorCoreFP8LayoutBase(_CKFp8Layout):
return qdata, params
class TensorCoreNVFP4Layout(_CKNvfp4Layout):
@classmethod
def quantize(cls, tensor, scale=None, stochastic_rounding=0, inplace_ops=False):
if tensor.dim() != 2:
raise ValueError(f"NVFP4 requires 2D tensor, got {tensor.dim()}D")
orig_dtype = tensor.dtype
orig_shape = tuple(tensor.shape)
if scale is None or (isinstance(scale, str) and scale == "recalculate"):
scale = torch.amax(tensor.abs()) / (ck.float_utils.F8_E4M3_MAX * ck.float_utils.F4_E2M1_MAX)
if not isinstance(scale, torch.Tensor):
scale = torch.tensor(scale)
scale = scale.to(device=tensor.device, dtype=torch.float32)
padded_shape = cls.get_padded_shape(orig_shape)
needs_padding = padded_shape != orig_shape
if stochastic_rounding > 0:
qdata, block_scale = comfy.float.stochastic_round_quantize_nvfp4_by_block(tensor, scale, pad_16x=needs_padding, seed=stochastic_rounding)
else:
qdata, block_scale = ck.quantize_nvfp4(tensor, scale, pad_16x=needs_padding)
params = cls.Params(
scale=scale,
orig_dtype=orig_dtype,
orig_shape=orig_shape,
block_scale=block_scale,
)
return qdata, params
class TensorCoreFP8E4M3Layout(_TensorCoreFP8LayoutBase):
FP8_DTYPE = torch.float8_e4m3fn

View File

@@ -1014,6 +1014,7 @@ class CLIPType(Enum):
KANDINSKY5 = 22
KANDINSKY5_IMAGE = 23
NEWBIE = 24
FLUX2 = 25
def load_clip(ckpt_paths, embedding_directory=None, clip_type=CLIPType.STABLE_DIFFUSION, model_options={}):
@@ -1046,6 +1047,7 @@ class TEModel(Enum):
QWEN3_2B = 17
GEMMA_3_12B = 18
JINA_CLIP_2 = 19
QWEN3_8B = 20
def detect_te_model(sd):
@@ -1059,9 +1061,9 @@ def detect_te_model(sd):
return TEModel.JINA_CLIP_2
if "encoder.block.23.layer.1.DenseReluDense.wi_1.weight" in sd:
weight = sd["encoder.block.23.layer.1.DenseReluDense.wi_1.weight"]
if weight.shape[-1] == 4096:
if weight.shape[0] == 10240:
return TEModel.T5_XXL
elif weight.shape[-1] == 2048:
elif weight.shape[0] == 5120:
return TEModel.T5_XL
if 'encoder.block.23.layer.1.DenseReluDense.wi.weight' in sd:
return TEModel.T5_XXL_OLD
@@ -1089,6 +1091,8 @@ def detect_te_model(sd):
return TEModel.QWEN3_4B
elif weight.shape[0] == 2048:
return TEModel.QWEN3_2B
elif weight.shape[0] == 4096:
return TEModel.QWEN3_8B
if weight.shape[0] == 5120:
if "model.layers.39.post_attention_layernorm.weight" in sd:
return TEModel.MISTRAL3_24B
@@ -1214,11 +1218,18 @@ def load_text_encoder_state_dicts(state_dicts=[], embedding_directory=None, clip
clip_target.tokenizer = comfy.text_encoders.flux.Flux2Tokenizer
tokenizer_data["tekken_model"] = clip_data[0].get("tekken_model", None)
elif te_model == TEModel.QWEN3_4B:
clip_target.clip = comfy.text_encoders.z_image.te(**llama_detect(clip_data))
clip_target.tokenizer = comfy.text_encoders.z_image.ZImageTokenizer
if clip_type == CLIPType.FLUX or clip_type == CLIPType.FLUX2:
clip_target.clip = comfy.text_encoders.flux.klein_te(**llama_detect(clip_data), model_type="qwen3_4b")
clip_target.tokenizer = comfy.text_encoders.flux.KleinTokenizer
else:
clip_target.clip = comfy.text_encoders.z_image.te(**llama_detect(clip_data))
clip_target.tokenizer = comfy.text_encoders.z_image.ZImageTokenizer
elif te_model == TEModel.QWEN3_2B:
clip_target.clip = comfy.text_encoders.ovis.te(**llama_detect(clip_data))
clip_target.tokenizer = comfy.text_encoders.ovis.OvisTokenizer
elif te_model == TEModel.QWEN3_8B:
clip_target.clip = comfy.text_encoders.flux.klein_te(**llama_detect(clip_data), model_type="qwen3_8b")
clip_target.tokenizer = comfy.text_encoders.flux.KleinTokenizer8B
elif te_model == TEModel.JINA_CLIP_2:
clip_target.clip = comfy.text_encoders.jina_clip_2.JinaClip2TextModelWrapper
clip_target.tokenizer = comfy.text_encoders.jina_clip_2.JinaClip2TokenizerWrapper

View File

@@ -763,7 +763,7 @@ class Flux2(Flux):
def __init__(self, unet_config):
super().__init__(unet_config)
self.memory_usage_factor = self.memory_usage_factor * (2.0 * 2.0) * 2.36
self.memory_usage_factor = self.memory_usage_factor * (2.0 * 2.0) * (unet_config['hidden_size'] / 2604)
def get_model(self, state_dict, prefix="", device=None):
out = model_base.Flux2(self, device=device)
@@ -845,7 +845,7 @@ class LTXAV(LTXV):
def __init__(self, unet_config):
super().__init__(unet_config)
self.memory_usage_factor = 0.061 # TODO
self.memory_usage_factor = 0.077 # TODO
def get_model(self, state_dict, prefix="", device=None):
out = model_base.LTXAV(self, device=device)
@@ -1042,7 +1042,7 @@ class ZImage(Lumina2):
"shift": 3.0,
}
memory_usage_factor = 2.0
memory_usage_factor = 2.8
supported_inference_dtypes = [torch.bfloat16, torch.float32]

View File

@@ -36,7 +36,7 @@ def te(dtype_t5=None, t5_quantization_metadata=None):
if t5_quantization_metadata is not None:
model_options = model_options.copy()
model_options["t5xxl_quantization_metadata"] = t5_quantization_metadata
if dtype is None:
if dtype_t5 is not None:
dtype = dtype_t5
super().__init__(device=device, dtype=dtype, model_options=model_options)
return CosmosTEModel_

View File

@@ -3,7 +3,7 @@ import comfy.text_encoders.t5
import comfy.text_encoders.sd3_clip
import comfy.text_encoders.llama
import comfy.model_management
from transformers import T5TokenizerFast, LlamaTokenizerFast
from transformers import T5TokenizerFast, LlamaTokenizerFast, Qwen2Tokenizer
import torch
import os
import json
@@ -172,3 +172,60 @@ def flux2_te(dtype_llama=None, llama_quantization_metadata=None, pruned=False):
model_options["num_layers"] = 30
super().__init__(device=device, dtype=dtype, model_options=model_options)
return Flux2TEModel_
class Qwen3Tokenizer(sd1_clip.SDTokenizer):
def __init__(self, embedding_directory=None, tokenizer_data={}):
tokenizer_path = os.path.join(os.path.dirname(os.path.realpath(__file__)), "qwen25_tokenizer")
super().__init__(tokenizer_path, pad_with_end=False, embedding_size=2560, embedding_key='qwen3_4b', tokenizer_class=Qwen2Tokenizer, has_start_token=False, has_end_token=False, pad_to_max_length=False, max_length=99999999, min_length=512, pad_token=151643, tokenizer_data=tokenizer_data)
class Qwen3Tokenizer8B(sd1_clip.SDTokenizer):
def __init__(self, embedding_directory=None, tokenizer_data={}):
tokenizer_path = os.path.join(os.path.dirname(os.path.realpath(__file__)), "qwen25_tokenizer")
super().__init__(tokenizer_path, pad_with_end=False, embedding_size=4096, embedding_key='qwen3_8b', tokenizer_class=Qwen2Tokenizer, has_start_token=False, has_end_token=False, pad_to_max_length=False, max_length=99999999, min_length=512, pad_token=151643, tokenizer_data=tokenizer_data)
class KleinTokenizer(sd1_clip.SD1Tokenizer):
def __init__(self, embedding_directory=None, tokenizer_data={}, name="qwen3_4b"):
if name == "qwen3_4b":
tokenizer = Qwen3Tokenizer
elif name == "qwen3_8b":
tokenizer = Qwen3Tokenizer8B
super().__init__(embedding_directory=embedding_directory, tokenizer_data=tokenizer_data, name=name, tokenizer=tokenizer)
self.llama_template = "<|im_start|>user\n{}<|im_end|>\n<|im_start|>assistant\n<think>\n\n</think>\n\n"
def tokenize_with_weights(self, text, return_word_ids=False, llama_template=None, **kwargs):
if llama_template is None:
llama_text = self.llama_template.format(text)
else:
llama_text = llama_template.format(text)
tokens = super().tokenize_with_weights(llama_text, return_word_ids=return_word_ids, disable_weights=True, **kwargs)
return tokens
class KleinTokenizer8B(KleinTokenizer):
def __init__(self, embedding_directory=None, tokenizer_data={}, name="qwen3_8b"):
super().__init__(embedding_directory=embedding_directory, tokenizer_data=tokenizer_data, name=name)
class Qwen3_4BModel(sd1_clip.SDClipModel):
def __init__(self, device="cpu", layer=[9, 18, 27], layer_idx=None, dtype=None, attention_mask=True, model_options={}):
super().__init__(device=device, layer=layer, layer_idx=layer_idx, textmodel_json_config={}, dtype=dtype, special_tokens={"pad": 151643}, layer_norm_hidden_state=False, model_class=comfy.text_encoders.llama.Qwen3_4B, enable_attention_masks=attention_mask, return_attention_masks=attention_mask, model_options=model_options)
class Qwen3_8BModel(sd1_clip.SDClipModel):
def __init__(self, device="cpu", layer=[9, 18, 27], layer_idx=None, dtype=None, attention_mask=True, model_options={}):
super().__init__(device=device, layer=layer, layer_idx=layer_idx, textmodel_json_config={}, dtype=dtype, special_tokens={"pad": 151643}, layer_norm_hidden_state=False, model_class=comfy.text_encoders.llama.Qwen3_8B, enable_attention_masks=attention_mask, return_attention_masks=attention_mask, model_options=model_options)
def klein_te(dtype_llama=None, llama_quantization_metadata=None, model_type="qwen3_4b"):
if model_type == "qwen3_4b":
model = Qwen3_4BModel
elif model_type == "qwen3_8b":
model = Qwen3_8BModel
class Flux2TEModel_(Flux2TEModel):
def __init__(self, device="cpu", dtype=None, model_options={}):
if llama_quantization_metadata is not None:
model_options = model_options.copy()
model_options["quantization_metadata"] = llama_quantization_metadata
if dtype_llama is not None:
dtype = dtype_llama
super().__init__(device=device, dtype=dtype, name=model_type, model_options=model_options, clip_model=model)
return Flux2TEModel_

View File

@@ -32,7 +32,7 @@ def mochi_te(dtype_t5=None, t5_quantization_metadata=None):
if t5_quantization_metadata is not None:
model_options = model_options.copy()
model_options["t5xxl_quantization_metadata"] = t5_quantization_metadata
if dtype is None:
if dtype_t5 is not None:
dtype = dtype_t5
super().__init__(device=device, dtype=dtype, model_options=model_options)
return MochiTEModel_

View File

@@ -99,6 +99,28 @@ class Qwen3_4BConfig:
rope_scale = None
final_norm: bool = True
@dataclass
class Qwen3_8BConfig:
vocab_size: int = 151936
hidden_size: int = 4096
intermediate_size: int = 12288
num_hidden_layers: int = 36
num_attention_heads: int = 32
num_key_value_heads: int = 8
max_position_embeddings: int = 40960
rms_norm_eps: float = 1e-6
rope_theta: float = 1000000.0
transformer_type: str = "llama"
head_dim = 128
rms_norm_add = False
mlp_activation = "silu"
qkv_bias = False
rope_dims = None
q_norm = "gemma3"
k_norm = "gemma3"
rope_scale = None
final_norm: bool = True
@dataclass
class Ovis25_2BConfig:
vocab_size: int = 151936
@@ -628,6 +650,15 @@ class Qwen3_4B(BaseLlama, torch.nn.Module):
self.model = Llama2_(config, device=device, dtype=dtype, ops=operations)
self.dtype = dtype
class Qwen3_8B(BaseLlama, torch.nn.Module):
def __init__(self, config_dict, dtype, device, operations):
super().__init__()
config = Qwen3_8BConfig(**config_dict)
self.num_layers = config.num_hidden_layers
self.model = Llama2_(config, device=device, dtype=dtype, ops=operations)
self.dtype = dtype
class Ovis25_2B(BaseLlama, torch.nn.Module):
def __init__(self, config_dict, dtype, device, operations):
super().__init__()

View File

@@ -118,8 +118,9 @@ class LTXAVTEModel(torch.nn.Module):
sdo = comfy.utils.state_dict_prefix_replace(sd, {"text_embedding_projection.aggregate_embed.weight": "text_embedding_projection.weight", "model.diffusion_model.video_embeddings_connector.": "video_embeddings_connector.", "model.diffusion_model.audio_embeddings_connector.": "audio_embeddings_connector."}, filter_keys=True)
if len(sdo) == 0:
sdo = sd
return self.load_state_dict(sdo, strict=False)
missing, unexpected = self.load_state_dict(sdo, strict=False)
missing = [k for k in missing if not k.startswith("gemma3_12b.")] # filter out keys that belong to the main gemma model
return (missing, unexpected)
def memory_estimation_function(self, token_weight_pairs, device=None):
constant = 6.0

View File

@@ -36,7 +36,7 @@ def pixart_te(dtype_t5=None, t5_quantization_metadata=None):
if t5_quantization_metadata is not None:
model_options = model_options.copy()
model_options["t5xxl_quantization_metadata"] = t5_quantization_metadata
if dtype is None:
if dtype_t5 is not None:
dtype = dtype_t5
super().__init__(device=device, dtype=dtype, model_options=model_options)
return PixArtTEModel_

View File

@@ -30,6 +30,7 @@ from torch.nn.functional import interpolate
from einops import rearrange
from comfy.cli_args import args
import json
import time
MMAP_TORCH_FILES = args.mmap_torch_files
DISABLE_MMAP = args.disable_mmap
@@ -928,7 +929,9 @@ def bislerp(samples, width, height):
return result.to(orig_dtype)
def lanczos(samples, width, height):
images = [Image.fromarray(np.clip(255. * image.movedim(0, -1).cpu().numpy(), 0, 255).astype(np.uint8)) for image in samples]
# The below API is strict and expects grayscale to be squeezed
samples = samples.squeeze(1) if samples.shape[1] == 1 else samples.movedim(1, -1)
images = [Image.fromarray(np.clip(255. * image.cpu().numpy(), 0, 255).astype(np.uint8)) for image in samples]
images = [image.resize((width, height), resample=Image.Resampling.LANCZOS) for image in images]
images = [torch.from_numpy(np.array(image).astype(np.float32) / 255.0).movedim(-1, 0) for image in images]
result = torch.stack(images)
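
The reworked path prepares grayscale and color batches differently before handing frames to PIL; a small illustrative helper showing just that shape handling (assuming samples is a float NCHW tensor in [0, 1]; _to_pil_ready is not part of the diff):

import torch

def _to_pil_ready(samples: torch.Tensor) -> torch.Tensor:
    # Grayscale (N, 1, H, W) -> (N, H, W); color (N, C, H, W) -> (N, H, W, C)
    return samples.squeeze(1) if samples.shape[1] == 1 else samples.movedim(1, -1)

print(_to_pil_ready(torch.rand(2, 1, 64, 64)).shape)  # torch.Size([2, 64, 64])
print(_to_pil_ready(torch.rand(2, 3, 64, 64)).shape)  # torch.Size([2, 64, 64, 3])
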
@@ -1097,6 +1100,10 @@ def set_progress_bar_global_hook(function):
global PROGRESS_BAR_HOOK
PROGRESS_BAR_HOOK = function
# Throttle settings for progress bar updates to reduce WebSocket flooding
PROGRESS_THROTTLE_MIN_INTERVAL = 0.1 # 100ms minimum between updates
PROGRESS_THROTTLE_MIN_PERCENT = 0.5 # 0.5% minimum progress change
class ProgressBar:
def __init__(self, total, node_id=None):
global PROGRESS_BAR_HOOK
@@ -1104,6 +1111,8 @@ class ProgressBar:
self.current = 0
self.hook = PROGRESS_BAR_HOOK
self.node_id = node_id
self._last_update_time = 0.0
self._last_sent_value = -1
def update_absolute(self, value, total=None, preview=None):
if total is not None:
@@ -1112,7 +1121,29 @@ class ProgressBar:
value = self.total
self.current = value
if self.hook is not None:
self.hook(self.current, self.total, preview, node_id=self.node_id)
current_time = time.perf_counter()
is_first = (self._last_sent_value < 0)
is_final = (value >= self.total)
has_preview = (preview is not None)
# Always send immediately for previews, first update, or final update
if has_preview or is_first or is_final:
self.hook(self.current, self.total, preview, node_id=self.node_id)
self._last_update_time = current_time
self._last_sent_value = value
return
# Apply throttling for regular progress updates
if self.total > 0:
percent_changed = ((value - max(0, self._last_sent_value)) / self.total) * 100
else:
percent_changed = 100
time_elapsed = current_time - self._last_update_time
if time_elapsed >= PROGRESS_THROTTLE_MIN_INTERVAL and percent_changed >= PROGRESS_THROTTLE_MIN_PERCENT:
self.hook(self.current, self.total, preview, node_id=self.node_id)
self._last_update_time = current_time
self._last_sent_value = value
def update(self, value):
self.update_absolute(self.current + value)
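
To illustrate the effect of the throttle above, a quick sketch (assuming it runs inside a ComfyUI environment where comfy.utils is importable; the hook signature matches the call in update_absolute):

import comfy.utils

calls = []
comfy.utils.set_progress_bar_global_hook(
    lambda value, total, preview, node_id=None: calls.append(value)
)

pbar = comfy.utils.ProgressBar(1000)
for _ in range(1000):
    pbar.update(1)

# In a tight loop only the first and the final update clear the 100 ms / 0.5 %
# throttle, so the hook fires twice here instead of 1000 times.
print(len(calls))
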

View File

@@ -7,7 +7,7 @@ from comfy_api.internal.singleton import ProxiedSingleton
from comfy_api.internal.async_to_sync import create_sync_class
from ._input import ImageInput, AudioInput, MaskInput, LatentInput, VideoInput
from ._input_impl import VideoFromFile, VideoFromComponents
from ._util import VideoCodec, VideoContainer, VideoComponents, MESH, VOXEL
from ._util import VideoCodec, VideoContainer, VideoComponents, VideoSpeedPreset, MESH, VOXEL
from . import _io_public as io
from . import _ui_public as ui
from comfy_execution.utils import get_executing_context
@@ -103,6 +103,7 @@ class Types:
VideoCodec = VideoCodec
VideoContainer = VideoContainer
VideoComponents = VideoComponents
VideoSpeedPreset = VideoSpeedPreset
MESH = MESH
VOXEL = VOXEL

View File

@@ -10,7 +10,7 @@ import json
import numpy as np
import math
import torch
from .._util import VideoContainer, VideoCodec, VideoComponents
from .._util import VideoContainer, VideoCodec, VideoComponents, VideoSpeedPreset, quality_to_crf
def container_to_output_format(container_format: str | None) -> str | None:
@@ -250,10 +250,16 @@ class VideoFromFile(VideoInput):
path: str | io.BytesIO,
format: VideoContainer = VideoContainer.AUTO,
codec: VideoCodec = VideoCodec.AUTO,
metadata: Optional[dict] = None
metadata: Optional[dict] = None,
quality: Optional[int] = None,
speed: Optional[VideoSpeedPreset] = None,
profile: Optional[str] = None,
tune: Optional[str] = None,
row_mt: bool = True,
tile_columns: Optional[int] = None,
):
if isinstance(self.__file, io.BytesIO):
self.__file.seek(0) # Reset the BytesIO object to the beginning
self.__file.seek(0)
with av.open(self.__file, mode='r') as container:
container_format = container.format.name
video_encoding = container.streams.video[0].codec.name if len(container.streams.video) > 0 else None
@@ -262,6 +268,10 @@ class VideoFromFile(VideoInput):
reuse_streams = False
if codec != VideoCodec.AUTO and codec != video_encoding and video_encoding is not None:
reuse_streams = False
if quality is not None or speed is not None:
reuse_streams = False
if profile is not None or tune is not None or tile_columns is not None:
reuse_streams = False
if not reuse_streams:
components = self.get_components_internal(container)
@@ -270,7 +280,13 @@ class VideoFromFile(VideoInput):
path,
format=format,
codec=codec,
metadata=metadata
metadata=metadata,
quality=quality,
speed=speed,
profile=profile,
tune=tune,
row_mt=row_mt,
tile_columns=tile_columns,
)
streams = container.streams
@@ -330,54 +346,126 @@ class VideoFromComponents(VideoInput):
path: str,
format: VideoContainer = VideoContainer.AUTO,
codec: VideoCodec = VideoCodec.AUTO,
metadata: Optional[dict] = None
metadata: Optional[dict] = None,
quality: Optional[int] = None,
speed: Optional[VideoSpeedPreset] = None,
profile: Optional[str] = None,
tune: Optional[str] = None,
row_mt: bool = True,
tile_columns: Optional[int] = None,
):
if format != VideoContainer.AUTO and format != VideoContainer.MP4:
raise ValueError("Only MP4 format is supported for now")
if codec != VideoCodec.AUTO and codec != VideoCodec.H264:
raise ValueError("Only H264 codec is supported for now")
extra_kwargs = {}
if isinstance(format, VideoContainer) and format != VideoContainer.AUTO:
extra_kwargs["format"] = format.value
with av.open(path, mode='w', options={'movflags': 'use_metadata_tags'}, **extra_kwargs) as output:
# Add metadata before writing any streams
"""
Save video to file with optional encoding parameters.
Args:
path: Output file path
format: Container format (mp4, webm, or auto)
codec: Video codec (h264, vp9, or auto)
metadata: Optional metadata dict to embed
quality: Quality percentage 0-100 (100=best). Maps to CRF internally.
speed: Encoding speed preset. Slower = better compression.
profile: H.264 profile (baseline, main, high)
tune: H.264 tune option (film, animation, grain, etc.)
row_mt: VP9 row-based multi-threading
tile_columns: VP9 tile columns (power of 2)
"""
resolved_format = format
resolved_codec = codec
if resolved_format == VideoContainer.AUTO:
resolved_format = VideoContainer.MP4
if resolved_codec == VideoCodec.AUTO:
if resolved_format == VideoContainer.WEBM:
resolved_codec = VideoCodec.VP9
else:
resolved_codec = VideoCodec.H264
if resolved_format == VideoContainer.WEBM and resolved_codec == VideoCodec.H264:
raise ValueError("H264 codec is not supported with WebM container")
if resolved_format == VideoContainer.MP4 and resolved_codec == VideoCodec.VP9:
raise ValueError("VP9 codec is not supported with MP4 container")
codec_map = {
VideoCodec.H264: "libx264",
VideoCodec.VP9: "libvpx-vp9",
}
if resolved_codec not in codec_map:
raise ValueError(f"Unsupported codec: {resolved_codec}")
ffmpeg_codec = codec_map[resolved_codec]
extra_kwargs = {"format": resolved_format.value}
container_options = {}
if resolved_format == VideoContainer.MP4:
container_options["movflags"] = "use_metadata_tags"
with av.open(path, mode='w', options=container_options, **extra_kwargs) as output:
if metadata is not None:
for key, value in metadata.items():
output.metadata[key] = json.dumps(value)
frame_rate = Fraction(round(self.__components.frame_rate * 1000), 1000)
# Create a video stream
video_stream = output.add_stream('h264', rate=frame_rate)
video_stream = output.add_stream(ffmpeg_codec, rate=frame_rate)
video_stream.width = self.__components.images.shape[2]
video_stream.height = self.__components.images.shape[1]
video_stream.pix_fmt = 'yuv420p'
# Create an audio stream
video_stream.pix_fmt = 'yuv420p'
if resolved_codec == VideoCodec.VP9:
video_stream.bit_rate = 0
if quality is not None:
crf = quality_to_crf(quality, ffmpeg_codec)
video_stream.options['crf'] = str(crf)
if speed is not None and speed != VideoSpeedPreset.AUTO:
if isinstance(speed, str):
speed = VideoSpeedPreset(speed)
preset = speed.to_ffmpeg_preset(ffmpeg_codec)
if resolved_codec == VideoCodec.VP9:
video_stream.options['cpu-used'] = preset
else:
video_stream.options['preset'] = preset
# H.264-specific options
if resolved_codec == VideoCodec.H264:
if profile is not None:
video_stream.options['profile'] = profile
if tune is not None:
video_stream.options['tune'] = tune
# VP9-specific options
if resolved_codec == VideoCodec.VP9:
if row_mt:
video_stream.options['row-mt'] = '1'
if tile_columns is not None:
video_stream.options['tile-columns'] = str(tile_columns)
audio_sample_rate = 1
audio_stream: Optional[av.AudioStream] = None
if self.__components.audio:
audio_sample_rate = int(self.__components.audio['sample_rate'])
audio_stream = output.add_stream('aac', rate=audio_sample_rate)
audio_codec = 'libopus' if resolved_format == VideoContainer.WEBM else 'aac'
audio_stream = output.add_stream(audio_codec, rate=audio_sample_rate)
# Encode video
for i, frame in enumerate(self.__components.images):
img = (frame * 255).clamp(0, 255).byte().cpu().numpy() # shape: (H, W, 3)
frame = av.VideoFrame.from_ndarray(img, format='rgb24')
frame = frame.reformat(format='yuv420p') # Convert to YUV420P as required by h264
packet = video_stream.encode(frame)
img = (frame * 255).clamp(0, 255).byte().cpu().numpy()
video_frame = av.VideoFrame.from_ndarray(img, format='rgb24')
video_frame = video_frame.reformat(format='yuv420p')
packet = video_stream.encode(video_frame)
output.mux(packet)
# Flush video
packet = video_stream.encode(None)
output.mux(packet)
if audio_stream and self.__components.audio:
waveform = self.__components.audio['waveform']
waveform = waveform[:, :, :math.ceil((audio_sample_rate / frame_rate) * self.__components.images.shape[0])]
frame = av.AudioFrame.from_ndarray(waveform.movedim(2, 1).reshape(1, -1).float().numpy(), format='flt', layout='mono' if waveform.shape[1] == 1 else 'stereo')
frame.sample_rate = audio_sample_rate
frame.pts = 0
output.mux(audio_stream.encode(frame))
# Flush encoder
audio_frame = av.AudioFrame.from_ndarray(
waveform.movedim(2, 1).reshape(1, -1).float().numpy(),
format='flt',
layout='mono' if waveform.shape[1] == 1 else 'stereo'
)
audio_frame.sample_rate = audio_sample_rate
audio_frame.pts = 0
output.mux(audio_stream.encode(audio_frame))
output.mux(audio_stream.encode(None))
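
A hedged usage sketch of the extended save path (the import locations, the save_to name, and the file path are assumptions drawn from the surrounding diff; the frame data is random):

import torch
from fractions import Fraction
from comfy_api.latest import VideoFromComponents, Types  # assumed import path

# 48 random RGB frames at 24 fps, shaped (frames, height, width, channels)
components = Types.VideoComponents(images=torch.rand(48, 256, 256, 3), frame_rate=Fraction(24))

VideoFromComponents(components).save_to(
    "/tmp/example.webm",
    format=Types.VideoContainer.WEBM,
    codec=Types.VideoCodec.VP9,
    quality=80,                             # maps to CRF 22 via quality_to_crf
    speed=Types.VideoSpeedPreset.BALANCED,  # maps to cpu-used=2 for VP9
    tile_columns=2,
)
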

View File

@@ -153,7 +153,7 @@ class Input(_IO_V3):
'''
Base class for a V3 Input.
'''
def __init__(self, id: str, display_name: str=None, optional=False, tooltip: str=None, lazy: bool=None, extra_dict=None, raw_link: bool=None):
def __init__(self, id: str, display_name: str=None, optional=False, tooltip: str=None, lazy: bool=None, extra_dict=None, raw_link: bool=None, advanced: bool=None):
super().__init__()
self.id = id
self.display_name = display_name
@@ -162,6 +162,7 @@ class Input(_IO_V3):
self.lazy = lazy
self.extra_dict = extra_dict if extra_dict is not None else {}
self.rawLink = raw_link
self.advanced = advanced
def as_dict(self):
return prune_dict({
@@ -170,6 +171,7 @@ class Input(_IO_V3):
"tooltip": self.tooltip,
"lazy": self.lazy,
"rawLink": self.rawLink,
"advanced": self.advanced,
}) | prune_dict(self.extra_dict)
def get_io_type(self):
@@ -184,8 +186,8 @@ class WidgetInput(Input):
'''
def __init__(self, id: str, display_name: str=None, optional=False, tooltip: str=None, lazy: bool=None,
default: Any=None,
socketless: bool=None, widget_type: str=None, force_input: bool=None, extra_dict=None, raw_link: bool=None):
super().__init__(id, display_name, optional, tooltip, lazy, extra_dict, raw_link)
socketless: bool=None, widget_type: str=None, force_input: bool=None, extra_dict=None, raw_link: bool=None, advanced: bool=None):
super().__init__(id, display_name, optional, tooltip, lazy, extra_dict, raw_link, advanced)
self.default = default
self.socketless = socketless
self.widget_type = widget_type
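
Inputs opt into the collapsible advanced section by passing advanced=True; a brief sketch with a hypothetical input (the import alias is an assumption):

from comfy_api.latest import io as IO  # assumed import path

inputs = [
    IO.Int.Input("quality", default=80, min=0, max=100),
    # Hidden by default; the frontend groups it under "Advanced Inputs".
    IO.Int.Input("tile_columns", default=2, min=0, max=6, optional=True, advanced=True),
]
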
@@ -242,8 +244,8 @@ class Boolean(ComfyTypeIO):
'''Boolean input.'''
def __init__(self, id: str, display_name: str=None, optional=False, tooltip: str=None, lazy: bool=None,
default: bool=None, label_on: str=None, label_off: str=None,
socketless: bool=None, force_input: bool=None, extra_dict=None, raw_link: bool=None):
super().__init__(id, display_name, optional, tooltip, lazy, default, socketless, None, force_input, extra_dict, raw_link)
socketless: bool=None, force_input: bool=None, extra_dict=None, raw_link: bool=None, advanced: bool=None):
super().__init__(id, display_name, optional, tooltip, lazy, default, socketless, None, force_input, extra_dict, raw_link, advanced)
self.label_on = label_on
self.label_off = label_off
self.default: bool
@@ -262,8 +264,8 @@ class Int(ComfyTypeIO):
'''Integer input.'''
def __init__(self, id: str, display_name: str=None, optional=False, tooltip: str=None, lazy: bool=None,
default: int=None, min: int=None, max: int=None, step: int=None, control_after_generate: bool=None,
display_mode: NumberDisplay=None, socketless: bool=None, force_input: bool=None, extra_dict=None, raw_link: bool=None):
super().__init__(id, display_name, optional, tooltip, lazy, default, socketless, None, force_input, extra_dict, raw_link)
display_mode: NumberDisplay=None, socketless: bool=None, force_input: bool=None, extra_dict=None, raw_link: bool=None, advanced: bool=None):
super().__init__(id, display_name, optional, tooltip, lazy, default, socketless, None, force_input, extra_dict, raw_link, advanced)
self.min = min
self.max = max
self.step = step
@@ -288,8 +290,8 @@ class Float(ComfyTypeIO):
'''Float input.'''
def __init__(self, id: str, display_name: str=None, optional=False, tooltip: str=None, lazy: bool=None,
default: float=None, min: float=None, max: float=None, step: float=None, round: float=None,
display_mode: NumberDisplay=None, socketless: bool=None, force_input: bool=None, extra_dict=None, raw_link: bool=None):
super().__init__(id, display_name, optional, tooltip, lazy, default, socketless, None, force_input, extra_dict, raw_link)
display_mode: NumberDisplay=None, socketless: bool=None, force_input: bool=None, extra_dict=None, raw_link: bool=None, advanced: bool=None):
super().__init__(id, display_name, optional, tooltip, lazy, default, socketless, None, force_input, extra_dict, raw_link, advanced)
self.min = min
self.max = max
self.step = step
@@ -314,8 +316,8 @@ class String(ComfyTypeIO):
'''String input.'''
def __init__(self, id: str, display_name: str=None, optional=False, tooltip: str=None, lazy: bool=None,
multiline=False, placeholder: str=None, default: str=None, dynamic_prompts: bool=None,
socketless: bool=None, force_input: bool=None, extra_dict=None, raw_link: bool=None):
super().__init__(id, display_name, optional, tooltip, lazy, default, socketless, None, force_input, extra_dict, raw_link)
socketless: bool=None, force_input: bool=None, extra_dict=None, raw_link: bool=None, advanced: bool=None):
super().__init__(id, display_name, optional, tooltip, lazy, default, socketless, None, force_input, extra_dict, raw_link, advanced)
self.multiline = multiline
self.placeholder = placeholder
self.dynamic_prompts = dynamic_prompts
@@ -350,12 +352,13 @@ class Combo(ComfyTypeIO):
socketless: bool=None,
extra_dict=None,
raw_link: bool=None,
advanced: bool=None,
):
if isinstance(options, type) and issubclass(options, Enum):
options = [v.value for v in options]
if isinstance(default, Enum):
default = default.value
super().__init__(id, display_name, optional, tooltip, lazy, default, socketless, None, None, extra_dict, raw_link)
super().__init__(id, display_name, optional, tooltip, lazy, default, socketless, None, None, extra_dict, raw_link, advanced)
self.multiselect = False
self.options = options
self.control_after_generate = control_after_generate
@@ -387,8 +390,8 @@ class MultiCombo(ComfyTypeI):
class Input(Combo.Input):
def __init__(self, id: str, options: list[str], display_name: str=None, optional=False, tooltip: str=None, lazy: bool=None,
default: list[str]=None, placeholder: str=None, chip: bool=None, control_after_generate: bool=None,
socketless: bool=None, extra_dict=None, raw_link: bool=None):
super().__init__(id, options, display_name, optional, tooltip, lazy, default, control_after_generate, socketless=socketless, extra_dict=extra_dict, raw_link=raw_link)
socketless: bool=None, extra_dict=None, raw_link: bool=None, advanced: bool=None):
super().__init__(id, options, display_name, optional, tooltip, lazy, default, control_after_generate, socketless=socketless, extra_dict=extra_dict, raw_link=raw_link, advanced=advanced)
self.multiselect = True
self.placeholder = placeholder
self.chip = chip
@@ -421,9 +424,9 @@ class Webcam(ComfyTypeIO):
Type = str
def __init__(
self, id: str, display_name: str=None, optional=False,
tooltip: str=None, lazy: bool=None, default: str=None, socketless: bool=None, extra_dict=None, raw_link: bool=None
tooltip: str=None, lazy: bool=None, default: str=None, socketless: bool=None, extra_dict=None, raw_link: bool=None, advanced: bool=None
):
super().__init__(id, display_name, optional, tooltip, lazy, default, socketless, None, None, extra_dict, raw_link)
super().__init__(id, display_name, optional, tooltip, lazy, default, socketless, None, None, extra_dict, raw_link, advanced)
@comfytype(io_type="MASK")
@@ -776,7 +779,7 @@ class MultiType:
'''
Input that permits more than one input type; if `id` is an instance of `ComfyType.Input`, then that input will be used to create a widget (if applicable) with overridden values.
'''
def __init__(self, id: str | Input, types: list[type[_ComfyType] | _ComfyType], display_name: str=None, optional=False, tooltip: str=None, lazy: bool=None, extra_dict=None, raw_link: bool=None):
def __init__(self, id: str | Input, types: list[type[_ComfyType] | _ComfyType], display_name: str=None, optional=False, tooltip: str=None, lazy: bool=None, extra_dict=None, raw_link: bool=None, advanced: bool=None):
# if id is an Input, then use that Input with overridden values
self.input_override = None
if isinstance(id, Input):
@@ -789,7 +792,7 @@ class MultiType:
# if is a widget input, make sure widget_type is set appropriately
if isinstance(self.input_override, WidgetInput):
self.input_override.widget_type = self.input_override.get_io_type()
super().__init__(id, display_name, optional, tooltip, lazy, extra_dict, raw_link)
super().__init__(id, display_name, optional, tooltip, lazy, extra_dict, raw_link, advanced)
self._io_types = types
@property
@@ -843,8 +846,8 @@ class MatchType(ComfyTypeIO):
class Input(Input):
def __init__(self, id: str, template: MatchType.Template,
display_name: str=None, optional=False, tooltip: str=None, lazy: bool=None, extra_dict=None, raw_link: bool=None):
super().__init__(id, display_name, optional, tooltip, lazy, extra_dict, raw_link)
display_name: str=None, optional=False, tooltip: str=None, lazy: bool=None, extra_dict=None, raw_link: bool=None, advanced: bool=None):
super().__init__(id, display_name, optional, tooltip, lazy, extra_dict, raw_link, advanced)
self.template = template
def as_dict(self):
@@ -1119,8 +1122,8 @@ class ImageCompare(ComfyTypeI):
class Input(WidgetInput):
def __init__(self, id: str, display_name: str=None, optional=False, tooltip: str=None,
socketless: bool=True):
super().__init__(id, display_name, optional, tooltip, None, None, socketless)
socketless: bool=True, advanced: bool=None):
super().__init__(id, display_name, optional, tooltip, None, None, socketless, None, None, None, None, advanced)
def as_dict(self):
return super().as_dict()
@@ -1225,6 +1228,7 @@ class NodeInfoV1:
deprecated: bool=None
experimental: bool=None
api_node: bool=None
price_badge: dict | None = None
@dataclass
class NodeInfoV3:
@@ -1234,11 +1238,77 @@ class NodeInfoV3:
name: str=None
display_name: str=None
description: str=None
python_module: Any = None
category: str=None
output_node: bool=None
deprecated: bool=None
experimental: bool=None
api_node: bool=None
price_badge: dict | None = None
@dataclass
class PriceBadgeDepends:
widgets: list[str] = field(default_factory=list)
inputs: list[str] = field(default_factory=list)
input_groups: list[str] = field(default_factory=list)
def validate(self) -> None:
if not isinstance(self.widgets, list) or any(not isinstance(x, str) for x in self.widgets):
raise ValueError("PriceBadgeDepends.widgets must be a list[str].")
if not isinstance(self.inputs, list) or any(not isinstance(x, str) for x in self.inputs):
raise ValueError("PriceBadgeDepends.inputs must be a list[str].")
if not isinstance(self.input_groups, list) or any(not isinstance(x, str) for x in self.input_groups):
raise ValueError("PriceBadgeDepends.input_groups must be a list[str].")
def as_dict(self, schema_inputs: list["Input"]) -> dict[str, Any]:
# Build lookup: widget_id -> io_type
input_types: dict[str, str] = {}
for inp in schema_inputs:
all_inputs = inp.get_all()
input_types[inp.id] = inp.get_io_type() # First input is always the parent itself
for nested_inp in all_inputs[1:]:
# For DynamicCombo/DynamicSlot, nested inputs are prefixed with parent ID
# to match frontend naming convention (e.g., "should_texture.enable_pbr")
prefixed_id = f"{inp.id}.{nested_inp.id}"
input_types[prefixed_id] = nested_inp.get_io_type()
# Enrich widgets with type information, raising error for unknown widgets
widgets_data: list[dict[str, str]] = []
for w in self.widgets:
if w not in input_types:
raise ValueError(
f"PriceBadge depends_on.widgets references unknown widget '{w}'. "
f"Available widgets: {list(input_types.keys())}"
)
widgets_data.append({"name": w, "type": input_types[w]})
return {
"widgets": widgets_data,
"inputs": self.inputs,
"input_groups": self.input_groups,
}
@dataclass
class PriceBadge:
expr: str
depends_on: PriceBadgeDepends = field(default_factory=PriceBadgeDepends)
engine: str = field(default="jsonata")
def validate(self) -> None:
if self.engine != "jsonata":
raise ValueError(f"Unsupported PriceBadge.engine '{self.engine}'. Only 'jsonata' is supported.")
if not isinstance(self.expr, str) or not self.expr.strip():
raise ValueError("PriceBadge.expr must be a non-empty string.")
self.depends_on.validate()
def as_dict(self, schema_inputs: list["Input"]) -> dict[str, Any]:
return {
"engine": self.engine,
"depends_on": self.depends_on.as_dict(schema_inputs),
"expr": self.expr,
}
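
For reference, a minimal sketch of how a badge declaration serializes (the import alias is an assumption; concrete node-level declarations appear in the API-node diffs further down):

from comfy_api.latest import io as IO  # assumed import path

badge = IO.PriceBadge(
    expr='{"type":"usd","usd": 0.01 * widgets.steps}',
    depends_on=IO.PriceBadgeDepends(widgets=["steps"]),
)
badge.validate()  # rejects a non-jsonata engine or an empty expression

# as_dict() takes the schema's inputs so each widget name can be paired with its io type;
# for an Int input with id "steps" it would emit something like:
# {"engine": "jsonata",
#  "depends_on": {"widgets": [{"name": "steps", "type": "INT"}], "inputs": [], "input_groups": []},
#  "expr": '{"type":"usd","usd": 0.01 * widgets.steps}'}
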
@dataclass
@@ -1284,6 +1354,8 @@ class Schema:
"""Flags a node as experimental, informing users that it may change or not work as expected."""
is_api_node: bool=False
"""Flags a node as an API node. See: https://docs.comfy.org/tutorials/api-nodes/overview."""
price_badge: PriceBadge | None = None
"""Optional client-evaluated pricing badge declaration for this node."""
not_idempotent: bool=False
"""Flags a node as not idempotent; when True, the node will run and not reuse the cached outputs when identical inputs are provided on a different node in the graph."""
enable_expand: bool=False
@@ -1314,6 +1386,8 @@ class Schema:
input.validate()
for output in self.outputs:
output.validate()
if self.price_badge is not None:
self.price_badge.validate()
def finalize(self):
"""Add hidden based on selected schema options, and give outputs without ids default ids."""
@@ -1387,7 +1461,8 @@ class Schema:
deprecated=self.is_deprecated,
experimental=self.is_experimental,
api_node=self.is_api_node,
python_module=getattr(cls, "RELATIVE_PYTHON_MODULE", "nodes")
python_module=getattr(cls, "RELATIVE_PYTHON_MODULE", "nodes"),
price_badge=self.price_badge.as_dict(self.inputs) if self.price_badge is not None else None,
)
return info
@@ -1419,7 +1494,8 @@ class Schema:
deprecated=self.is_deprecated,
experimental=self.is_experimental,
api_node=self.is_api_node,
python_module=getattr(cls, "RELATIVE_PYTHON_MODULE", "nodes")
python_module=getattr(cls, "RELATIVE_PYTHON_MODULE", "nodes"),
price_badge=self.price_badge.as_dict(self.inputs) if self.price_badge is not None else None,
)
return info
@@ -1971,4 +2047,6 @@ __all__ = [
"add_to_dict_v3",
"V3Data",
"ImageCompare",
"PriceBadgeDepends",
"PriceBadge",
]

View File

@@ -1,4 +1,4 @@
from .video_types import VideoContainer, VideoCodec, VideoComponents
from .video_types import VideoContainer, VideoCodec, VideoComponents, VideoSpeedPreset, quality_to_crf
from .geometry_types import VOXEL, MESH
from .image_types import SVG
@@ -7,6 +7,8 @@ __all__ = [
"VideoContainer",
"VideoCodec",
"VideoComponents",
"VideoSpeedPreset",
"quality_to_crf",
"VOXEL",
"MESH",
"SVG",

View File

@@ -8,6 +8,7 @@ from .._input import ImageInput, AudioInput
class VideoCodec(str, Enum):
AUTO = "auto"
H264 = "h264"
VP9 = "vp9"
@classmethod
def as_input(cls) -> list[str]:
@@ -16,9 +17,11 @@ class VideoCodec(str, Enum):
"""
return [member.value for member in cls]
class VideoContainer(str, Enum):
AUTO = "auto"
MP4 = "mp4"
WEBM = "webm"
@classmethod
def as_input(cls) -> list[str]:
@@ -36,8 +39,71 @@ class VideoContainer(str, Enum):
value = cls(value)
if value == VideoContainer.MP4 or value == VideoContainer.AUTO:
return "mp4"
if value == VideoContainer.WEBM:
return "webm"
return ""
class VideoSpeedPreset(str, Enum):
"""Encoding speed presets - slower = better compression at same quality."""
AUTO = "auto"
FASTEST = "Fastest"
FAST = "Fast"
BALANCED = "Balanced"
QUALITY = "Quality"
BEST = "Best"
@classmethod
def as_input(cls) -> list[str]:
return [member.value for member in cls]
def to_ffmpeg_preset(self, codec: str = "h264") -> str:
"""Convert to FFmpeg preset string for the given codec."""
h264_map = {
VideoSpeedPreset.FASTEST: "ultrafast",
VideoSpeedPreset.FAST: "veryfast",
VideoSpeedPreset.BALANCED: "medium",
VideoSpeedPreset.QUALITY: "slow",
VideoSpeedPreset.BEST: "veryslow",
VideoSpeedPreset.AUTO: "medium",
}
vp9_map = {
VideoSpeedPreset.FASTEST: "0",
VideoSpeedPreset.FAST: "1",
VideoSpeedPreset.BALANCED: "2",
VideoSpeedPreset.QUALITY: "3",
VideoSpeedPreset.BEST: "4",
VideoSpeedPreset.AUTO: "2",
}
if codec in ("vp9", "libvpx-vp9"):
return vp9_map.get(self, "2")
return h264_map.get(self, "medium")
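
The same preset resolves to different FFmpeg knobs per codec, for example (import path is an assumption matching the re-export shown above):

from comfy_api.latest._util import VideoSpeedPreset  # assumed module path

VideoSpeedPreset("Balanced").to_ffmpeg_preset("libx264")   # -> "medium"
VideoSpeedPreset.QUALITY.to_ffmpeg_preset("libvpx-vp9")    # -> "3" (passed as cpu-used)
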
def quality_to_crf(quality: int, codec: str = "h264") -> int:
"""
Map 0-100 quality percentage to codec-appropriate CRF value.
Args:
quality: 0-100 where 100 is best quality
codec: The codec being used (h264, vp9, etc.)
Returns:
CRF value appropriate for the codec
"""
quality = max(0, min(100, quality))
if codec in ("h264", "libx264"):
# h264: CRF 0-51 (lower = better), typical range 12-40
# quality 100 → CRF 12, quality 0 → CRF 40
return int(40 - (quality / 100) * 28)
elif codec in ("vp9", "libvpx-vp9"):
# vp9: CRF 0-63 (lower = better), typical range 15-50
# quality 100 → CRF 15, quality 0 → CRF 50
return int(50 - (quality / 100) * 35)
# Default fallback
return 23
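
Worked examples of the quality-to-CRF mapping (same assumed import path as above):

from comfy_api.latest._util import quality_to_crf  # assumed module path

quality_to_crf(100, "libx264")    # -> 12 (best-quality end of the h264 range)
quality_to_crf(50, "libx264")     # -> 26
quality_to_crf(80, "libvpx-vp9")  # -> 22
quality_to_crf(0, "libvpx-vp9")   # -> 50 (worst-quality end of the vp9 range)
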
@dataclass
class VideoComponents:
"""

View File

@@ -65,11 +65,13 @@ class TaskImageContent(BaseModel):
class Text2VideoTaskCreationRequest(BaseModel):
model: str = Field(...)
content: list[TaskTextContent] = Field(..., min_length=1)
generate_audio: bool | None = Field(...)
class Image2VideoTaskCreationRequest(BaseModel):
model: str = Field(...)
content: list[TaskTextContent | TaskImageContent] = Field(..., min_length=2)
generate_audio: bool | None = Field(...)
class TaskCreationResponse(BaseModel):
@@ -141,4 +143,9 @@ VIDEO_TASKS_EXECUTION_TIME = {
"720p": 65,
"1080p": 100,
},
"seedance-1-5-pro-251215": {
"480p": 80,
"720p": 100,
"1080p": 150,
},
}

View File

@@ -0,0 +1,160 @@
from typing import TypedDict
from pydantic import BaseModel, Field
from comfy_api.latest import Input
class InputShouldRemesh(TypedDict):
should_remesh: str
topology: str
target_polycount: int
class InputShouldTexture(TypedDict):
should_texture: str
enable_pbr: bool
texture_prompt: str
texture_image: Input.Image | None
class MeshyTaskResponse(BaseModel):
result: str = Field(...)
class MeshyTextToModelRequest(BaseModel):
mode: str = Field("preview")
prompt: str = Field(..., max_length=600)
art_style: str = Field(..., description="'realistic' or 'sculpture'")
ai_model: str = Field(...)
topology: str | None = Field(..., description="'quad' or 'triangle'")
target_polycount: int | None = Field(..., ge=100, le=300000)
should_remesh: bool = Field(
True,
description="False returns the original mesh, ignoring topology and polycount.",
)
symmetry_mode: str = Field(..., description="'auto', 'off' or 'on'")
pose_mode: str = Field(...)
seed: int = Field(...)
moderation: bool = Field(False)
class MeshyRefineTask(BaseModel):
mode: str = Field("refine")
preview_task_id: str = Field(...)
enable_pbr: bool | None = Field(...)
texture_prompt: str | None = Field(...)
texture_image_url: str | None = Field(...)
ai_model: str = Field(...)
moderation: bool = Field(False)
class MeshyImageToModelRequest(BaseModel):
image_url: str = Field(...)
ai_model: str = Field(...)
topology: str | None = Field(..., description="'quad' or 'triangle'")
target_polycount: int | None = Field(..., ge=100, le=300000)
symmetry_mode: str = Field(..., description="'auto', 'off' or 'on'")
should_remesh: bool = Field(
True,
description="False returns the original mesh, ignoring topology and polycount.",
)
should_texture: bool = Field(...)
enable_pbr: bool | None = Field(...)
pose_mode: str = Field(...)
texture_prompt: str | None = Field(None, max_length=600)
texture_image_url: str | None = Field(None)
seed: int = Field(...)
moderation: bool = Field(False)
class MeshyMultiImageToModelRequest(BaseModel):
image_urls: list[str] = Field(...)
ai_model: str = Field(...)
topology: str | None = Field(..., description="'quad' or 'triangle'")
target_polycount: int | None = Field(..., ge=100, le=300000)
symmetry_mode: str = Field(..., description="'auto', 'off' or 'on'")
should_remesh: bool = Field(
True,
description="False returns the original mesh, ignoring topology and polycount.",
)
should_texture: bool = Field(...)
enable_pbr: bool | None = Field(...)
pose_mode: str = Field(...)
texture_prompt: str | None = Field(None, max_length=600)
texture_image_url: str | None = Field(None)
seed: int = Field(...)
moderation: bool = Field(False)
class MeshyRiggingRequest(BaseModel):
input_task_id: str = Field(...)
height_meters: float = Field(...)
texture_image_url: str | None = Field(...)
class MeshyAnimationRequest(BaseModel):
rig_task_id: str = Field(...)
action_id: int = Field(...)
class MeshyTextureRequest(BaseModel):
input_task_id: str = Field(...)
ai_model: str = Field(...)
enable_original_uv: bool = Field(...)
enable_pbr: bool = Field(...)
text_style_prompt: str | None = Field(...)
image_style_url: str | None = Field(...)
class MeshyModelsUrls(BaseModel):
glb: str = Field("")
class MeshyRiggedModelsUrls(BaseModel):
rigged_character_glb_url: str = Field("")
class MeshyAnimatedModelsUrls(BaseModel):
animation_glb_url: str = Field("")
class MeshyResultTextureUrls(BaseModel):
base_color: str = Field(...)
metallic: str | None = Field(None)
normal: str | None = Field(None)
roughness: str | None = Field(None)
class MeshyTaskError(BaseModel):
message: str | None = Field(None)
class MeshyModelResult(BaseModel):
id: str = Field(...)
type: str = Field(...)
model_urls: MeshyModelsUrls = Field(MeshyModelsUrls())
thumbnail_url: str = Field(...)
video_url: str | None = Field(None)
status: str = Field(...)
progress: int = Field(0)
texture_urls: list[MeshyResultTextureUrls] | None = Field([])
task_error: MeshyTaskError | None = Field(None)
class MeshyRiggedResult(BaseModel):
id: str = Field(...)
type: str = Field(...)
status: str = Field(...)
progress: int = Field(0)
result: MeshyRiggedModelsUrls = Field(MeshyRiggedModelsUrls())
task_error: MeshyTaskError | None = Field(None)
class MeshyAnimationResult(BaseModel):
id: str = Field(...)
type: str = Field(...)
status: str = Field(...)
progress: int = Field(0)
result: MeshyAnimatedModelsUrls = Field(MeshyAnimatedModelsUrls())
task_error: MeshyTaskError | None = Field(None)

View File

@@ -0,0 +1,41 @@
from pydantic import BaseModel, Field
class SubjectReference(BaseModel):
id: str = Field(...)
images: list[str] = Field(...)
class TaskCreationRequest(BaseModel):
model: str = Field(...)
prompt: str = Field(..., max_length=2000)
duration: int = Field(...)
seed: int = Field(..., ge=0, le=2147483647)
aspect_ratio: str | None = Field(None)
resolution: str | None = Field(None)
movement_amplitude: str | None = Field(None)
images: list[str] | None = Field(None, description="Base64 encoded string or image URL")
subjects: list[SubjectReference] | None = Field(None)
bgm: bool | None = Field(None)
audio: bool | None = Field(None)
class TaskCreationResponse(BaseModel):
task_id: str = Field(...)
state: str = Field(...)
created_at: str = Field(...)
code: int | None = Field(None, description="Error code")
class TaskResult(BaseModel):
id: str = Field(..., description="Creation id")
url: str = Field(..., description="The URL of the generated results, valid for one hour")
cover_url: str = Field(..., description="The cover URL of the generated results, valid for one hour")
class TaskStatusResponse(BaseModel):
state: str = Field(...)
err_code: str | None = Field(None)
progress: float | None = Field(None)
credits: int | None = Field(None)
creations: list[TaskResult] = Field(..., description="Generated results")

View File

@@ -97,6 +97,9 @@ class FluxProUltraImageNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.06}""",
),
)
@classmethod
@@ -352,6 +355,9 @@ class FluxProExpandNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.05}""",
),
)
@classmethod
@@ -458,6 +464,9 @@ class FluxProFillNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.05}""",
),
)
@classmethod
@@ -511,6 +520,21 @@ class Flux2ProImageNode(IO.ComfyNode):
NODE_ID = "Flux2ProImageNode"
DISPLAY_NAME = "Flux.2 [pro] Image"
API_ENDPOINT = "/proxy/bfl/flux-2-pro/generate"
PRICE_BADGE_EXPR = """
(
$MP := 1024 * 1024;
$outMP := $max([1, $floor(((widgets.width * widgets.height) + $MP - 1) / $MP)]);
$outputCost := 0.03 + 0.015 * ($outMP - 1);
inputs.images.connected
? {
"type":"range_usd",
"min_usd": $outputCost + 0.015,
"max_usd": $outputCost + 0.12,
"format": { "approximate": true }
}
: {"type":"usd","usd": $outputCost}
)
"""
@classmethod
def define_schema(cls) -> IO.Schema:
@@ -563,6 +587,10 @@ class Flux2ProImageNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["width", "height"], inputs=["images"]),
expr=cls.PRICE_BADGE_EXPR,
),
)
@classmethod
@@ -623,6 +651,22 @@ class Flux2MaxImageNode(Flux2ProImageNode):
NODE_ID = "Flux2MaxImageNode"
DISPLAY_NAME = "Flux.2 [max] Image"
API_ENDPOINT = "/proxy/bfl/flux-2-max/generate"
PRICE_BADGE_EXPR = """
(
$MP := 1024 * 1024;
$outMP := $max([1, $floor(((widgets.width * widgets.height) + $MP - 1) / $MP)]);
$outputCost := 0.07 + 0.03 * ($outMP - 1);
inputs.images.connected
? {
"type":"range_usd",
"min_usd": $outputCost + 0.03,
"max_usd": $outputCost + 0.24,
"format": { "approximate": true }
}
: {"type":"usd","usd": $outputCost}
)
"""
class BFLExtension(ComfyExtension):

View File

@@ -126,6 +126,9 @@ class ByteDanceImageNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.03}""",
),
)
@classmethod
@@ -367,6 +370,19 @@ class ByteDanceSeedreamNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["model"]),
expr="""
(
$price := $contains(widgets.model, "seedream-4-5-251128") ? 0.04 : 0.03;
{
"type":"usd",
"usd": $price,
"format": { "suffix":" x images/Run", "approximate": true }
}
)
""",
),
)
@classmethod
@@ -461,7 +477,12 @@ class ByteDanceTextToVideoNode(IO.ComfyNode):
inputs=[
IO.Combo.Input(
"model",
options=["seedance-1-0-pro-250528", "seedance-1-0-lite-t2v-250428", "seedance-1-0-pro-fast-251015"],
options=[
"seedance-1-5-pro-251215",
"seedance-1-0-pro-250528",
"seedance-1-0-lite-t2v-250428",
"seedance-1-0-pro-fast-251015",
],
default="seedance-1-0-pro-fast-251015",
),
IO.String.Input(
@@ -512,6 +533,12 @@ class ByteDanceTextToVideoNode(IO.ComfyNode):
tooltip='Whether to add an "AI generated" watermark to the video.',
optional=True,
),
IO.Boolean.Input(
"generate_audio",
default=False,
tooltip="This parameter is ignored for any model except seedance-1-5-pro.",
optional=True,
),
],
outputs=[
IO.Video.Output(),
@@ -522,6 +549,7 @@ class ByteDanceTextToVideoNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=PRICE_BADGE_VIDEO,
)
@classmethod
@@ -535,7 +563,10 @@ class ByteDanceTextToVideoNode(IO.ComfyNode):
seed: int,
camera_fixed: bool,
watermark: bool,
generate_audio: bool = False,
) -> IO.NodeOutput:
if model == "seedance-1-5-pro-251215" and duration < 4:
raise ValueError("Minimum supported duration for Seedance 1.5 Pro is 4 seconds.")
validate_string(prompt, strip_whitespace=True, min_length=1)
raise_if_text_params(prompt, ["resolution", "ratio", "duration", "seed", "camerafixed", "watermark"])
@@ -550,7 +581,11 @@ class ByteDanceTextToVideoNode(IO.ComfyNode):
)
return await process_video_task(
cls,
payload=Text2VideoTaskCreationRequest(model=model, content=[TaskTextContent(text=prompt)]),
payload=Text2VideoTaskCreationRequest(
model=model,
content=[TaskTextContent(text=prompt)],
generate_audio=generate_audio if model == "seedance-1-5-pro-251215" else None,
),
estimated_duration=max(1, math.ceil(VIDEO_TASKS_EXECUTION_TIME[model][resolution] * (duration / 10.0))),
)
@@ -567,7 +602,12 @@ class ByteDanceImageToVideoNode(IO.ComfyNode):
inputs=[
IO.Combo.Input(
"model",
options=["seedance-1-0-pro-250528", "seedance-1-0-lite-t2v-250428", "seedance-1-0-pro-fast-251015"],
options=[
"seedance-1-5-pro-251215",
"seedance-1-0-pro-250528",
"seedance-1-0-lite-i2v-250428",
"seedance-1-0-pro-fast-251015",
],
default="seedance-1-0-pro-fast-251015",
),
IO.String.Input(
@@ -622,6 +662,12 @@ class ByteDanceImageToVideoNode(IO.ComfyNode):
tooltip='Whether to add an "AI generated" watermark to the video.',
optional=True,
),
IO.Boolean.Input(
"generate_audio",
default=False,
tooltip="This parameter is ignored for any model except seedance-1-5-pro.",
optional=True,
),
],
outputs=[
IO.Video.Output(),
@@ -632,6 +678,7 @@ class ByteDanceImageToVideoNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=PRICE_BADGE_VIDEO,
)
@classmethod
@@ -646,7 +693,10 @@ class ByteDanceImageToVideoNode(IO.ComfyNode):
seed: int,
camera_fixed: bool,
watermark: bool,
generate_audio: bool = False,
) -> IO.NodeOutput:
if model == "seedance-1-5-pro-251215" and duration < 4:
raise ValueError("Minimum supported duration for Seedance 1.5 Pro is 4 seconds.")
validate_string(prompt, strip_whitespace=True, min_length=1)
raise_if_text_params(prompt, ["resolution", "ratio", "duration", "seed", "camerafixed", "watermark"])
validate_image_dimensions(image, min_width=300, min_height=300, max_width=6000, max_height=6000)
@@ -668,6 +718,7 @@ class ByteDanceImageToVideoNode(IO.ComfyNode):
payload=Image2VideoTaskCreationRequest(
model=model,
content=[TaskTextContent(text=prompt), TaskImageContent(image_url=TaskImageContentUrl(url=image_url))],
generate_audio=generate_audio if model == "seedance-1-5-pro-251215" else None,
),
estimated_duration=max(1, math.ceil(VIDEO_TASKS_EXECUTION_TIME[model][resolution] * (duration / 10.0))),
)
@@ -685,7 +736,7 @@ class ByteDanceFirstLastFrameNode(IO.ComfyNode):
inputs=[
IO.Combo.Input(
"model",
options=["seedance-1-0-pro-250528", "seedance-1-0-lite-i2v-250428"],
options=["seedance-1-5-pro-251215", "seedance-1-0-pro-250528", "seedance-1-0-lite-i2v-250428"],
default="seedance-1-0-lite-i2v-250428",
),
IO.String.Input(
@@ -744,6 +795,12 @@ class ByteDanceFirstLastFrameNode(IO.ComfyNode):
tooltip='Whether to add an "AI generated" watermark to the video.',
optional=True,
),
IO.Boolean.Input(
"generate_audio",
default=False,
tooltip="This parameter is ignored for any model except seedance-1-5-pro.",
optional=True,
),
],
outputs=[
IO.Video.Output(),
@@ -754,6 +811,7 @@ class ByteDanceFirstLastFrameNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=PRICE_BADGE_VIDEO,
)
@classmethod
@@ -769,7 +827,10 @@ class ByteDanceFirstLastFrameNode(IO.ComfyNode):
seed: int,
camera_fixed: bool,
watermark: bool,
generate_audio: bool = False,
) -> IO.NodeOutput:
if model == "seedance-1-5-pro-251215" and duration < 4:
raise ValueError("Minimum supported duration for Seedance 1.5 Pro is 4 seconds.")
validate_string(prompt, strip_whitespace=True, min_length=1)
raise_if_text_params(prompt, ["resolution", "ratio", "duration", "seed", "camerafixed", "watermark"])
for i in (first_frame, last_frame):
@@ -802,6 +863,7 @@ class ByteDanceFirstLastFrameNode(IO.ComfyNode):
TaskImageContent(image_url=TaskImageContentUrl(url=str(download_urls[0])), role="first_frame"),
TaskImageContent(image_url=TaskImageContentUrl(url=str(download_urls[1])), role="last_frame"),
],
generate_audio=generate_audio if model == "seedance-1-5-pro-251215" else None,
),
estimated_duration=max(1, math.ceil(VIDEO_TASKS_EXECUTION_TIME[model][resolution] * (duration / 10.0))),
)
@@ -877,6 +939,41 @@ class ByteDanceImageReferenceNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["model", "duration", "resolution"]),
expr="""
(
$priceByModel := {
"seedance-1-0-pro": {
"480p":[0.23,0.24],
"720p":[0.51,0.56]
},
"seedance-1-0-lite": {
"480p":[0.17,0.18],
"720p":[0.37,0.41]
}
};
$model := widgets.model;
$modelKey :=
$contains($model, "seedance-1-0-pro") ? "seedance-1-0-pro" :
"seedance-1-0-lite";
$resolution := widgets.resolution;
$resKey :=
$contains($resolution, "720") ? "720p" :
"480p";
$modelPrices := $lookup($priceByModel, $modelKey);
$baseRange := $lookup($modelPrices, $resKey);
$min10s := $baseRange[0];
$max10s := $baseRange[1];
$scale := widgets.duration / 10;
$minCost := $min10s * $scale;
$maxCost := $max10s * $scale;
($minCost = $maxCost)
? {"type":"usd","usd": $minCost}
: {"type":"range_usd","min_usd": $minCost, "max_usd": $maxCost}
)
""",
),
)
@classmethod
@@ -946,6 +1043,59 @@ def raise_if_text_params(prompt: str, text_params: list[str]) -> None:
)
PRICE_BADGE_VIDEO = IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["model", "duration", "resolution", "generate_audio"]),
expr="""
(
$priceByModel := {
"seedance-1-5-pro": {
"480p":[0.12,0.12],
"720p":[0.26,0.26],
"1080p":[0.58,0.59]
},
"seedance-1-0-pro": {
"480p":[0.23,0.24],
"720p":[0.51,0.56],
"1080p":[1.18,1.22]
},
"seedance-1-0-pro-fast": {
"480p":[0.09,0.1],
"720p":[0.21,0.23],
"1080p":[0.47,0.49]
},
"seedance-1-0-lite": {
"480p":[0.17,0.18],
"720p":[0.37,0.41],
"1080p":[0.85,0.88]
}
};
$model := widgets.model;
$modelKey :=
$contains($model, "seedance-1-5-pro") ? "seedance-1-5-pro" :
$contains($model, "seedance-1-0-pro-fast") ? "seedance-1-0-pro-fast" :
$contains($model, "seedance-1-0-pro") ? "seedance-1-0-pro" :
"seedance-1-0-lite";
$resolution := widgets.resolution;
$resKey :=
$contains($resolution, "1080") ? "1080p" :
$contains($resolution, "720") ? "720p" :
"480p";
$modelPrices := $lookup($priceByModel, $modelKey);
$baseRange := $lookup($modelPrices, $resKey);
$min10s := $baseRange[0];
$max10s := $baseRange[1];
$scale := widgets.duration / 10;
$audioMultiplier := ($modelKey = "seedance-1-5-pro" and widgets.generate_audio) ? 2 : 1;
$minCost := $min10s * $scale * $audioMultiplier;
$maxCost := $max10s * $scale * $audioMultiplier;
($minCost = $maxCost)
? {"type":"usd","usd": $minCost, "format": { "approximate": true }}
: {"type":"range_usd","min_usd": $minCost, "max_usd": $maxCost, "format": { "approximate": true }}
)
""",
)
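
As a sanity check on the table above: a seedance-1-5-pro run at 720p scales from the 10-second base of $0.26, so a 10 s clip shows roughly $0.26, and only for that model does enabling generate_audio double it to roughly $0.52; the other model families keep their own min/max ranges and ignore the audio multiplier.
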
class ByteDanceExtension(ComfyExtension):
@override
async def get_node_list(self) -> list[type[IO.ComfyNode]]:

View File

@@ -130,7 +130,7 @@ def get_parts_by_type(response: GeminiGenerateContentResponse, part_type: Litera
Returns:
List of response parts matching the requested type.
"""
if response.candidates is None:
if not response.candidates:
if response.promptFeedback and response.promptFeedback.blockReason:
feedback = response.promptFeedback
raise ValueError(
@@ -141,14 +141,24 @@ def get_parts_by_type(response: GeminiGenerateContentResponse, part_type: Litera
"try changing it to `IMAGE+TEXT` to view the model's reasoning and understand why image generation failed."
)
parts = []
for part in response.candidates[0].content.parts:
if part_type == "text" and part.text:
parts.append(part)
elif part.inlineData and part.inlineData.mimeType == part_type:
parts.append(part)
elif part.fileData and part.fileData.mimeType == part_type:
parts.append(part)
# Skip parts that don't match the requested type
blocked_reasons = []
for candidate in response.candidates:
if candidate.finishReason and candidate.finishReason.upper() == "IMAGE_PROHIBITED_CONTENT":
blocked_reasons.append(candidate.finishReason)
continue
if candidate.content is None or candidate.content.parts is None:
continue
for part in candidate.content.parts:
if part_type == "text" and part.text:
parts.append(part)
elif part.inlineData and part.inlineData.mimeType == part_type:
parts.append(part)
elif part.fileData and part.fileData.mimeType == part_type:
parts.append(part)
if not parts and blocked_reasons:
raise ValueError(f"Gemini API blocked the request. Reasons: {blocked_reasons}")
return parts
@@ -309,6 +319,30 @@ class GeminiNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["model"]),
expr="""
(
$m := widgets.model;
$contains($m, "gemini-2.5-flash") ? {
"type": "list_usd",
"usd": [0.0003, 0.0025],
"format": { "approximate": true, "separator": "-", "suffix": " per 1K tokens"}
}
: $contains($m, "gemini-2.5-pro") ? {
"type": "list_usd",
"usd": [0.00125, 0.01],
"format": { "approximate": true, "separator": "-", "suffix": " per 1K tokens" }
}
: $contains($m, "gemini-3-pro-preview") ? {
"type": "list_usd",
"usd": [0.002, 0.012],
"format": { "approximate": true, "separator": "-", "suffix": " per 1K tokens" }
}
: {"type":"text", "text":"Token-based"}
)
""",
),
)
@classmethod
@@ -570,6 +604,9 @@ class GeminiImage(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.039,"format":{"suffix":"/Image (1K)","approximate":true}}""",
),
)
@classmethod
@@ -700,6 +737,19 @@ class GeminiImage2(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["resolution"]),
expr="""
(
$r := widgets.resolution;
($contains($r,"1k") or $contains($r,"2k"))
? {"type":"usd","usd":0.134,"format":{"suffix":"/Image","approximate":true}}
: $contains($r,"4k")
? {"type":"usd","usd":0.24,"format":{"suffix":"/Image","approximate":true}}
: {"type":"text","text":"Token-based"}
)
""",
),
)
@classmethod

View File

@@ -236,7 +236,6 @@ class IdeogramV1(IO.ComfyNode):
display_name="Ideogram V1",
category="api node/image/Ideogram",
description="Generates images using the Ideogram V1 model.",
is_api_node=True,
inputs=[
IO.String.Input(
"prompt",
@@ -298,6 +297,17 @@ class IdeogramV1(IO.ComfyNode):
IO.Hidden.api_key_comfy_org,
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["num_images", "turbo"]),
expr="""
(
$n := widgets.num_images;
$base := (widgets.turbo = true) ? 0.0286 : 0.0858;
{"type":"usd","usd": $round($base * $n, 2)}
)
""",
),
)
@classmethod
@@ -351,7 +361,6 @@ class IdeogramV2(IO.ComfyNode):
display_name="Ideogram V2",
category="api node/image/Ideogram",
description="Generates images using the Ideogram V2 model.",
is_api_node=True,
inputs=[
IO.String.Input(
"prompt",
@@ -436,6 +445,17 @@ class IdeogramV2(IO.ComfyNode):
IO.Hidden.api_key_comfy_org,
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["num_images", "turbo"]),
expr="""
(
$n := widgets.num_images;
$base := (widgets.turbo = true) ? 0.0715 : 0.1144;
{"type":"usd","usd": $round($base * $n, 2)}
)
""",
),
)
@classmethod
@@ -506,7 +526,6 @@ class IdeogramV3(IO.ComfyNode):
category="api node/image/Ideogram",
description="Generates images using the Ideogram V3 model. "
"Supports both regular image generation from text prompts and image editing with mask.",
is_api_node=True,
inputs=[
IO.String.Input(
"prompt",
@@ -591,6 +610,23 @@ class IdeogramV3(IO.ComfyNode):
IO.Hidden.api_key_comfy_org,
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["rendering_speed", "num_images"], inputs=["character_image"]),
expr="""
(
$n := widgets.num_images;
$speed := widgets.rendering_speed;
$hasChar := inputs.character_image.connected;
$base :=
$contains($speed,"quality") ? ($hasChar ? 0.286 : 0.1287) :
$contains($speed,"default") ? ($hasChar ? 0.2145 : 0.0858) :
$contains($speed,"turbo") ? ($hasChar ? 0.143 : 0.0429) :
0.0858;
{"type":"usd","usd": $round($base * $n, 2)}
)
""",
),
)
@classmethod

View File

@@ -567,7 +567,7 @@ async def execute_lipsync(
# Upload the audio file to Comfy API and get download URL
if audio:
audio_url = await upload_audio_to_comfyapi(
cls, audio, container_format="mp3", codec_name="libmp3lame", mime_type="audio/mpeg", filename="output.mp3"
cls, audio, container_format="mp3", codec_name="libmp3lame", mime_type="audio/mpeg"
)
logging.info("Uploaded audio to Comfy API. URL: %s", audio_url)
else:
@@ -764,6 +764,33 @@ class KlingTextToVideoNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["mode"]),
expr="""
(
$m := widgets.mode;
$contains($m,"v2-5-turbo")
? ($contains($m,"10") ? {"type":"usd","usd":0.7} : {"type":"usd","usd":0.35})
: $contains($m,"v2-1-master")
? ($contains($m,"10s") ? {"type":"usd","usd":2.8} : {"type":"usd","usd":1.4})
: $contains($m,"v2-master")
? ($contains($m,"10s") ? {"type":"usd","usd":2.8} : {"type":"usd","usd":1.4})
: $contains($m,"v1-6")
? (
$contains($m,"pro")
? ($contains($m,"10s") ? {"type":"usd","usd":0.98} : {"type":"usd","usd":0.49})
: ($contains($m,"10s") ? {"type":"usd","usd":0.56} : {"type":"usd","usd":0.28})
)
: $contains($m,"v1")
? (
$contains($m,"pro")
? ($contains($m,"10s") ? {"type":"usd","usd":0.98} : {"type":"usd","usd":0.49})
: ($contains($m,"10s") ? {"type":"usd","usd":0.28} : {"type":"usd","usd":0.14})
)
: {"type":"usd","usd":0.14}
)
""",
),
)
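
Reading the branches above: a mode containing "v2-5-turbo" resolves to $0.70 for 10-second modes and $0.35 otherwise, "v2-1-master" and "v2-master" to $2.80/$1.40, the v1-6 and v1 pro tiers to $0.98/$0.49, and anything unmatched falls through to the $0.14 default.
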
@classmethod
@@ -818,6 +845,16 @@ class OmniProTextToVideoNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["duration", "resolution"]),
expr="""
(
$mode := (widgets.resolution = "720p") ? "std" : "pro";
$rates := {"std": 0.084, "pro": 0.112};
{"type":"usd","usd": $lookup($rates, $mode) * widgets.duration}
)
""",
),
)
@classmethod
@@ -886,6 +923,16 @@ class OmniProFirstLastFrameNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["duration", "resolution"]),
expr="""
(
$mode := (widgets.resolution = "720p") ? "std" : "pro";
$rates := {"std": 0.084, "pro": 0.112};
{"type":"usd","usd": $lookup($rates, $mode) * widgets.duration}
)
""",
),
)
@classmethod
@@ -981,6 +1028,16 @@ class OmniProImageToVideoNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["duration", "resolution"]),
expr="""
(
$mode := (widgets.resolution = "720p") ? "std" : "pro";
$rates := {"std": 0.084, "pro": 0.112};
{"type":"usd","usd": $lookup($rates, $mode) * widgets.duration}
)
""",
),
)
@classmethod
@@ -1056,6 +1113,16 @@ class OmniProVideoToVideoNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["duration", "resolution"]),
expr="""
(
$mode := (widgets.resolution = "720p") ? "std" : "pro";
$rates := {"std": 0.126, "pro": 0.168};
{"type":"usd","usd": $lookup($rates, $mode) * widgets.duration}
)
""",
),
)
@classmethod
@@ -1142,6 +1209,16 @@ class OmniProEditVideoNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["resolution"]),
expr="""
(
$mode := (widgets.resolution = "720p") ? "std" : "pro";
$rates := {"std": 0.126, "pro": 0.168};
{"type":"usd","usd": $lookup($rates, $mode), "format":{"suffix":"/second"}}
)
""",
),
)
@classmethod
@@ -1228,6 +1305,9 @@ class OmniProImageNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.028}""",
),
)
@classmethod
@@ -1313,6 +1393,9 @@ class KlingCameraControlT2VNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.14}""",
),
)
@classmethod
@@ -1375,6 +1458,33 @@ class KlingImage2VideoNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["mode", "model_name", "duration"]),
expr="""
(
$mode := widgets.mode;
$model := widgets.model_name;
$dur := widgets.duration;
$contains($model,"v2-5-turbo")
? ($contains($dur,"10") ? {"type":"usd","usd":0.7} : {"type":"usd","usd":0.35})
: ($contains($model,"v2-1-master") or $contains($model,"v2-master"))
? ($contains($dur,"10") ? {"type":"usd","usd":2.8} : {"type":"usd","usd":1.4})
: ($contains($model,"v2-1") or $contains($model,"v1-6") or $contains($model,"v1-5"))
? (
$contains($mode,"pro")
? ($contains($dur,"10") ? {"type":"usd","usd":0.98} : {"type":"usd","usd":0.49})
: ($contains($dur,"10") ? {"type":"usd","usd":0.56} : {"type":"usd","usd":0.28})
)
: $contains($model,"v1")
? (
$contains($mode,"pro")
? ($contains($dur,"10") ? {"type":"usd","usd":0.98} : {"type":"usd","usd":0.49})
: ($contains($dur,"10") ? {"type":"usd","usd":0.28} : {"type":"usd","usd":0.14})
)
: {"type":"usd","usd":0.14}
)
""",
),
)
@classmethod
@@ -1448,6 +1558,9 @@ class KlingCameraControlI2VNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.49}""",
),
)
@classmethod
@@ -1518,6 +1631,33 @@ class KlingStartEndFrameNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["mode"]),
expr="""
(
$m := widgets.mode;
$contains($m,"v2-5-turbo")
? ($contains($m,"10") ? {"type":"usd","usd":0.7} : {"type":"usd","usd":0.35})
: $contains($m,"v2-1")
? ($contains($m,"10s") ? {"type":"usd","usd":0.98} : {"type":"usd","usd":0.49})
: $contains($m,"v2-master")
? ($contains($m,"10s") ? {"type":"usd","usd":2.8} : {"type":"usd","usd":1.4})
: $contains($m,"v1-6")
? (
$contains($m,"pro")
? ($contains($m,"10s") ? {"type":"usd","usd":0.98} : {"type":"usd","usd":0.49})
: ($contains($m,"10s") ? {"type":"usd","usd":0.56} : {"type":"usd","usd":0.28})
)
: $contains($m,"v1")
? (
$contains($m,"pro")
? ($contains($m,"10s") ? {"type":"usd","usd":0.98} : {"type":"usd","usd":0.49})
: ($contains($m,"10s") ? {"type":"usd","usd":0.28} : {"type":"usd","usd":0.14})
)
: {"type":"usd","usd":0.14}
)
""",
),
)
@classmethod
@@ -1583,6 +1723,9 @@ class KlingVideoExtendNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.28}""",
),
)
@classmethod
@@ -1664,6 +1807,29 @@ class KlingDualCharacterVideoEffectNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["mode", "model_name", "duration"]),
expr="""
(
$mode := widgets.mode;
$model := widgets.model_name;
$dur := widgets.duration;
($contains($model,"v1-6") or $contains($model,"v1-5"))
? (
$contains($mode,"pro")
? ($contains($dur,"10") ? {"type":"usd","usd":0.98} : {"type":"usd","usd":0.49})
: ($contains($dur,"10") ? {"type":"usd","usd":0.56} : {"type":"usd","usd":0.28})
)
: $contains($model,"v1")
? (
$contains($mode,"pro")
? ($contains($dur,"10") ? {"type":"usd","usd":0.98} : {"type":"usd","usd":0.49})
: ($contains($dur,"10") ? {"type":"usd","usd":0.28} : {"type":"usd","usd":0.14})
)
: {"type":"usd","usd":0.14}
)
""",
),
)
@classmethod
@@ -1728,6 +1894,16 @@ class KlingSingleImageVideoEffectNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["effect_scene"]),
expr="""
(
($contains(widgets.effect_scene,"dizzydizzy") or $contains(widgets.effect_scene,"bloombloom"))
? {"type":"usd","usd":0.49}
: {"type":"usd","usd":0.28}
)
""",
),
)
@classmethod
@@ -1782,6 +1958,9 @@ class KlingLipSyncAudioToVideoNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.1,"format":{"approximate":true}}""",
),
)
@classmethod
@@ -1842,6 +2021,9 @@ class KlingLipSyncTextToVideoNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.1,"format":{"approximate":true}}""",
),
)
@classmethod
@@ -1892,6 +2074,9 @@ class KlingVirtualTryOnNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.7}""",
),
)
@classmethod
@@ -1991,6 +2176,19 @@ class KlingImageGenerationNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["model_name", "n"], inputs=["image"]),
expr="""
(
$m := widgets.model_name;
$base :=
$contains($m,"kling-v1-5")
? (inputs.image.connected ? 0.028 : 0.014)
: ($contains($m,"kling-v1") ? 0.0035 : 0.014);
{"type":"usd","usd": $base * widgets.n}
)
""",
),
)
@classmethod
@@ -2074,6 +2272,10 @@ class TextToVideoWithAudio(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["duration", "generate_audio"]),
expr="""{"type":"usd","usd": 0.07 * widgets.duration * (widgets.generate_audio ? 2 : 1)}""",
),
)
@classmethod
@@ -2138,6 +2340,10 @@ class ImageToVideoWithAudio(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["duration", "generate_audio"]),
expr="""{"type":"usd","usd": 0.07 * widgets.duration * (widgets.generate_audio ? 2 : 1)}""",
),
)
@classmethod
@@ -2218,6 +2424,15 @@ class MotionControl(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["mode"]),
expr="""
(
$prices := {"std": 0.07, "pro": 0.112};
{"type":"usd","usd": $lookup($prices, widgets.mode), "format":{"suffix":"/second"}}
)
""",
),
)
@classmethod

View File

@@ -28,6 +28,22 @@ class ExecuteTaskRequest(BaseModel):
image_uri: str | None = Field(None)
PRICE_BADGE = IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["model", "duration", "resolution"]),
expr="""
(
$prices := {
"ltx-2 (pro)": {"1920x1080":0.06,"2560x1440":0.12,"3840x2160":0.24},
"ltx-2 (fast)": {"1920x1080":0.04,"2560x1440":0.08,"3840x2160":0.16}
};
$modelPrices := $lookup($prices, $lowercase(widgets.model));
$pps := $lookup($modelPrices, widgets.resolution);
{"type":"usd","usd": $pps * widgets.duration}
)
""",
)
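# Editorial sketch, not part of this change: the shared badge above is a
# per-second lookup; an equivalent plain-Python form (names are hypothetical):
LTX2_PER_SECOND = {
    "ltx-2 (pro)": {"1920x1080": 0.06, "2560x1440": 0.12, "3840x2160": 0.24},
    "ltx-2 (fast)": {"1920x1080": 0.04, "2560x1440": 0.08, "3840x2160": 0.16},
}

def ltx2_price(model: str, resolution: str, duration: float) -> float:
    # e.g. ltx2_price("LTX-2 (Pro)", "1920x1080", 5) == 0.30
    return LTX2_PER_SECOND[model.lower()][resolution] * duration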
class TextToVideoNode(IO.ComfyNode):
@classmethod
def define_schema(cls):
@@ -69,6 +85,7 @@ class TextToVideoNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=PRICE_BADGE,
)
@classmethod
@@ -145,6 +162,7 @@ class ImageToVideoNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=PRICE_BADGE,
)
@classmethod

View File

@@ -189,6 +189,19 @@ class LumaImageGenerationNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["model"]),
expr="""
(
$m := widgets.model;
$contains($m,"photon-flash-1")
? {"type":"usd","usd":0.0027}
: $contains($m,"photon-1")
? {"type":"usd","usd":0.0104}
: {"type":"usd","usd":0.0246}
)
""",
),
)
@classmethod
@@ -303,6 +316,19 @@ class LumaImageModifyNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["model"]),
expr="""
(
$m := widgets.model;
$contains($m,"photon-flash-1")
? {"type":"usd","usd":0.0027}
: $contains($m,"photon-1")
? {"type":"usd","usd":0.0104}
: {"type":"usd","usd":0.0246}
)
""",
),
)
@classmethod
@@ -395,6 +421,7 @@ class LumaTextToVideoGenerationNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=PRICE_BADGE_VIDEO,
)
@classmethod
@@ -505,6 +532,8 @@ class LumaImageToVideoGenerationNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=PRICE_BADGE_VIDEO,
)
@classmethod
@@ -568,6 +597,53 @@ class LumaImageToVideoGenerationNode(IO.ComfyNode):
return LumaKeyframes(frame0=frame0, frame1=frame1)
PRICE_BADGE_VIDEO = IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["model", "resolution", "duration"]),
expr="""
(
$p := {
"ray-flash-2": {
"5s": {"4k":3.13,"1080p":0.79,"720p":0.34,"540p":0.2},
"9s": {"4k":5.65,"1080p":1.42,"720p":0.61,"540p":0.36}
},
"ray-2": {
"5s": {"4k":9.11,"1080p":2.27,"720p":1.02,"540p":0.57},
"9s": {"4k":16.4,"1080p":4.1,"720p":1.83,"540p":1.03}
}
};
$m := widgets.model;
$d := widgets.duration;
$r := widgets.resolution;
$modelKey :=
$contains($m,"ray-flash-2") ? "ray-flash-2" :
$contains($m,"ray-2") ? "ray-2" :
$contains($m,"ray-1-6") ? "ray-1-6" :
"other";
$durKey := $contains($d,"5s") ? "5s" : $contains($d,"9s") ? "9s" : "";
$resKey :=
$contains($r,"4k") ? "4k" :
$contains($r,"1080p") ? "1080p" :
$contains($r,"720p") ? "720p" :
$contains($r,"540p") ? "540p" : "";
$modelPrices := $lookup($p, $modelKey);
$durPrices := $lookup($modelPrices, $durKey);
$v := $lookup($durPrices, $resKey);
$price :=
($modelKey = "ray-1-6") ? 0.5 :
($modelKey = "other") ? 0.79 :
($exists($v) ? $v : 0.79);
{"type":"usd","usd": $price}
)
""",
)
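# Editorial sketch, not part of this change: plain-Python mirror of
# PRICE_BADGE_VIDEO above, assuming JSONata semantics for $contains/$lookup.
LUMA_VIDEO_PRICES = {
    "ray-flash-2": {
        "5s": {"4k": 3.13, "1080p": 0.79, "720p": 0.34, "540p": 0.2},
        "9s": {"4k": 5.65, "1080p": 1.42, "720p": 0.61, "540p": 0.36},
    },
    "ray-2": {
        "5s": {"4k": 9.11, "1080p": 2.27, "720p": 1.02, "540p": 0.57},
        "9s": {"4k": 16.4, "1080p": 4.1, "720p": 1.83, "540p": 1.03},
    },
}

def luma_video_price(model: str, duration: str, resolution: str) -> float:
    if "ray-flash-2" in model:
        model_key = "ray-flash-2"
    elif "ray-2" in model:
        model_key = "ray-2"
    elif "ray-1-6" in model:
        return 0.5
    else:
        return 0.79
    dur_key = "5s" if "5s" in duration else "9s" if "9s" in duration else ""
    res_key = next((r for r in ("4k", "1080p", "720p", "540p") if r in resolution), "")
    price = LUMA_VIDEO_PRICES[model_key].get(dur_key, {}).get(res_key)
    return price if price is not None else 0.79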
class LumaExtension(ComfyExtension):
@override
async def get_node_list(self) -> list[type[IO.ComfyNode]]:

View File

@@ -0,0 +1,790 @@
import os
from typing_extensions import override
from comfy_api.latest import IO, ComfyExtension, Input
from comfy_api_nodes.apis.meshy import (
InputShouldRemesh,
InputShouldTexture,
MeshyAnimationRequest,
MeshyAnimationResult,
MeshyImageToModelRequest,
MeshyModelResult,
MeshyMultiImageToModelRequest,
MeshyRefineTask,
MeshyRiggedResult,
MeshyRiggingRequest,
MeshyTaskResponse,
MeshyTextToModelRequest,
MeshyTextureRequest,
)
from comfy_api_nodes.util import (
ApiEndpoint,
download_url_to_bytesio,
poll_op,
sync_op,
upload_images_to_comfyapi,
validate_string,
)
from folder_paths import get_output_directory
class MeshyTextToModelNode(IO.ComfyNode):
@classmethod
def define_schema(cls):
return IO.Schema(
node_id="MeshyTextToModelNode",
display_name="Meshy: Text to Model",
category="api node/3d/Meshy",
inputs=[
IO.Combo.Input("model", options=["latest"]),
IO.String.Input("prompt", multiline=True, default=""),
IO.Combo.Input("style", options=["realistic", "sculpture"]),
IO.DynamicCombo.Input(
"should_remesh",
options=[
IO.DynamicCombo.Option(
"true",
[
IO.Combo.Input("topology", options=["triangle", "quad"]),
IO.Int.Input(
"target_polycount",
default=300000,
min=100,
max=300000,
display_mode=IO.NumberDisplay.number,
),
],
),
IO.DynamicCombo.Option("false", []),
],
tooltip="When set to false, returns an unprocessed triangular mesh.",
),
IO.Combo.Input("symmetry_mode", options=["auto", "on", "off"]),
IO.Combo.Input(
"pose_mode",
options=["", "A-pose", "T-pose"],
tooltip="Specify the pose mode for the generated model.",
),
IO.Int.Input(
"seed",
default=0,
min=0,
max=2147483647,
display_mode=IO.NumberDisplay.number,
control_after_generate=True,
tooltip="Seed controls whether the node should re-run; "
"results are non-deterministic regardless of seed.",
),
],
outputs=[
IO.String.Output(display_name="model_file"),
IO.Custom("MESHY_TASK_ID").Output(display_name="meshy_task_id"),
],
hidden=[
IO.Hidden.auth_token_comfy_org,
IO.Hidden.api_key_comfy_org,
IO.Hidden.unique_id,
],
is_api_node=True,
is_output_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.8}""",
),
)
@classmethod
async def execute(
cls,
model: str,
prompt: str,
style: str,
should_remesh: InputShouldRemesh,
symmetry_mode: str,
pose_mode: str,
seed: int,
) -> IO.NodeOutput:
validate_string(prompt, field_name="prompt", min_length=1, max_length=600)
response = await sync_op(
cls,
ApiEndpoint(path="/proxy/meshy/openapi/v2/text-to-3d", method="POST"),
response_model=MeshyTaskResponse,
data=MeshyTextToModelRequest(
prompt=prompt,
art_style=style,
ai_model=model,
topology=should_remesh.get("topology", None),
target_polycount=should_remesh.get("target_polycount", None),
should_remesh=should_remesh["should_remesh"] == "true",
symmetry_mode=symmetry_mode,
pose_mode=pose_mode.lower(),
seed=seed,
),
)
result = await poll_op(
cls,
ApiEndpoint(path=f"/proxy/meshy/openapi/v2/text-to-3d/{response.result}"),
response_model=MeshyModelResult,
status_extractor=lambda r: r.status,
progress_extractor=lambda r: r.progress,
)
model_file = f"meshy_model_{response.result}.glb"
await download_url_to_bytesio(result.model_urls.glb, os.path.join(get_output_directory(), model_file))
return IO.NodeOutput(model_file, response.result)
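# Editorial note, not part of this change: from the accessors above
# (should_remesh["should_remesh"], .get("topology"), .get("target_polycount")),
# the DynamicCombo value appears to arrive in execute() as a flat dict keyed by
# the combo name plus its option-specific inputs. Assumed shape, for illustration:
example_should_remesh: dict = {
    "should_remesh": "true",       # selected option
    "topology": "quad",            # only present when the "true" option is selected
    "target_polycount": 300000,
}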
class MeshyRefineNode(IO.ComfyNode):
@classmethod
def define_schema(cls):
return IO.Schema(
node_id="MeshyRefineNode",
display_name="Meshy: Refine Draft Model",
category="api node/3d/Meshy",
description="Refine a previously created draft model.",
inputs=[
IO.Combo.Input("model", options=["latest"]),
IO.Custom("MESHY_TASK_ID").Input("meshy_task_id"),
IO.Boolean.Input(
"enable_pbr",
default=False,
tooltip="Generate PBR Maps (metallic, roughness, normal) in addition to the base color. "
"Note: this should be set to false when using Sculpture style, "
"as Sculpture style generates its own set of PBR maps.",
),
IO.String.Input(
"texture_prompt",
default="",
multiline=True,
tooltip="Provide a text prompt to guide the texturing process. "
"Maximum 600 characters. Cannot be used at the same time as 'texture_image'.",
),
IO.Image.Input(
"texture_image",
tooltip="Only one of 'texture_image' or 'texture_prompt' may be used at the same time.",
optional=True,
),
],
outputs=[
IO.String.Output(display_name="model_file"),
IO.Custom("MESHY_TASK_ID").Output(display_name="meshy_task_id"),
],
hidden=[
IO.Hidden.auth_token_comfy_org,
IO.Hidden.api_key_comfy_org,
IO.Hidden.unique_id,
],
is_api_node=True,
is_output_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.4}""",
),
)
@classmethod
async def execute(
cls,
model: str,
meshy_task_id: str,
enable_pbr: bool,
texture_prompt: str,
texture_image: Input.Image | None = None,
) -> IO.NodeOutput:
if texture_prompt and texture_image is not None:
raise ValueError("texture_prompt and texture_image cannot be used at the same time")
texture_image_url = None
if texture_prompt:
validate_string(texture_prompt, field_name="texture_prompt", max_length=600)
if texture_image is not None:
texture_image_url = (await upload_images_to_comfyapi(cls, texture_image, wait_label="Uploading texture"))[0]
response = await sync_op(
cls,
endpoint=ApiEndpoint(path="/proxy/meshy/openapi/v2/text-to-3d", method="POST"),
response_model=MeshyTaskResponse,
data=MeshyRefineTask(
preview_task_id=meshy_task_id,
enable_pbr=enable_pbr,
texture_prompt=texture_prompt if texture_prompt else None,
texture_image_url=texture_image_url,
ai_model=model,
),
)
result = await poll_op(
cls,
ApiEndpoint(path=f"/proxy/meshy/openapi/v2/text-to-3d/{response.result}"),
response_model=MeshyModelResult,
status_extractor=lambda r: r.status,
progress_extractor=lambda r: r.progress,
)
model_file = f"meshy_model_{response.result}.glb"
await download_url_to_bytesio(result.model_urls.glb, os.path.join(get_output_directory(), model_file))
return IO.NodeOutput(model_file, response.result)
class MeshyImageToModelNode(IO.ComfyNode):
@classmethod
def define_schema(cls):
return IO.Schema(
node_id="MeshyImageToModelNode",
display_name="Meshy: Image to Model",
category="api node/3d/Meshy",
inputs=[
IO.Combo.Input("model", options=["latest"]),
IO.Image.Input("image"),
IO.DynamicCombo.Input(
"should_remesh",
options=[
IO.DynamicCombo.Option(
"true",
[
IO.Combo.Input("topology", options=["triangle", "quad"]),
IO.Int.Input(
"target_polycount",
default=300000,
min=100,
max=300000,
display_mode=IO.NumberDisplay.number,
),
],
),
IO.DynamicCombo.Option("false", []),
],
tooltip="When set to false, returns an unprocessed triangular mesh.",
),
IO.Combo.Input("symmetry_mode", options=["auto", "on", "off"]),
IO.DynamicCombo.Input(
"should_texture",
options=[
IO.DynamicCombo.Option(
"true",
[
IO.Boolean.Input(
"enable_pbr",
default=False,
tooltip="Generate PBR Maps (metallic, roughness, normal) "
"in addition to the base color.",
),
IO.String.Input(
"texture_prompt",
default="",
multiline=True,
tooltip="Provide a text prompt to guide the texturing process. "
"Maximum 600 characters. Cannot be used at the same time as 'texture_image'.",
),
IO.Image.Input(
"texture_image",
tooltip="Only one of 'texture_image' or 'texture_prompt' "
"may be used at the same time.",
optional=True,
),
],
),
IO.DynamicCombo.Option("false", []),
],
tooltip="Determines whether textures are generated. "
"Setting it to false skips the texture phase and returns a mesh without textures.",
),
IO.Combo.Input(
"pose_mode",
options=["", "A-pose", "T-pose"],
tooltip="Specify the pose mode for the generated model.",
),
IO.Int.Input(
"seed",
default=0,
min=0,
max=2147483647,
display_mode=IO.NumberDisplay.number,
control_after_generate=True,
tooltip="Seed controls whether the node should re-run; "
"results are non-deterministic regardless of seed.",
),
],
outputs=[
IO.String.Output(display_name="model_file"),
IO.Custom("MESHY_TASK_ID").Output(display_name="meshy_task_id"),
],
hidden=[
IO.Hidden.auth_token_comfy_org,
IO.Hidden.api_key_comfy_org,
IO.Hidden.unique_id,
],
is_api_node=True,
is_output_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["should_texture"]),
expr="""
(
$prices := {"true": 1.2, "false": 0.8};
{"type":"usd","usd": $lookup($prices, widgets.should_texture)}
)
""",
),
)
@classmethod
async def execute(
cls,
model: str,
image: Input.Image,
should_remesh: InputShouldRemesh,
symmetry_mode: str,
should_texture: InputShouldTexture,
pose_mode: str,
seed: int,
) -> IO.NodeOutput:
texture = should_texture["should_texture"] == "true"
texture_image_url = texture_prompt = None
if texture:
if should_texture["texture_prompt"] and should_texture["texture_image"] is not None:
raise ValueError("texture_prompt and texture_image cannot be used at the same time")
if should_texture["texture_prompt"]:
validate_string(should_texture["texture_prompt"], field_name="texture_prompt", max_length=600)
texture_prompt = should_texture["texture_prompt"]
if should_texture["texture_image"] is not None:
texture_image_url = (
await upload_images_to_comfyapi(
cls, should_texture["texture_image"], wait_label="Uploading texture"
)
)[0]
response = await sync_op(
cls,
ApiEndpoint(path="/proxy/meshy/openapi/v1/image-to-3d", method="POST"),
response_model=MeshyTaskResponse,
data=MeshyImageToModelRequest(
image_url=(await upload_images_to_comfyapi(cls, image, wait_label="Uploading base image"))[0],
ai_model=model,
topology=should_remesh.get("topology", None),
target_polycount=should_remesh.get("target_polycount", None),
symmetry_mode=symmetry_mode,
should_remesh=should_remesh["should_remesh"] == "true",
should_texture=texture,
enable_pbr=should_texture.get("enable_pbr", None),
pose_mode=pose_mode.lower(),
texture_prompt=texture_prompt,
texture_image_url=texture_image_url,
seed=seed,
),
)
result = await poll_op(
cls,
ApiEndpoint(path=f"/proxy/meshy/openapi/v1/image-to-3d/{response.result}"),
response_model=MeshyModelResult,
status_extractor=lambda r: r.status,
progress_extractor=lambda r: r.progress,
)
model_file = f"meshy_model_{response.result}.glb"
await download_url_to_bytesio(result.model_urls.glb, os.path.join(get_output_directory(), model_file))
return IO.NodeOutput(model_file, response.result)
class MeshyMultiImageToModelNode(IO.ComfyNode):
@classmethod
def define_schema(cls):
return IO.Schema(
node_id="MeshyMultiImageToModelNode",
display_name="Meshy: Multi-Image to Model",
category="api node/3d/Meshy",
inputs=[
IO.Combo.Input("model", options=["latest"]),
IO.Autogrow.Input(
"images",
template=IO.Autogrow.TemplatePrefix(IO.Image.Input("image"), prefix="image", min=2, max=4),
),
IO.DynamicCombo.Input(
"should_remesh",
options=[
IO.DynamicCombo.Option(
"true",
[
IO.Combo.Input("topology", options=["triangle", "quad"]),
IO.Int.Input(
"target_polycount",
default=300000,
min=100,
max=300000,
display_mode=IO.NumberDisplay.number,
),
],
),
IO.DynamicCombo.Option("false", []),
],
tooltip="When set to false, returns an unprocessed triangular mesh.",
),
IO.Combo.Input("symmetry_mode", options=["auto", "on", "off"]),
IO.DynamicCombo.Input(
"should_texture",
options=[
IO.DynamicCombo.Option(
"true",
[
IO.Boolean.Input(
"enable_pbr",
default=False,
tooltip="Generate PBR Maps (metallic, roughness, normal) "
"in addition to the base color.",
),
IO.String.Input(
"texture_prompt",
default="",
multiline=True,
tooltip="Provide a text prompt to guide the texturing process. "
"Maximum 600 characters. Cannot be used at the same time as 'texture_image'.",
),
IO.Image.Input(
"texture_image",
tooltip="Only one of 'texture_image' or 'texture_prompt' "
"may be used at the same time.",
optional=True,
),
],
),
IO.DynamicCombo.Option("false", []),
],
tooltip="Determines whether textures are generated. "
"Setting it to false skips the texture phase and returns a mesh without textures.",
),
IO.Combo.Input(
"pose_mode",
options=["", "A-pose", "T-pose"],
tooltip="Specify the pose mode for the generated model.",
),
IO.Int.Input(
"seed",
default=0,
min=0,
max=2147483647,
display_mode=IO.NumberDisplay.number,
control_after_generate=True,
tooltip="Seed controls whether the node should re-run; "
"results are non-deterministic regardless of seed.",
),
],
outputs=[
IO.String.Output(display_name="model_file"),
IO.Custom("MESHY_TASK_ID").Output(display_name="meshy_task_id"),
],
hidden=[
IO.Hidden.auth_token_comfy_org,
IO.Hidden.api_key_comfy_org,
IO.Hidden.unique_id,
],
is_api_node=True,
is_output_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["should_texture"]),
expr="""
(
$prices := {"true": 0.6, "false": 0.2};
{"type":"usd","usd": $lookup($prices, widgets.should_texture)}
)
""",
),
)
@classmethod
async def execute(
cls,
model: str,
images: IO.Autogrow.Type,
should_remesh: InputShouldRemesh,
symmetry_mode: str,
should_texture: InputShouldTexture,
pose_mode: str,
seed: int,
) -> IO.NodeOutput:
texture = should_texture["should_texture"] == "true"
texture_image_url = texture_prompt = None
if texture:
if should_texture["texture_prompt"] and should_texture["texture_image"] is not None:
raise ValueError("texture_prompt and texture_image cannot be used at the same time")
if should_texture["texture_prompt"]:
validate_string(should_texture["texture_prompt"], field_name="texture_prompt", max_length=600)
texture_prompt = should_texture["texture_prompt"]
if should_texture["texture_image"] is not None:
texture_image_url = (
await upload_images_to_comfyapi(
cls, should_texture["texture_image"], wait_label="Uploading texture"
)
)[0]
response = await sync_op(
cls,
ApiEndpoint(path="/proxy/meshy/openapi/v1/multi-image-to-3d", method="POST"),
response_model=MeshyTaskResponse,
data=MeshyMultiImageToModelRequest(
image_urls=await upload_images_to_comfyapi(
cls, list(images.values()), wait_label="Uploading base images"
),
ai_model=model,
topology=should_remesh.get("topology", None),
target_polycount=should_remesh.get("target_polycount", None),
symmetry_mode=symmetry_mode,
should_remesh=should_remesh["should_remesh"] == "true",
should_texture=texture,
enable_pbr=should_texture.get("enable_pbr", None),
pose_mode=pose_mode.lower(),
texture_prompt=texture_prompt,
texture_image_url=texture_image_url,
seed=seed,
),
)
result = await poll_op(
cls,
ApiEndpoint(path=f"/proxy/meshy/openapi/v1/multi-image-to-3d/{response.result}"),
response_model=MeshyModelResult,
status_extractor=lambda r: r.status,
progress_extractor=lambda r: r.progress,
)
model_file = f"meshy_model_{response.result}.glb"
await download_url_to_bytesio(result.model_urls.glb, os.path.join(get_output_directory(), model_file))
return IO.NodeOutput(model_file, response.result)
class MeshyRigModelNode(IO.ComfyNode):
@classmethod
def define_schema(cls):
return IO.Schema(
node_id="MeshyRigModelNode",
display_name="Meshy: Rig Model",
category="api node/3d/Meshy",
description="Provides a rigged character in standard formats. "
"Auto-rigging is currently not suitable for untextured meshes, non-humanoid assets, "
"or humanoid assets with unclear limb and body structure.",
inputs=[
IO.Custom("MESHY_TASK_ID").Input("meshy_task_id"),
IO.Float.Input(
"height_meters",
min=0.1,
max=15.0,
default=1.7,
tooltip="The approximate height of the character model in meters. "
"This aids in scaling and rigging accuracy.",
),
IO.Image.Input(
"texture_image",
tooltip="The model's UV-unwrapped base color texture image.",
optional=True,
),
],
outputs=[
IO.String.Output(display_name="model_file"),
IO.Custom("MESHY_RIGGED_TASK_ID").Output(display_name="rig_task_id"),
],
hidden=[
IO.Hidden.auth_token_comfy_org,
IO.Hidden.api_key_comfy_org,
IO.Hidden.unique_id,
],
is_api_node=True,
is_output_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.2}""",
),
)
@classmethod
async def execute(
cls,
meshy_task_id: str,
height_meters: float,
texture_image: Input.Image | None = None,
) -> IO.NodeOutput:
texture_image_url = None
if texture_image is not None:
texture_image_url = (await upload_images_to_comfyapi(cls, texture_image, wait_label="Uploading texture"))[0]
response = await sync_op(
cls,
endpoint=ApiEndpoint(path="/proxy/meshy/openapi/v1/rigging", method="POST"),
response_model=MeshyTaskResponse,
data=MeshyRiggingRequest(
input_task_id=meshy_task_id,
height_meters=height_meters,
texture_image_url=texture_image_url,
),
)
result = await poll_op(
cls,
ApiEndpoint(path=f"/proxy/meshy/openapi/v1/rigging/{response.result}"),
response_model=MeshyRiggedResult,
status_extractor=lambda r: r.status,
progress_extractor=lambda r: r.progress,
)
model_file = f"meshy_model_{response.result}.glb"
await download_url_to_bytesio(
result.result.rigged_character_glb_url, os.path.join(get_output_directory(), model_file)
)
return IO.NodeOutput(model_file, response.result)
class MeshyAnimateModelNode(IO.ComfyNode):
@classmethod
def define_schema(cls):
return IO.Schema(
node_id="MeshyAnimateModelNode",
display_name="Meshy: Animate Model",
category="api node/3d/Meshy",
description="Apply a specific animation action to a previously rigged character.",
inputs=[
IO.Custom("MESHY_RIGGED_TASK_ID").Input("rig_task_id"),
IO.Int.Input(
"action_id",
default=0,
min=0,
max=696,
tooltip="Visit https://docs.meshy.ai/en/api/animation-library for a list of available values.",
),
],
outputs=[
IO.String.Output(display_name="model_file"),
],
hidden=[
IO.Hidden.auth_token_comfy_org,
IO.Hidden.api_key_comfy_org,
IO.Hidden.unique_id,
],
is_api_node=True,
is_output_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.12}""",
),
)
@classmethod
async def execute(
cls,
rig_task_id: str,
action_id: int,
) -> IO.NodeOutput:
response = await sync_op(
cls,
endpoint=ApiEndpoint(path="/proxy/meshy/openapi/v1/animations", method="POST"),
response_model=MeshyTaskResponse,
data=MeshyAnimationRequest(
rig_task_id=rig_task_id,
action_id=action_id,
),
)
result = await poll_op(
cls,
ApiEndpoint(path=f"/proxy/meshy/openapi/v1/animations/{response.result}"),
response_model=MeshyAnimationResult,
status_extractor=lambda r: r.status,
progress_extractor=lambda r: r.progress,
)
model_file = f"meshy_model_{response.result}.glb"
await download_url_to_bytesio(result.result.animation_glb_url, os.path.join(get_output_directory(), model_file))
return IO.NodeOutput(model_file, response.result)
class MeshyTextureNode(IO.ComfyNode):
@classmethod
def define_schema(cls):
return IO.Schema(
node_id="MeshyTextureNode",
display_name="Meshy: Texture Model",
category="api node/3d/Meshy",
inputs=[
IO.Combo.Input("model", options=["latest"]),
IO.Custom("MESHY_TASK_ID").Input("meshy_task_id"),
IO.Boolean.Input(
"enable_original_uv",
default=True,
tooltip="Use the original UV of the model instead of generating new UVs. "
"When enabled, Meshy preserves existing textures from the uploaded model. "
"If the model has no original UV, the quality of the output might not be as good.",
),
IO.Boolean.Input("pbr", default=False),
IO.String.Input(
"text_style_prompt",
default="",
multiline=True,
tooltip="Describe your desired texture style of the object using text. Maximum 600 characters."
"Maximum 600 characters. Cannot be used at the same time as 'image_style'.",
),
IO.Image.Input(
"image_style",
optional=True,
tooltip="A 2d image to guide the texturing process. "
"Can not be used at the same time with 'text_style_prompt'.",
),
],
outputs=[
IO.String.Output(display_name="model_file"),
IO.Custom("MODEL_TASK_ID").Output(display_name="meshy_task_id"),
],
hidden=[
IO.Hidden.auth_token_comfy_org,
IO.Hidden.api_key_comfy_org,
IO.Hidden.unique_id,
],
is_api_node=True,
is_output_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.4}""",
),
)
@classmethod
async def execute(
cls,
model: str,
meshy_task_id: str,
enable_original_uv: bool,
pbr: bool,
text_style_prompt: str,
image_style: Input.Image | None = None,
) -> IO.NodeOutput:
if text_style_prompt and image_style is not None:
raise ValueError("text_style_prompt and image_style cannot be used at the same time")
if not text_style_prompt and image_style is None:
raise ValueError("Either text_style_prompt or image_style is required")
image_style_url = None
if image_style is not None:
image_style_url = (await upload_images_to_comfyapi(cls, image_style, wait_label="Uploading style"))[0]
response = await sync_op(
cls,
endpoint=ApiEndpoint(path="/proxy/meshy/openapi/v1/retexture", method="POST"),
response_model=MeshyTaskResponse,
data=MeshyTextureRequest(
input_task_id=meshy_task_id,
ai_model=model,
enable_original_uv=enable_original_uv,
enable_pbr=pbr,
text_style_prompt=text_style_prompt if text_style_prompt else None,
image_style_url=image_style_url,
),
)
result = await poll_op(
cls,
ApiEndpoint(path=f"/proxy/meshy/openapi/v1/retexture/{response.result}"),
response_model=MeshyModelResult,
status_extractor=lambda r: r.status,
progress_extractor=lambda r: r.progress,
)
model_file = f"meshy_model_{response.result}.glb"
await download_url_to_bytesio(result.model_urls.glb, os.path.join(get_output_directory(), model_file))
return IO.NodeOutput(model_file, response.result)
class MeshyExtension(ComfyExtension):
@override
async def get_node_list(self) -> list[type[IO.ComfyNode]]:
return [
MeshyTextToModelNode,
MeshyRefineNode,
MeshyImageToModelNode,
MeshyMultiImageToModelNode,
MeshyRigModelNode,
MeshyAnimateModelNode,
MeshyTextureNode,
]
async def comfy_entrypoint() -> MeshyExtension:
return MeshyExtension()

View File

@@ -134,6 +134,9 @@ class MinimaxTextToVideoNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.43}""",
),
)
@classmethod
@@ -197,6 +200,9 @@ class MinimaxImageToVideoNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.43}""",
),
)
@classmethod
@@ -340,6 +346,20 @@ class MinimaxHailuoVideoNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["resolution", "duration"]),
expr="""
(
$prices := {
"768p": {"6": 0.28, "10": 0.56},
"1080p": {"6": 0.49}
};
$resPrices := $lookup($prices, $lowercase(widgets.resolution));
$price := $lookup($resPrices, $string(widgets.duration));
{"type":"usd","usd": $price ? $price : 0.43}
)
""",
),
)
@classmethod

View File

@@ -233,6 +233,10 @@ class MoonvalleyImg2VideoNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(),
expr="""{"type":"usd","usd": 1.5}""",
),
)
@classmethod
@@ -351,6 +355,10 @@ class MoonvalleyVideo2VideoNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(),
expr="""{"type":"usd","usd": 2.25}""",
),
)
@classmethod
@@ -471,6 +479,10 @@ class MoonvalleyTxt2VideoNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(),
expr="""{"type":"usd","usd": 1.5}""",
),
)
@classmethod

View File

@@ -160,6 +160,23 @@ class OpenAIDalle2(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["size", "n"]),
expr="""
(
$size := widgets.size;
$nRaw := widgets.n;
$n := ($nRaw != null and $nRaw != 0) ? $nRaw : 1;
$base :=
$contains($size, "256x256") ? 0.016 :
$contains($size, "512x512") ? 0.018 :
0.02;
{"type":"usd","usd": $round($base * $n, 3)}
)
""",
),
)
@classmethod
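# Editorial sketch, not part of this change: plain-Python mirror of the
# OpenAIDalle2 badge above (n of 0/None is treated as 1, rounded to 3 decimals).
def dalle2_price(size: str, n: int | None) -> float:
    n = n or 1
    base = 0.016 if "256x256" in size else 0.018 if "512x512" in size else 0.02
    return round(base * n, 3)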
@@ -287,6 +304,25 @@ class OpenAIDalle3(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["size", "quality"]),
expr="""
(
$size := widgets.size;
$q := widgets.quality;
$hd := $contains($q, "hd");
$price :=
$contains($size, "1024x1024")
? ($hd ? 0.08 : 0.04)
: (($contains($size, "1792x1024") or $contains($size, "1024x1792"))
? ($hd ? 0.12 : 0.08)
: 0.04);
{"type":"usd","usd": $price}
)
""",
),
)
@classmethod
@@ -411,6 +447,28 @@ class OpenAIGPTImage1(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["quality", "n"]),
expr="""
(
$ranges := {
"low": [0.011, 0.02],
"medium": [0.046, 0.07],
"high": [0.167, 0.3]
};
$range := $lookup($ranges, widgets.quality);
$n := widgets.n;
($n = 1)
? {"type":"range_usd","min_usd": $range[0], "max_usd": $range[1]}
: {
"type":"range_usd",
"min_usd": $range[0],
"max_usd": $range[1],
"format": { "suffix": " x " & $string($n) & "/Run" }
}
)
""",
),
)
@classmethod
@@ -566,6 +624,75 @@ class OpenAIChatNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["model"]),
expr="""
(
$m := widgets.model;
$contains($m, "o4-mini") ? {
"type": "list_usd",
"usd": [0.0011, 0.0044],
"format": { "approximate": true, "separator": "-", "suffix": " per 1K tokens" }
}
: $contains($m, "o1-pro") ? {
"type": "list_usd",
"usd": [0.15, 0.6],
"format": { "approximate": true, "separator": "-", "suffix": " per 1K tokens" }
}
: $contains($m, "o1") ? {
"type": "list_usd",
"usd": [0.015, 0.06],
"format": { "approximate": true, "separator": "-", "suffix": " per 1K tokens" }
}
: $contains($m, "o3-mini") ? {
"type": "list_usd",
"usd": [0.0011, 0.0044],
"format": { "approximate": true, "separator": "-", "suffix": " per 1K tokens" }
}
: $contains($m, "o3") ? {
"type": "list_usd",
"usd": [0.01, 0.04],
"format": { "approximate": true, "separator": "-", "suffix": " per 1K tokens" }
}
: $contains($m, "gpt-4o") ? {
"type": "list_usd",
"usd": [0.0025, 0.01],
"format": { "approximate": true, "separator": "-", "suffix": " per 1K tokens" }
}
: $contains($m, "gpt-4.1-nano") ? {
"type": "list_usd",
"usd": [0.0001, 0.0004],
"format": { "approximate": true, "separator": "-", "suffix": " per 1K tokens" }
}
: $contains($m, "gpt-4.1-mini") ? {
"type": "list_usd",
"usd": [0.0004, 0.0016],
"format": { "approximate": true, "separator": "-", "suffix": " per 1K tokens" }
}
: $contains($m, "gpt-4.1") ? {
"type": "list_usd",
"usd": [0.002, 0.008],
"format": { "approximate": true, "separator": "-", "suffix": " per 1K tokens" }
}
: $contains($m, "gpt-5-nano") ? {
"type": "list_usd",
"usd": [0.00005, 0.0004],
"format": { "approximate": true, "separator": "-", "suffix": " per 1K tokens" }
}
: $contains($m, "gpt-5-mini") ? {
"type": "list_usd",
"usd": [0.00025, 0.002],
"format": { "approximate": true, "separator": "-", "suffix": " per 1K tokens" }
}
: $contains($m, "gpt-5") ? {
"type": "list_usd",
"usd": [0.00125, 0.01],
"format": { "approximate": true, "separator": "-", "suffix": " per 1K tokens" }
}
: {"type": "text", "text": "Token-based"}
)
""",
),
)
@classmethod

View File

@@ -128,6 +128,7 @@ class PixverseTextToVideoNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=PRICE_BADGE_VIDEO,
)
@classmethod
@@ -242,6 +243,7 @@ class PixverseImageToVideoNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=PRICE_BADGE_VIDEO,
)
@classmethod
@@ -355,6 +357,7 @@ class PixverseTransitionVideoNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=PRICE_BADGE_VIDEO,
)
@classmethod
@@ -416,6 +419,33 @@ class PixverseTransitionVideoNode(IO.ComfyNode):
return IO.NodeOutput(await download_url_to_video_output(response_poll.Resp.url))
PRICE_BADGE_VIDEO = IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["duration_seconds", "quality", "motion_mode"]),
expr="""
(
$prices := {
"5": {
"1080p": {"normal": 1.2, "fast": 1.2},
"720p": {"normal": 0.6, "fast": 1.2},
"540p": {"normal": 0.45, "fast": 0.9},
"360p": {"normal": 0.45, "fast": 0.9}
},
"8": {
"1080p": {"normal": 1.2, "fast": 1.2},
"720p": {"normal": 1.2, "fast": 1.2},
"540p": {"normal": 0.9, "fast": 1.2},
"360p": {"normal": 0.9, "fast": 1.2}
}
};
$durPrices := $lookup($prices, $string(widgets.duration_seconds));
$qualityPrices := $lookup($durPrices, widgets.quality);
$price := $lookup($qualityPrices, widgets.motion_mode);
{"type":"usd","usd": $price ? $price : 0.9}
)
""",
)
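# Editorial sketch, not part of this change: plain-Python mirror of
# PRICE_BADGE_VIDEO above; any missing combination falls back to 0.9.
PIXVERSE_PRICES = {
    "5": {
        "1080p": {"normal": 1.2, "fast": 1.2},
        "720p": {"normal": 0.6, "fast": 1.2},
        "540p": {"normal": 0.45, "fast": 0.9},
        "360p": {"normal": 0.45, "fast": 0.9},
    },
    "8": {
        "1080p": {"normal": 1.2, "fast": 1.2},
        "720p": {"normal": 1.2, "fast": 1.2},
        "540p": {"normal": 0.9, "fast": 1.2},
        "360p": {"normal": 0.9, "fast": 1.2},
    },
}

def pixverse_price(duration_seconds: int, quality: str, motion_mode: str) -> float:
    per_quality = PIXVERSE_PRICES.get(str(duration_seconds), {}).get(quality, {})
    return per_quality.get(motion_mode, 0.9)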
class PixVerseExtension(ComfyExtension):
@override
async def get_node_list(self) -> list[type[IO.ComfyNode]]:

View File

@@ -378,6 +378,10 @@ class RecraftTextToImageNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["n"]),
expr="""{"type":"usd","usd": $round(0.04 * widgets.n, 2)}""",
),
)
@classmethod
@@ -490,6 +494,10 @@ class RecraftImageToImageNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["n"]),
expr="""{"type":"usd","usd": $round(0.04 * widgets.n, 2)}""",
),
)
@classmethod
@@ -591,6 +599,10 @@ class RecraftImageInpaintingNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["n"]),
expr="""{"type":"usd","usd": $round(0.04 * widgets.n, 2)}""",
),
)
@classmethod
@@ -692,6 +704,10 @@ class RecraftTextToVectorNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["n"]),
expr="""{"type":"usd","usd": $round(0.08 * widgets.n, 2)}""",
),
)
@classmethod
@@ -759,6 +775,10 @@ class RecraftVectorizeImageNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(),
expr="""{"type":"usd","usd": 0.01}""",
),
)
@classmethod
@@ -817,6 +837,9 @@ class RecraftReplaceBackgroundNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.04}""",
),
)
@classmethod
@@ -883,6 +906,9 @@ class RecraftRemoveBackgroundNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.01}""",
),
)
@classmethod
@@ -929,6 +955,9 @@ class RecraftCrispUpscaleNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.004}""",
),
)
@classmethod
@@ -972,6 +1001,9 @@ class RecraftCreativeUpscaleNode(RecraftCrispUpscaleNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.25}""",
),
)

View File

@@ -241,6 +241,9 @@ class Rodin3D_Regular(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.4}""",
),
)
@classmethod
@@ -294,6 +297,9 @@ class Rodin3D_Detail(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.4}""",
),
)
@classmethod
@@ -347,6 +353,9 @@ class Rodin3D_Smooth(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.4}""",
),
)
@classmethod
@@ -406,6 +415,9 @@ class Rodin3D_Sketch(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.4}""",
),
)
@classmethod

View File

@@ -184,6 +184,10 @@ class RunwayImageToVideoNodeGen3a(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["duration"]),
expr="""{"type":"usd","usd": 0.0715 * widgets.duration}""",
),
)
@classmethod
@@ -274,6 +278,10 @@ class RunwayImageToVideoNodeGen4(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["duration"]),
expr="""{"type":"usd","usd": 0.0715 * widgets.duration}""",
),
)
@classmethod
@@ -372,6 +380,10 @@ class RunwayFirstLastFrameNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["duration"]),
expr="""{"type":"usd","usd": 0.0715 * widgets.duration}""",
),
)
@classmethod
@@ -457,6 +469,9 @@ class RunwayTextToImageNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.11}""",
),
)
@classmethod

View File

@@ -89,6 +89,24 @@ class OpenAIVideoSora2(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["model", "size", "duration"]),
expr="""
(
$m := widgets.model;
$size := widgets.size;
$dur := widgets.duration;
$isPro := $contains($m, "sora-2-pro");
$isSora2 := $contains($m, "sora-2");
$isProSize := ($size = "1024x1792" or $size = "1792x1024");
$perSec :=
$isPro ? ($isProSize ? 0.5 : 0.3) :
$isSora2 ? 0.1 :
($isProSize ? 0.5 : 0.1);
{"type":"usd","usd": $round($perSec * $dur, 2)}
)
""",
),
)
@classmethod

View File

@@ -127,6 +127,9 @@ class StabilityStableImageUltraNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.08}""",
),
)
@classmethod
@@ -264,6 +267,16 @@ class StabilityStableImageSD_3_5Node(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["model"]),
expr="""
(
$contains(widgets.model,"large")
? {"type":"usd","usd":0.065}
: {"type":"usd","usd":0.035}
)
""",
),
)
@classmethod
@@ -382,6 +395,9 @@ class StabilityUpscaleConservativeNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.25}""",
),
)
@classmethod
@@ -486,6 +502,9 @@ class StabilityUpscaleCreativeNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.25}""",
),
)
@classmethod
@@ -566,6 +585,9 @@ class StabilityUpscaleFastNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.01}""",
),
)
@classmethod
@@ -648,6 +670,9 @@ class StabilityTextToAudio(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.2}""",
),
)
@classmethod
@@ -732,6 +757,9 @@ class StabilityAudioToAudio(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.2}""",
),
)
@classmethod
@@ -828,6 +856,9 @@ class StabilityAudioInpaint(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.2}""",
),
)
@classmethod

View File

@@ -2,7 +2,6 @@ import builtins
from io import BytesIO
import aiohttp
import torch
from typing_extensions import override
from comfy_api.latest import IO, ComfyExtension, Input
@@ -138,7 +137,7 @@ class TopazImageEnhance(IO.ComfyNode):
async def execute(
cls,
model: str,
image: torch.Tensor,
image: Input.Image,
prompt: str = "",
subject_detection: str = "All",
face_enhancement: bool = True,
@@ -153,7 +152,9 @@ class TopazImageEnhance(IO.ComfyNode):
) -> IO.NodeOutput:
if get_number_of_images(image) != 1:
raise ValueError("Only one input image is supported.")
download_url = await upload_images_to_comfyapi(cls, image, max_images=1, mime_type="image/png")
download_url = await upload_images_to_comfyapi(
cls, image, max_images=1, mime_type="image/png", total_pixels=4096*4096
)
initial_response = await sync_op(
cls,
ApiEndpoint(path="/proxy/topaz/image/v1/enhance-gen/async", method="POST"),

View File

@@ -117,6 +117,38 @@ class TripoTextToModelNode(IO.ComfyNode):
],
is_api_node=True,
is_output_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(
widgets=[
"model_version",
"style",
"texture",
"pbr",
"quad",
"texture_quality",
"geometry_quality",
],
),
expr="""
(
$isV14 := $contains(widgets.model_version,"v1.4");
$style := widgets.style;
$hasStyle := ($style != "" and $style != "none");
$withTexture := widgets.texture or widgets.pbr;
$isHdTexture := (widgets.texture_quality = "detailed");
$isDetailedGeometry := (widgets.geometry_quality = "detailed");
$baseCredits :=
$isV14 ? 20 : ($withTexture ? 20 : 10);
$credits :=
$baseCredits
+ ($hasStyle ? 5 : 0)
+ (widgets.quad ? 5 : 0)
+ ($isHdTexture ? 10 : 0)
+ ($isDetailedGeometry ? 20 : 0);
{"type":"usd","usd": $round($credits * 0.01, 2)}
)
""",
),
)
@classmethod
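# Editorial sketch, not part of this change: the Tripo badge above prices in
# credits at $0.01 per credit; a plain-Python mirror of the text-to-model case:
def tripo_t2m_price(model_version: str, style: str, texture: bool, pbr: bool,
                    quad: bool, texture_quality: str, geometry_quality: str) -> float:
    credits = 20 if ("v1.4" in model_version or texture or pbr) else 10
    credits += 5 if style not in ("", "none") else 0
    credits += 5 if quad else 0
    credits += 10 if texture_quality == "detailed" else 0
    credits += 20 if geometry_quality == "detailed" else 0
    # e.g. v2.x, no style, textured, quad, detailed texture -> 20+5+10 = 35 credits = $0.35
    return round(credits * 0.01, 2)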
@@ -210,6 +242,38 @@ class TripoImageToModelNode(IO.ComfyNode):
],
is_api_node=True,
is_output_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(
widgets=[
"model_version",
"style",
"texture",
"pbr",
"quad",
"texture_quality",
"geometry_quality",
],
),
expr="""
(
$isV14 := $contains(widgets.model_version,"v1.4");
$style := widgets.style;
$hasStyle := ($style != "" and $style != "none");
$withTexture := widgets.texture or widgets.pbr;
$isHdTexture := (widgets.texture_quality = "detailed");
$isDetailedGeometry := (widgets.geometry_quality = "detailed");
$baseCredits :=
$isV14 ? 30 : ($withTexture ? 30 : 20);
$credits :=
$baseCredits
+ ($hasStyle ? 5 : 0)
+ (widgets.quad ? 5 : 0)
+ ($isHdTexture ? 10 : 0)
+ ($isDetailedGeometry ? 20 : 0);
{"type":"usd","usd": $round($credits * 0.01, 2)}
)
""",
),
)
@classmethod
@@ -314,6 +378,34 @@ class TripoMultiviewToModelNode(IO.ComfyNode):
],
is_api_node=True,
is_output_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(
widgets=[
"model_version",
"texture",
"pbr",
"quad",
"texture_quality",
"geometry_quality",
],
),
expr="""
(
$isV14 := $contains(widgets.model_version,"v1.4");
$withTexture := widgets.texture or widgets.pbr;
$isHdTexture := (widgets.texture_quality = "detailed");
$isDetailedGeometry := (widgets.geometry_quality = "detailed");
$baseCredits :=
$isV14 ? 30 : ($withTexture ? 30 : 20);
$credits :=
$baseCredits
+ (widgets.quad ? 5 : 0)
+ ($isHdTexture ? 10 : 0)
+ ($isDetailedGeometry ? 20 : 0);
{"type":"usd","usd": $round($credits * 0.01, 2)}
)
""",
),
)
@classmethod
@@ -405,6 +497,15 @@ class TripoTextureNode(IO.ComfyNode):
],
is_api_node=True,
is_output_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["texture_quality"]),
expr="""
(
$tq := widgets.texture_quality;
{"type":"usd","usd": ($contains($tq,"detailed") ? 0.2 : 0.1)}
)
""",
),
)
@classmethod
@@ -456,6 +557,9 @@ class TripoRefineNode(IO.ComfyNode):
],
is_api_node=True,
is_output_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.3}""",
),
)
@classmethod
@@ -489,6 +593,9 @@ class TripoRigNode(IO.ComfyNode):
],
is_api_node=True,
is_output_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.25}""",
),
)
@classmethod
@@ -545,6 +652,9 @@ class TripoRetargetNode(IO.ComfyNode):
],
is_api_node=True,
is_output_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.1}""",
),
)
@classmethod
@@ -638,6 +748,60 @@ class TripoConversionNode(IO.ComfyNode):
],
is_api_node=True,
is_output_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(
widgets=[
"quad",
"face_limit",
"texture_size",
"texture_format",
"force_symmetry",
"flatten_bottom",
"flatten_bottom_threshold",
"pivot_to_center_bottom",
"scale_factor",
"with_animation",
"pack_uv",
"bake",
"part_names",
"fbx_preset",
"export_vertex_colors",
"export_orientation",
"animate_in_place",
],
),
expr="""
(
$face := (widgets.face_limit != null) ? widgets.face_limit : -1;
$texSize := (widgets.texture_size != null) ? widgets.texture_size : 4096;
$flatThresh := (widgets.flatten_bottom_threshold != null) ? widgets.flatten_bottom_threshold : 0;
$scale := (widgets.scale_factor != null) ? widgets.scale_factor : 1;
$texFmt := (widgets.texture_format != "" ? widgets.texture_format : "jpeg");
$part := widgets.part_names;
$fbx := (widgets.fbx_preset != "" ? widgets.fbx_preset : "blender");
$orient := (widgets.export_orientation != "" ? widgets.export_orientation : "default");
$advanced :=
widgets.quad or
widgets.force_symmetry or
widgets.flatten_bottom or
widgets.pivot_to_center_bottom or
widgets.with_animation or
widgets.pack_uv or
widgets.bake or
widgets.export_vertex_colors or
widgets.animate_in_place or
($face != -1) or
($texSize != 4096) or
($flatThresh != 0) or
($scale != 1) or
($texFmt != "jpeg") or
($part != "") or
($fbx != "blender") or
($orient != "default");
{"type":"usd","usd": ($advanced ? 0.1 : 0.05)}
)
""",
),
)
@classmethod

View File

@@ -122,6 +122,10 @@ class VeoVideoGenerationNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["duration_seconds"]),
expr="""{"type":"usd","usd": 0.5 * widgets.duration_seconds}""",
),
)
@classmethod
@@ -347,6 +351,20 @@ class Veo3VideoGenerationNode(VeoVideoGenerationNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["model", "generate_audio"]),
expr="""
(
$m := widgets.model;
$a := widgets.generate_audio;
($contains($m,"veo-3.0-fast-generate-001") or $contains($m,"veo-3.1-fast-generate"))
? {"type":"usd","usd": ($a ? 1.2 : 0.8)}
: ($contains($m,"veo-3.0-generate-001") or $contains($m,"veo-3.1-generate"))
? {"type":"usd","usd": ($a ? 3.2 : 1.6)}
: {"type":"range_usd","min_usd":0.8,"max_usd":3.2}
)
""",
),
)
@@ -420,6 +438,30 @@ class Veo3FirstLastFrameNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["model", "generate_audio", "duration"]),
expr="""
(
$prices := {
"veo-3.1-fast-generate": { "audio": 0.15, "no_audio": 0.10 },
"veo-3.1-generate": { "audio": 0.40, "no_audio": 0.20 }
};
$m := widgets.model;
$ga := (widgets.generate_audio = "true");
$seconds := widgets.duration;
$modelKey :=
$contains($m, "veo-3.1-fast-generate") ? "veo-3.1-fast-generate" :
$contains($m, "veo-3.1-generate") ? "veo-3.1-generate" :
"";
$audioKey := $ga ? "audio" : "no_audio";
$modelPrices := $lookup($prices, $modelKey);
$pps := $lookup($modelPrices, $audioKey);
($pps != null)
? {"type":"usd","usd": $pps * $seconds}
: {"type":"range_usd","min_usd": 0.4, "max_usd": 3.2}
)
""",
),
)
@classmethod
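# Editorial sketch, not part of this change: plain-Python mirror of the
# Veo3FirstLastFrameNode badge above (per-second price, range fallback for
# unmatched models). Note the expression compares generate_audio to the
# string "true", so this mirror takes a string too.
VEO31_PER_SECOND = {
    "veo-3.1-fast-generate": {"audio": 0.15, "no_audio": 0.10},
    "veo-3.1-generate": {"audio": 0.40, "no_audio": 0.20},
}

def veo3_flf_price(model: str, generate_audio: str, duration: float):
    if "veo-3.1-fast-generate" in model:
        key = "veo-3.1-fast-generate"
    elif "veo-3.1-generate" in model:
        key = "veo-3.1-generate"
    else:
        return (0.4, 3.2)  # range fallback, min/max in USD
    audio_key = "audio" if generate_audio == "true" else "no_audio"
    return VEO31_PER_SECOND[key][audio_key] * duration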

View File

@@ -1,12 +1,13 @@
import logging
from enum import Enum
from typing import Literal, Optional, TypeVar
import torch
from pydantic import BaseModel, Field
from typing_extensions import override
from comfy_api.latest import IO, ComfyExtension
from comfy_api.latest import IO, ComfyExtension, Input
from comfy_api_nodes.apis.vidu import (
SubjectReference,
TaskCreationRequest,
TaskCreationResponse,
TaskResult,
TaskStatusResponse,
)
from comfy_api_nodes.util import (
ApiEndpoint,
download_url_to_video_output,
@@ -17,6 +18,7 @@ from comfy_api_nodes.util import (
validate_image_aspect_ratio,
validate_image_dimensions,
validate_images_aspect_ratio_closeness,
validate_string,
)
VIDU_TEXT_TO_VIDEO = "/proxy/vidu/text2video"
@@ -25,98 +27,33 @@ VIDU_REFERENCE_VIDEO = "/proxy/vidu/reference2video"
VIDU_START_END_VIDEO = "/proxy/vidu/start-end2video"
VIDU_GET_GENERATION_STATUS = "/proxy/vidu/tasks/%s/creations"
R = TypeVar("R")
class VideoModelName(str, Enum):
vidu_q1 = "viduq1"
class AspectRatio(str, Enum):
r_16_9 = "16:9"
r_9_16 = "9:16"
r_1_1 = "1:1"
class Resolution(str, Enum):
r_1080p = "1080p"
class MovementAmplitude(str, Enum):
auto = "auto"
small = "small"
medium = "medium"
large = "large"
class TaskCreationRequest(BaseModel):
model: VideoModelName = VideoModelName.vidu_q1
prompt: Optional[str] = Field(None, max_length=1500)
duration: Optional[Literal[5]] = 5
seed: Optional[int] = Field(0, ge=0, le=2147483647)
aspect_ratio: Optional[AspectRatio] = AspectRatio.r_16_9
resolution: Optional[Resolution] = Resolution.r_1080p
movement_amplitude: Optional[MovementAmplitude] = MovementAmplitude.auto
images: Optional[list[str]] = Field(None, description="Base64 encoded string or image URL")
class TaskCreationResponse(BaseModel):
task_id: str = Field(...)
state: str = Field(...)
created_at: str = Field(...)
code: Optional[int] = Field(None, description="Error code")
class TaskResult(BaseModel):
id: str = Field(..., description="Creation id")
url: str = Field(..., description="The URL of the generated results, valid for one hour")
cover_url: str = Field(..., description="The cover URL of the generated results, valid for one hour")
class TaskStatusResponse(BaseModel):
state: str = Field(...)
err_code: Optional[str] = Field(None)
creations: list[TaskResult] = Field(..., description="Generated results")
def get_video_url_from_response(response) -> Optional[str]:
if response.creations:
return response.creations[0].url
return None
def get_video_from_response(response) -> TaskResult:
if not response.creations:
error_msg = f"Vidu request does not contain results. State: {response.state}, Error Code: {response.err_code}"
logging.info(error_msg)
raise RuntimeError(error_msg)
logging.info("Vidu task %s succeeded. Video URL: %s", response.creations[0].id, response.creations[0].url)
return response.creations[0]
async def execute_task(
cls: type[IO.ComfyNode],
vidu_endpoint: str,
payload: TaskCreationRequest,
estimated_duration: int,
) -> R:
response = await sync_op(
) -> list[TaskResult]:
task_creation_response = await sync_op(
cls,
endpoint=ApiEndpoint(path=vidu_endpoint, method="POST"),
response_model=TaskCreationResponse,
data=payload,
)
if response.state == "failed":
error_msg = f"Vidu request failed. Code: {response.code}"
logging.error(error_msg)
raise RuntimeError(error_msg)
return await poll_op(
if task_creation_response.state == "failed":
raise RuntimeError(f"Vidu request failed. Code: {task_creation_response.code}")
response = await poll_op(
cls,
ApiEndpoint(path=VIDU_GET_GENERATION_STATUS % response.task_id),
ApiEndpoint(path=VIDU_GET_GENERATION_STATUS % task_creation_response.task_id),
response_model=TaskStatusResponse,
status_extractor=lambda r: r.state,
estimated_duration=estimated_duration,
progress_extractor=lambda r: r.progress,
max_poll_attempts=320,
)
if not response.creations:
raise RuntimeError(
f"Vidu request does not contain results. State: {response.state}, Error Code: {response.err_code}"
)
return response.creations
class ViduTextToVideoNode(IO.ComfyNode):
@@ -127,14 +64,9 @@ class ViduTextToVideoNode(IO.ComfyNode):
node_id="ViduTextToVideoNode",
display_name="Vidu Text To Video Generation",
category="api node/video/Vidu",
description="Generate video from text prompt",
description="Generate video from a text prompt",
inputs=[
IO.Combo.Input(
"model",
options=VideoModelName,
default=VideoModelName.vidu_q1,
tooltip="Model name",
),
IO.Combo.Input("model", options=["viduq1"], tooltip="Model name"),
IO.String.Input(
"prompt",
multiline=True,
@@ -163,22 +95,19 @@ class ViduTextToVideoNode(IO.ComfyNode):
),
IO.Combo.Input(
"aspect_ratio",
options=AspectRatio,
default=AspectRatio.r_16_9,
options=["16:9", "9:16", "1:1"],
tooltip="The aspect ratio of the output video",
optional=True,
),
IO.Combo.Input(
"resolution",
options=Resolution,
default=Resolution.r_1080p,
options=["1080p"],
tooltip="Supported values may vary by model & duration",
optional=True,
),
IO.Combo.Input(
"movement_amplitude",
options=MovementAmplitude,
default=MovementAmplitude.auto,
options=["auto", "small", "medium", "large"],
tooltip="The movement amplitude of objects in the frame",
optional=True,
),
@@ -192,6 +121,9 @@ class ViduTextToVideoNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.4}""",
),
)
@classmethod
@@ -208,7 +140,7 @@ class ViduTextToVideoNode(IO.ComfyNode):
if not prompt:
raise ValueError("The prompt field is required and cannot be empty.")
payload = TaskCreationRequest(
model_name=model,
model=model,
prompt=prompt,
duration=duration,
seed=seed,
@@ -216,8 +148,8 @@ class ViduTextToVideoNode(IO.ComfyNode):
resolution=resolution,
movement_amplitude=movement_amplitude,
)
results = await execute_task(cls, VIDU_TEXT_TO_VIDEO, payload, 320)
return IO.NodeOutput(await download_url_to_video_output(get_video_from_response(results).url))
results = await execute_task(cls, VIDU_TEXT_TO_VIDEO, payload)
return IO.NodeOutput(await download_url_to_video_output(results[0].url))
class ViduImageToVideoNode(IO.ComfyNode):
@@ -230,12 +162,7 @@ class ViduImageToVideoNode(IO.ComfyNode):
category="api node/video/Vidu",
description="Generate video from image and optional prompt",
inputs=[
IO.Combo.Input(
"model",
options=VideoModelName,
default=VideoModelName.vidu_q1,
tooltip="Model name",
),
IO.Combo.Input("model", options=["viduq1"], tooltip="Model name"),
IO.Image.Input(
"image",
tooltip="An image to be used as the start frame of the generated video",
@@ -270,15 +197,13 @@ class ViduImageToVideoNode(IO.ComfyNode):
),
IO.Combo.Input(
"resolution",
options=Resolution,
default=Resolution.r_1080p,
options=["1080p"],
tooltip="Supported values may vary by model & duration",
optional=True,
),
IO.Combo.Input(
"movement_amplitude",
options=MovementAmplitude,
default=MovementAmplitude.auto.value,
options=["auto", "small", "medium", "large"],
tooltip="The movement amplitude of objects in the frame",
optional=True,
),
@@ -292,13 +217,16 @@ class ViduImageToVideoNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.4}""",
),
)
@classmethod
async def execute(
cls,
model: str,
image: torch.Tensor,
image: Input.Image,
prompt: str,
duration: int,
seed: int,
@@ -309,7 +237,7 @@ class ViduImageToVideoNode(IO.ComfyNode):
raise ValueError("Only one input image is allowed.")
validate_image_aspect_ratio(image, (1, 4), (4, 1))
payload = TaskCreationRequest(
model_name=model,
model=model,
prompt=prompt,
duration=duration,
seed=seed,
@@ -322,8 +250,8 @@ class ViduImageToVideoNode(IO.ComfyNode):
max_images=1,
mime_type="image/png",
)
results = await execute_task(cls, VIDU_IMAGE_TO_VIDEO, payload, 120)
return IO.NodeOutput(await download_url_to_video_output(get_video_from_response(results).url))
results = await execute_task(cls, VIDU_IMAGE_TO_VIDEO, payload)
return IO.NodeOutput(await download_url_to_video_output(results[0].url))
class ViduReferenceVideoNode(IO.ComfyNode):
@@ -334,14 +262,9 @@ class ViduReferenceVideoNode(IO.ComfyNode):
node_id="ViduReferenceVideoNode",
display_name="Vidu Reference To Video Generation",
category="api node/video/Vidu",
description="Generate video from multiple images and prompt",
description="Generate video from multiple images and a prompt",
inputs=[
IO.Combo.Input(
"model",
options=VideoModelName,
default=VideoModelName.vidu_q1,
tooltip="Model name",
),
IO.Combo.Input("model", options=["viduq1"], tooltip="Model name"),
IO.Image.Input(
"images",
tooltip="Images to use as references to generate a video with consistent subjects (max 7 images).",
@@ -374,22 +297,19 @@ class ViduReferenceVideoNode(IO.ComfyNode):
),
IO.Combo.Input(
"aspect_ratio",
options=AspectRatio,
default=AspectRatio.r_16_9,
options=["16:9", "9:16", "1:1"],
tooltip="The aspect ratio of the output video",
optional=True,
),
IO.Combo.Input(
"resolution",
options=[model.value for model in Resolution],
default=Resolution.r_1080p.value,
options=["1080p"],
tooltip="Supported values may vary by model & duration",
optional=True,
),
IO.Combo.Input(
"movement_amplitude",
options=[model.value for model in MovementAmplitude],
default=MovementAmplitude.auto.value,
options=["auto", "small", "medium", "large"],
tooltip="The movement amplitude of objects in the frame",
optional=True,
),
@@ -403,13 +323,16 @@ class ViduReferenceVideoNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.4}""",
),
)
@classmethod
async def execute(
cls,
model: str,
images: torch.Tensor,
images: Input.Image,
prompt: str,
duration: int,
seed: int,
@@ -426,7 +349,7 @@ class ViduReferenceVideoNode(IO.ComfyNode):
validate_image_aspect_ratio(image, (1, 4), (4, 1))
validate_image_dimensions(image, min_width=128, min_height=128)
payload = TaskCreationRequest(
model_name=model,
model=model,
prompt=prompt,
duration=duration,
seed=seed,
@@ -440,8 +363,8 @@ class ViduReferenceVideoNode(IO.ComfyNode):
max_images=7,
mime_type="image/png",
)
results = await execute_task(cls, VIDU_REFERENCE_VIDEO, payload, 120)
return IO.NodeOutput(await download_url_to_video_output(get_video_from_response(results).url))
results = await execute_task(cls, VIDU_REFERENCE_VIDEO, payload)
return IO.NodeOutput(await download_url_to_video_output(results[0].url))
class ViduStartEndToVideoNode(IO.ComfyNode):
@@ -454,12 +377,7 @@ class ViduStartEndToVideoNode(IO.ComfyNode):
category="api node/video/Vidu",
description="Generate a video from start and end frames and a prompt",
inputs=[
IO.Combo.Input(
"model",
options=[model.value for model in VideoModelName],
default=VideoModelName.vidu_q1.value,
tooltip="Model name",
),
IO.Combo.Input("model", options=["viduq1"], tooltip="Model name"),
IO.Image.Input(
"first_frame",
tooltip="Start frame",
@@ -497,15 +415,13 @@ class ViduStartEndToVideoNode(IO.ComfyNode):
),
IO.Combo.Input(
"resolution",
options=[model.value for model in Resolution],
default=Resolution.r_1080p.value,
options=["1080p"],
tooltip="Supported values may vary by model & duration",
optional=True,
),
IO.Combo.Input(
"movement_amplitude",
options=[model.value for model in MovementAmplitude],
default=MovementAmplitude.auto.value,
options=["auto", "small", "medium", "large"],
tooltip="The movement amplitude of objects in the frame",
optional=True,
),
@@ -519,14 +435,17 @@ class ViduStartEndToVideoNode(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.4}""",
),
)
@classmethod
async def execute(
cls,
model: str,
first_frame: torch.Tensor,
end_frame: torch.Tensor,
first_frame: Input.Image,
end_frame: Input.Image,
prompt: str,
duration: int,
seed: int,
@@ -535,7 +454,7 @@ class ViduStartEndToVideoNode(IO.ComfyNode):
) -> IO.NodeOutput:
validate_images_aspect_ratio_closeness(first_frame, end_frame, min_rel=0.8, max_rel=1.25, strict=False)
payload = TaskCreationRequest(
model_name=model,
model=model,
prompt=prompt,
duration=duration,
seed=seed,
@@ -546,8 +465,479 @@ class ViduStartEndToVideoNode(IO.ComfyNode):
(await upload_images_to_comfyapi(cls, frame, max_images=1, mime_type="image/png"))[0]
for frame in (first_frame, end_frame)
]
results = await execute_task(cls, VIDU_START_END_VIDEO, payload, 96)
return IO.NodeOutput(await download_url_to_video_output(get_video_from_response(results).url))
results = await execute_task(cls, VIDU_START_END_VIDEO, payload)
return IO.NodeOutput(await download_url_to_video_output(results[0].url))
class Vidu2TextToVideoNode(IO.ComfyNode):
@classmethod
def define_schema(cls):
return IO.Schema(
node_id="Vidu2TextToVideoNode",
display_name="Vidu2 Text-to-Video Generation",
category="api node/video/Vidu",
description="Generate video from a text prompt",
inputs=[
IO.Combo.Input("model", options=["viduq2"]),
IO.String.Input(
"prompt",
multiline=True,
tooltip="A textual description for video generation, with a maximum length of 2000 characters.",
),
IO.Int.Input(
"duration",
default=5,
min=1,
max=10,
step=1,
display_mode=IO.NumberDisplay.slider,
),
IO.Int.Input(
"seed",
default=1,
min=0,
max=2147483647,
step=1,
display_mode=IO.NumberDisplay.number,
control_after_generate=True,
),
IO.Combo.Input("aspect_ratio", options=["16:9", "9:16", "3:4", "4:3", "1:1"]),
IO.Combo.Input("resolution", options=["720p", "1080p"]),
IO.Boolean.Input(
"background_music",
default=False,
tooltip="Whether to add background music to the generated video.",
),
],
outputs=[
IO.Video.Output(),
],
hidden=[
IO.Hidden.auth_token_comfy_org,
IO.Hidden.api_key_comfy_org,
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["duration", "resolution"]),
expr="""
(
$is1080 := widgets.resolution = "1080p";
$base := $is1080 ? 0.1 : 0.075;
$perSec := $is1080 ? 0.05 : 0.025;
{"type":"usd","usd": $base + $perSec * (widgets.duration - 1)}
)
""",
),
)
@classmethod
async def execute(
cls,
model: str,
prompt: str,
duration: int,
seed: int,
aspect_ratio: str,
resolution: str,
background_music: bool,
) -> IO.NodeOutput:
validate_string(prompt, min_length=1, max_length=2000)
results = await execute_task(
cls,
VIDU_TEXT_TO_VIDEO,
TaskCreationRequest(
model=model,
prompt=prompt,
duration=duration,
seed=seed,
aspect_ratio=aspect_ratio,
resolution=resolution,
bgm=background_music,
),
)
return IO.NodeOutput(await download_url_to_video_output(results[0].url))
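For readability, the price badge above reduces to simple arithmetic. A plain-Python sketch of the JSONata expression (the function name is illustrative, not part of the node code):

def vidu2_t2v_price(duration: int, resolution: str) -> float:
    # Mirrors the badge: a base price plus a per-second rate for every second beyond the first.
    is_1080 = resolution == "1080p"
    base = 0.1 if is_1080 else 0.075
    per_sec = 0.05 if is_1080 else 0.025
    return base + per_sec * (duration - 1)

# e.g. a 5-second 1080p clip: 0.1 + 0.05 * 4 = 0.30 USD
# and a 5-second 720p clip:  0.075 + 0.025 * 4 = 0.175 USD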
class Vidu2ImageToVideoNode(IO.ComfyNode):
@classmethod
def define_schema(cls):
return IO.Schema(
node_id="Vidu2ImageToVideoNode",
display_name="Vidu2 Image-to-Video Generation",
category="api node/video/Vidu",
description="Generate a video from an image and an optional prompt.",
inputs=[
IO.Combo.Input("model", options=["viduq2-pro-fast", "viduq2-pro", "viduq2-turbo"]),
IO.Image.Input(
"image",
tooltip="An image to be used as the start frame of the generated video.",
),
IO.String.Input(
"prompt",
multiline=True,
default="",
tooltip="An optional text prompt for video generation (max 2000 characters).",
),
IO.Int.Input(
"duration",
default=5,
min=1,
max=10,
step=1,
display_mode=IO.NumberDisplay.slider,
),
IO.Int.Input(
"seed",
default=1,
min=0,
max=2147483647,
step=1,
display_mode=IO.NumberDisplay.number,
control_after_generate=True,
),
IO.Combo.Input(
"resolution",
options=["720p", "1080p"],
),
IO.Combo.Input(
"movement_amplitude",
options=["auto", "small", "medium", "large"],
tooltip="The movement amplitude of objects in the frame.",
),
],
outputs=[
IO.Video.Output(),
],
hidden=[
IO.Hidden.auth_token_comfy_org,
IO.Hidden.api_key_comfy_org,
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["model", "duration", "resolution"]),
expr="""
(
$m := widgets.model;
$d := widgets.duration;
$is1080 := widgets.resolution = "1080p";
$contains($m, "pro-fast")
? (
$base := $is1080 ? 0.08 : 0.04;
$perSec := $is1080 ? 0.02 : 0.01;
{"type":"usd","usd": $base + $perSec * ($d - 1)}
)
: $contains($m, "pro")
? (
$base := $is1080 ? 0.275 : 0.075;
$perSec := $is1080 ? 0.075 : 0.05;
{"type":"usd","usd": $base + $perSec * ($d - 1)}
)
: $contains($m, "turbo")
? (
$is1080
? {"type":"usd","usd": 0.175 + 0.05 * ($d - 1)}
: (
$d <= 1 ? {"type":"usd","usd": 0.04}
: $d <= 2 ? {"type":"usd","usd": 0.05}
: {"type":"usd","usd": 0.05 + 0.05 * ($d - 2)}
)
)
: {"type":"usd","usd": 0.04}
)
""",
),
)
@classmethod
async def execute(
cls,
model: str,
image: Input.Image,
prompt: str,
duration: int,
seed: int,
resolution: str,
movement_amplitude: str,
) -> IO.NodeOutput:
if get_number_of_images(image) > 1:
raise ValueError("Only one input image is allowed.")
validate_image_aspect_ratio(image, (1, 4), (4, 1))
validate_string(prompt, max_length=2000)
results = await execute_task(
cls,
VIDU_IMAGE_TO_VIDEO,
TaskCreationRequest(
model=model,
prompt=prompt,
duration=duration,
seed=seed,
resolution=resolution,
movement_amplitude=movement_amplitude,
images=await upload_images_to_comfyapi(
cls,
image,
max_images=1,
mime_type="image/png",
),
),
)
return IO.NodeOutput(await download_url_to_video_output(results[0].url))
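One detail worth calling out in the badge expression above: $contains($m, "pro") also matches "viduq2-pro-fast", so the pro-fast branch has to be evaluated first. The same ordering restated as a Python sketch (illustrative only):

def price_tier(model: str) -> str:
    # "pro-fast" must be checked before "pro", otherwise
    # "viduq2-pro-fast" would be billed at the pro rate.
    if "pro-fast" in model:
        return "pro-fast"
    if "pro" in model:
        return "pro"
    if "turbo" in model:
        return "turbo"
    return "other"

assert price_tier("viduq2-pro-fast") == "pro-fast"
assert price_tier("viduq2-pro") == "pro"
assert price_tier("viduq2-turbo") == "turbo"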
class Vidu2ReferenceVideoNode(IO.ComfyNode):
@classmethod
def define_schema(cls):
return IO.Schema(
node_id="Vidu2ReferenceVideoNode",
display_name="Vidu2 Reference-to-Video Generation",
category="api node/video/Vidu",
description="Generate a video from multiple reference images and a prompt.",
inputs=[
IO.Combo.Input("model", options=["viduq2"]),
IO.Autogrow.Input(
"subjects",
template=IO.Autogrow.TemplateNames(
IO.Image.Input("reference_images"),
names=["subject1", "subject2", "subject3"],
min=1,
),
tooltip="For each subject, provide up to 3 reference images (7 images total across all subjects). "
"Reference them in prompts via @subject{subject_id}.",
),
IO.String.Input(
"prompt",
multiline=True,
tooltip="When enabled, the video will include generated speech and background music "
"based on the prompt.",
),
IO.Boolean.Input(
"audio",
default=False,
tooltip="When enabled video will contain generated speech and background music based on the prompt.",
),
IO.Int.Input(
"duration",
default=5,
min=1,
max=10,
step=1,
display_mode=IO.NumberDisplay.slider,
),
IO.Int.Input(
"seed",
default=1,
min=0,
max=2147483647,
step=1,
display_mode=IO.NumberDisplay.number,
control_after_generate=True,
),
IO.Combo.Input("aspect_ratio", options=["16:9", "9:16", "4:3", "3:4", "1:1"]),
IO.Combo.Input("resolution", options=["720p"]),
IO.Combo.Input(
"movement_amplitude",
options=["auto", "small", "medium", "large"],
tooltip="The movement amplitude of objects in the frame.",
),
],
outputs=[
IO.Video.Output(),
],
hidden=[
IO.Hidden.auth_token_comfy_org,
IO.Hidden.api_key_comfy_org,
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["audio", "duration", "resolution"]),
expr="""
(
$is1080 := widgets.resolution = "1080p";
$base := $is1080 ? 0.375 : 0.125;
$perSec := $is1080 ? 0.05 : 0.025;
$audioCost := widgets.audio = true ? 0.075 : 0;
{"type":"usd","usd": $base + $perSec * (widgets.duration - 1) + $audioCost}
)
""",
),
)
@classmethod
async def execute(
cls,
model: str,
subjects: IO.Autogrow.Type,
prompt: str,
audio: bool,
duration: int,
seed: int,
aspect_ratio: str,
resolution: str,
movement_amplitude: str,
) -> IO.NodeOutput:
validate_string(prompt, min_length=1, max_length=2000)
total_images = 0
for i in subjects:
if get_number_of_images(subjects[i]) > 3:
raise ValueError("Maximum number of images per subject is 3.")
for im in subjects[i]:
total_images += 1
validate_image_aspect_ratio(im, (1, 4), (4, 1))
validate_image_dimensions(im, min_width=128, min_height=128)
if total_images > 7:
raise ValueError("Too many reference images; the maximum allowed is 7.")
subjects_param: list[SubjectReference] = []
for i in subjects:
subjects_param.append(
SubjectReference(
id=i,
images=await upload_images_to_comfyapi(
cls,
subjects[i],
max_images=3,
mime_type="image/png",
wait_label=f"Uploading reference images for {i}",
),
),
)
payload = TaskCreationRequest(
model=model,
prompt=prompt,
audio=audio,
duration=duration,
seed=seed,
aspect_ratio=aspect_ratio,
resolution=resolution,
movement_amplitude=movement_amplitude,
subjects=subjects_param,
)
results = await execute_task(cls, VIDU_REFERENCE_VIDEO, payload)
return IO.NodeOutput(await download_url_to_video_output(results[0].url))
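As a quick sanity check of the badge above, here is the same formula as a plain-Python sketch (the helper name is illustrative):

def vidu2_ref_price(duration: int, resolution: str, audio: bool) -> float:
    is_1080 = resolution == "1080p"
    base = 0.375 if is_1080 else 0.125
    per_sec = 0.05 if is_1080 else 0.025
    return base + per_sec * (duration - 1) + (0.075 if audio else 0)

# With the current 720p-only resolution option, a 5-second clip with audio
# comes to 0.125 + 0.025 * 4 + 0.075 = 0.30 USD.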
class Vidu2StartEndToVideoNode(IO.ComfyNode):
@classmethod
def define_schema(cls):
return IO.Schema(
node_id="Vidu2StartEndToVideoNode",
display_name="Vidu2 Start/End Frame-to-Video Generation",
category="api node/video/Vidu",
description="Generate a video from a start frame, an end frame, and a prompt.",
inputs=[
IO.Combo.Input("model", options=["viduq2-pro-fast", "viduq2-pro", "viduq2-turbo"]),
IO.Image.Input("first_frame"),
IO.Image.Input("end_frame"),
IO.String.Input(
"prompt",
multiline=True,
tooltip="Prompt description (max 2000 characters).",
),
IO.Int.Input(
"duration",
default=5,
min=2,
max=8,
step=1,
display_mode=IO.NumberDisplay.slider,
),
IO.Int.Input(
"seed",
default=1,
min=0,
max=2147483647,
step=1,
display_mode=IO.NumberDisplay.number,
control_after_generate=True,
),
IO.Combo.Input("resolution", options=["720p", "1080p"]),
IO.Combo.Input(
"movement_amplitude",
options=["auto", "small", "medium", "large"],
tooltip="The movement amplitude of objects in the frame.",
),
],
outputs=[
IO.Video.Output(),
],
hidden=[
IO.Hidden.auth_token_comfy_org,
IO.Hidden.api_key_comfy_org,
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["model", "duration", "resolution"]),
expr="""
(
$m := widgets.model;
$d := widgets.duration;
$is1080 := widgets.resolution = "1080p";
$contains($m, "pro-fast")
? (
$base := $is1080 ? 0.08 : 0.04;
$perSec := $is1080 ? 0.02 : 0.01;
{"type":"usd","usd": $base + $perSec * ($d - 1)}
)
: $contains($m, "pro")
? (
$base := $is1080 ? 0.275 : 0.075;
$perSec := $is1080 ? 0.075 : 0.05;
{"type":"usd","usd": $base + $perSec * ($d - 1)}
)
: $contains($m, "turbo")
? (
$is1080
? {"type":"usd","usd": 0.175 + 0.05 * ($d - 1)}
: (
$d <= 2 ? {"type":"usd","usd": 0.05}
: {"type":"usd","usd": 0.05 + 0.05 * ($d - 2)}
)
)
: {"type":"usd","usd": 0.04}
)
""",
),
)
@classmethod
async def execute(
cls,
model: str,
first_frame: Input.Image,
end_frame: Input.Image,
prompt: str,
duration: int,
seed: int,
resolution: str,
movement_amplitude: str,
) -> IO.NodeOutput:
validate_string(prompt, max_length=2000)
if get_number_of_images(first_frame) > 1:
raise ValueError("Only one input image is allowed for `first_frame`.")
if get_number_of_images(end_frame) > 1:
raise ValueError("Only one input image is allowed for `end_frame`.")
validate_images_aspect_ratio_closeness(first_frame, end_frame, min_rel=0.8, max_rel=1.25, strict=False)
payload = TaskCreationRequest(
model=model,
prompt=prompt,
duration=duration,
seed=seed,
resolution=resolution,
movement_amplitude=movement_amplitude,
images=[
(await upload_images_to_comfyapi(cls, frame, max_images=1, mime_type="image/png"))[0]
for frame in (first_frame, end_frame)
],
)
results = await execute_task(cls, VIDU_START_END_VIDEO, payload)
return IO.NodeOutput(await download_url_to_video_output(results[0].url))
class ViduExtension(ComfyExtension):
@@ -558,6 +948,10 @@ class ViduExtension(ComfyExtension):
ViduImageToVideoNode,
ViduReferenceVideoNode,
ViduStartEndToVideoNode,
Vidu2TextToVideoNode,
Vidu2ImageToVideoNode,
Vidu2ReferenceVideoNode,
Vidu2StartEndToVideoNode,
]

View File

@@ -244,6 +244,9 @@ class WanTextToImageApi(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.03}""",
),
)
@classmethod
@@ -363,6 +366,9 @@ class WanImageToImageApi(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.03}""",
),
)
@classmethod
@@ -520,6 +526,17 @@ class WanTextToVideoApi(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["duration", "size"]),
expr="""
(
$ppsTable := { "480p": 0.05, "720p": 0.1, "1080p": 0.15 };
$resKey := $substringBefore(widgets.size, ":");
$pps := $lookup($ppsTable, $resKey);
{ "type": "usd", "usd": $round($pps * widgets.duration, 2) }
)
""",
),
)
@classmethod
@@ -681,6 +698,16 @@ class WanImageToVideoApi(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["duration", "resolution"]),
expr="""
(
$ppsTable := { "480p": 0.05, "720p": 0.1, "1080p": 0.15 };
$pps := $lookup($ppsTable, widgets.resolution);
{ "type": "usd", "usd": $round($pps * widgets.duration, 2) }
)
""",
),
)
@classmethod
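The image-to-video badge above is a straight per-second lookup; the text-to-video badge a little earlier uses the same table but first extracts the resolution key from the size widget via $substringBefore(widgets.size, ":"). The same computation in Python (illustrative sketch only):

WAN_I2V_PRICE_PER_SECOND = {"480p": 0.05, "720p": 0.1, "1080p": 0.15}

def wan_i2v_price(duration: int, resolution: str) -> float:
    # Per-second rate times duration, rounded to cents, as in the badge.
    return round(WAN_I2V_PRICE_PER_SECOND[resolution] * duration, 2)

# e.g. 5 seconds at 720p: 0.1 * 5 = 0.50 USD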
@@ -828,6 +855,22 @@ class WanReferenceVideoApi(IO.ComfyNode):
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["size", "duration"]),
expr="""
(
$rate := $contains(widgets.size, "1080p") ? 0.15 : 0.10;
$inputMin := 2 * $rate;
$inputMax := 5 * $rate;
$outputPrice := widgets.duration * $rate;
{
"type": "range_usd",
"min_usd": $inputMin + $outputPrice,
"max_usd": $inputMax + $outputPrice
}
)
""",
),
)
@classmethod
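Unlike the other badges, the reference-video badge above yields a price range, presumably because the input cost varies (the expression spans 2 to 5 rate-units of input on top of the per-second output cost). A worked example in Python (illustrative sketch):

def wan_reference_price_range(duration: int, size: str) -> tuple[float, float]:
    # Mirrors the badge: a rate chosen from the size string, an input cost
    # spanning 2 to 5 rate-units, plus the output cost.
    rate = 0.15 if "1080p" in size else 0.10
    output = duration * rate
    return (2 * rate + output, 5 * rate + output)

# e.g. a 5-second 1080p job: (0.30 + 0.75, 0.75 + 0.75) = (1.05, 1.50) USD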

View File

@@ -55,7 +55,7 @@ def image_tensor_pair_to_batch(image1: torch.Tensor, image2: torch.Tensor) -> to
def tensor_to_bytesio(
image: torch.Tensor,
name: str | None = None,
*,
total_pixels: int = 2048 * 2048,
mime_type: str = "image/png",
) -> BytesIO:
@@ -75,7 +75,7 @@ def tensor_to_bytesio(
pil_image = tensor_to_pil(image, total_pixels=total_pixels)
img_binary = pil_to_bytesio(pil_image, mime_type=mime_type)
img_binary.name = f"{name if name else uuid.uuid4()}.{mimetype_to_extension(mime_type)}"
img_binary.name = f"{uuid.uuid4()}.{mimetype_to_extension(mime_type)}"
return img_binary

View File

@@ -43,27 +43,41 @@ class UploadResponse(BaseModel):
async def upload_images_to_comfyapi(
cls: type[IO.ComfyNode],
image: torch.Tensor,
image: torch.Tensor | list[torch.Tensor],
*,
max_images: int = 8,
mime_type: str | None = None,
wait_label: str | None = "Uploading",
show_batch_index: bool = True,
total_pixels: int = 2048 * 2048,
) -> list[str]:
"""
Uploads images to ComfyUI API and returns download URLs.
To upload multiple images, stack them in the batch dimension first.
"""
tensors: list[torch.Tensor] = []
if isinstance(image, list):
for img in image:
is_batch = len(img.shape) > 3
if is_batch:
tensors.extend(img[i] for i in range(img.shape[0]))
else:
tensors.append(img)
else:
is_batch = len(image.shape) > 3
if is_batch:
tensors.extend(image[i] for i in range(image.shape[0]))
else:
tensors.append(image)
# if batched, try to upload each file if max_images is greater than 0
download_urls: list[str] = []
is_batch = len(image.shape) > 3
batch_len = image.shape[0] if is_batch else 1
num_to_upload = min(batch_len, max_images)
num_to_upload = min(len(tensors), max_images)
batch_start_ts = time.monotonic()
for idx in range(num_to_upload):
tensor = image[idx] if is_batch else image
img_io = tensor_to_bytesio(tensor, mime_type=mime_type)
tensor = tensors[idx]
img_io = tensor_to_bytesio(tensor, total_pixels=total_pixels, mime_type=mime_type)
effective_label = wait_label
if wait_label and show_batch_index and num_to_upload > 1:
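The reworked helper above now accepts either a single tensor or a list, flattening any 4-D batches into individual images before max_images is applied. A standalone restatement of that rule (not the helper itself, just the flattening logic for illustration):

import torch

def flatten_image_inputs(image) -> list[torch.Tensor]:
    # 4-D tensors are batches ([batch, height, width, channels]); 3-D tensors are single images.
    items = image if isinstance(image, list) else [image]
    tensors: list[torch.Tensor] = []
    for img in items:
        if img.dim() > 3:
            tensors.extend(img[i] for i in range(img.shape[0]))
        else:
            tensors.append(img)
    return tensors

assert len(flatten_image_inputs(torch.zeros(2, 64, 64, 3))) == 2
assert len(flatten_image_inputs([torch.zeros(2, 64, 64, 3), torch.zeros(64, 64, 3)])) == 3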
@@ -81,7 +95,6 @@ async def upload_audio_to_comfyapi(
container_format: str = "mp4",
codec_name: str = "aac",
mime_type: str = "audio/mp4",
filename: str = "uploaded_audio.mp4",
) -> str:
"""
Uploads a single audio input to ComfyUI API and returns its download URL.
@@ -91,7 +104,7 @@ async def upload_audio_to_comfyapi(
waveform: torch.Tensor = audio["waveform"]
audio_data_np = audio_tensor_to_contiguous_ndarray(waveform)
audio_bytes_io = audio_ndarray_to_bytesio(audio_data_np, sample_rate, container_format, codec_name)
return await upload_file_to_comfyapi(cls, audio_bytes_io, filename, mime_type)
return await upload_file_to_comfyapi(cls, audio_bytes_io, f"{uuid.uuid4()}.{container_format}", mime_type)
async def upload_video_to_comfyapi(

View File

@@ -244,6 +244,10 @@ class ModelPatchLoader:
elif 'control_all_x_embedder.2-1.weight' in sd: # alipai z image fun controlnet
sd = z_image_convert(sd)
config = {}
if 'control_layers.4.adaLN_modulation.0.weight' not in sd:
config['n_control_layers'] = 3
config['additional_in_dim'] = 17
config['refiner_control'] = True
if 'control_layers.14.adaLN_modulation.0.weight' in sd:
config['n_control_layers'] = 15
config['additional_in_dim'] = 17

View File

@@ -254,6 +254,7 @@ class ResizeType(str, Enum):
SCALE_HEIGHT = "scale height"
SCALE_TOTAL_PIXELS = "scale total pixels"
MATCH_SIZE = "match size"
SCALE_TO_MULTIPLE = "scale to multiple"
def is_image(input: torch.Tensor) -> bool:
# images have 4 dimensions: [batch, height, width, channels]
@@ -328,7 +329,7 @@ def scale_shorter_dimension(input: torch.Tensor, shorter_size: int, scale_method
if height < width:
width = round((width / height) * shorter_size)
height = shorter_size
elif width > height:
elif width < height:
height = round((height / width) * shorter_size)
width = shorter_size
else:
@@ -363,6 +364,43 @@ def scale_match_size(input: torch.Tensor, match: torch.Tensor, scale_method: str
input = finalize_image_mask_input(input, is_type_image)
return input
def scale_to_multiple_cover(input: torch.Tensor, multiple: int, scale_method: str) -> torch.Tensor:
if multiple <= 1:
return input
is_type_image = is_image(input)
if is_type_image:
_, height, width, _ = input.shape
else:
_, height, width = input.shape
target_w = (width // multiple) * multiple
target_h = (height // multiple) * multiple
if target_w == 0 or target_h == 0:
return input
if target_w == width and target_h == height:
return input
s_w = target_w / width
s_h = target_h / height
if s_w >= s_h:
scaled_w = target_w
scaled_h = int(math.ceil(height * s_w))
if scaled_h < target_h:
scaled_h = target_h
else:
scaled_h = target_h
scaled_w = int(math.ceil(width * s_h))
if scaled_w < target_w:
scaled_w = target_w
input = init_image_mask_input(input, is_type_image)
input = comfy.utils.common_upscale(input, scaled_w, scaled_h, scale_method, "disabled")
input = finalize_image_mask_input(input, is_type_image)
x0 = (scaled_w - target_w) // 2
y0 = (scaled_h - target_h) // 2
x1 = x0 + target_w
y1 = y0 + target_h
if is_type_image:
return input[:, y0:y1, x0:x1, :]
return input[:, y0:y1, x0:x1]
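To make the cover-then-crop arithmetic above concrete, here is a worked example following the code as written (numbers only, illustrative):

import math

# A 769x1023 (height x width) image with multiple=8:
height, width, multiple = 769, 1023, 8
target_w, target_h = (width // multiple) * multiple, (height // multiple) * multiple  # 1016, 768
s_w, s_h = target_w / width, target_h / height
# s_h (~0.9987) > s_w (~0.9932), so the image is scaled by s_h to 1022x768 ...
scaled_h, scaled_w = target_h, math.ceil(width * s_h)
assert (scaled_w, scaled_h) == (1022, 768)
# ... then center-cropped: 3 columns trimmed from the left and 3 from the right.
x0 = (scaled_w - target_w) // 2
assert (x0, x0 + target_w) == (3, 1019)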
class ResizeImageMaskNode(io.ComfyNode):
scale_methods = ["nearest-exact", "bilinear", "area", "bicubic", "lanczos"]
@@ -378,6 +416,7 @@ class ResizeImageMaskNode(io.ComfyNode):
longer_size: int
shorter_size: int
megapixels: float
multiple: int
@classmethod
def define_schema(cls):
@@ -417,6 +456,9 @@ class ResizeImageMaskNode(io.ComfyNode):
io.MultiType.Input("match", [io.Image, io.Mask]),
crop_combo,
]),
io.DynamicCombo.Option(ResizeType.SCALE_TO_MULTIPLE, [
io.Int.Input("multiple", default=8, min=1, max=MAX_RESOLUTION, step=1),
]),
]),
io.Combo.Input("scale_method", options=cls.scale_methods, default="area"),
],
@@ -442,6 +484,8 @@ class ResizeImageMaskNode(io.ComfyNode):
return io.NodeOutput(scale_total_pixels(input, resize_type["megapixels"], scale_method))
elif selected_type == ResizeType.MATCH_SIZE:
return io.NodeOutput(scale_match_size(input, resize_type["match"], scale_method, resize_type["crop"]))
elif selected_type == ResizeType.SCALE_TO_MULTIPLE:
return io.NodeOutput(scale_to_multiple_cover(input, resize_type["multiple"], scale_method))
raise ValueError(f"Unsupported resize type: {selected_type}")
def batch_images(images: list[torch.Tensor]) -> torch.Tensor | None:

View File

@@ -67,23 +67,139 @@ class SaveWEBM(io.ComfyNode):
class SaveVideo(io.ComfyNode):
@classmethod
def define_schema(cls):
# H264-specific inputs
h264_quality = io.Int.Input(
"quality",
default=80,
min=0,
max=100,
step=1,
display_name="Quality",
tooltip="Output quality (0-100). Higher = better quality, larger files. "
"Internally maps to CRF: 100→CRF 12, 50→CRF 23, 0→CRF 40.",
)
h264_speed = io.Combo.Input(
"speed",
options=Types.VideoSpeedPreset.as_input(),
default="auto",
display_name="Encoding Speed",
tooltip="Encoding speed preset. Slower = better compression at same quality. "
"Maps to FFmpeg presets: Fastest=ultrafast, Balanced=medium, Best=veryslow.",
)
h264_profile = io.Combo.Input(
"profile",
options=["auto", "baseline", "main", "high"],
default="auto",
display_name="Profile",
tooltip="H.264 profile. 'baseline' for max compatibility (older devices), "
"'main' for standard use, 'high' for best quality/compression.",
advanced=True,
)
h264_tune = io.Combo.Input(
"tune",
options=["auto", "film", "animation", "grain", "stillimage", "fastdecode", "zerolatency"],
default="auto",
display_name="Tune",
tooltip="Optimize encoding for specific content types. "
"'film' for live action, 'animation' for cartoons/anime, 'grain' to preserve film grain.",
advanced=True,
)
# VP9-specific inputs
vp9_quality = io.Int.Input(
"quality",
default=80,
min=0,
max=100,
step=1,
display_name="Quality",
tooltip="Output quality (0-100). Higher = better quality, larger files. "
"Internally maps to CRF: 100→CRF 15, 50→CRF 33, 0→CRF 50.",
)
vp9_speed = io.Combo.Input(
"speed",
options=Types.VideoSpeedPreset.as_input(),
default="auto",
display_name="Encoding Speed",
tooltip="Encoding speed. Slower = better compression. "
"Maps to VP9 cpu-used: Fastest=0, Balanced=2, Best=4.",
)
vp9_row_mt = io.Boolean.Input(
"row_mt",
default=True,
display_name="Row Multi-threading",
tooltip="Enable row-based multi-threading for faster encoding on multi-core CPUs.",
advanced=True,
)
vp9_tile_columns = io.Combo.Input(
"tile_columns",
options=["auto", "0", "1", "2", "3", "4"],
default="auto",
display_name="Tile Columns",
tooltip="Number of tile columns (as power of 2). More tiles = faster encoding "
"but slightly worse compression. 'auto' picks based on resolution.",
advanced=True,
)
return io.Schema(
node_id="SaveVideo",
display_name="Save Video",
category="image/video",
description="Saves the input images to your ComfyUI output directory.",
description="Saves video to the output directory. "
"When format/codec/quality differ from source, the video is re-encoded.",
inputs=[
io.Video.Input("video", tooltip="The video to save."),
io.String.Input("filename_prefix", default="video/ComfyUI", tooltip="The prefix for the file to save. This may include formatting information such as %date:yyyy-MM-dd% or %Empty Latent Image.width% to include values from nodes."),
io.Combo.Input("format", options=Types.VideoContainer.as_input(), default="auto", tooltip="The format to save the video as."),
io.Combo.Input("codec", options=Types.VideoCodec.as_input(), default="auto", tooltip="The codec to use for the video."),
io.String.Input(
"filename_prefix",
default="video/ComfyUI",
tooltip="The prefix for the file to save. "
"Supports formatting like %date:yyyy-MM-dd%.",
),
io.DynamicCombo.Input("codec", options=[
io.DynamicCombo.Option("auto", []),
io.DynamicCombo.Option("h264", [h264_quality, h264_speed, h264_profile, h264_tune]),
io.DynamicCombo.Option("vp9", [vp9_quality, vp9_speed, vp9_row_mt, vp9_tile_columns]),
], tooltip="Video codec. 'auto' preserves source when possible. "
"h264 outputs MP4, vp9 outputs WebM."),
],
hidden=[io.Hidden.prompt, io.Hidden.extra_pnginfo],
is_output_node=True,
)
@classmethod
def execute(cls, video: Input.Video, filename_prefix, format: str, codec) -> io.NodeOutput:
def execute(cls, video: Input.Video, filename_prefix: str, codec: dict) -> io.NodeOutput:
selected_codec = codec.get("codec", "auto")
quality = codec.get("quality")
speed_str = codec.get("speed", "auto")
# H264-specific options
profile = codec.get("profile", "auto")
tune = codec.get("tune", "auto")
# VP9-specific options
row_mt = codec.get("row_mt", True)
tile_columns = codec.get("tile_columns", "auto")
if selected_codec == "auto":
resolved_format = Types.VideoContainer.AUTO
resolved_codec = Types.VideoCodec.AUTO
elif selected_codec == "h264":
resolved_format = Types.VideoContainer.MP4
resolved_codec = Types.VideoCodec.H264
elif selected_codec == "vp9":
resolved_format = Types.VideoContainer.WEBM
resolved_codec = Types.VideoCodec.VP9
else:
resolved_format = Types.VideoContainer.AUTO
resolved_codec = Types.VideoCodec.AUTO
speed = None
if speed_str:
try:
speed = Types.VideoSpeedPreset(speed_str)
except (ValueError, TypeError):
logging.warning(f"Invalid speed preset '{speed_str}', using default")
width, height = video.get_dimensions()
full_output_folder, filename, counter, subfolder, filename_prefix = folder_paths.get_save_image_path(
filename_prefix,
@@ -91,6 +207,7 @@ class SaveVideo(io.ComfyNode):
width,
height
)
saved_metadata = None
if not args.disable_metadata:
metadata = {}
@@ -100,12 +217,20 @@ class SaveVideo(io.ComfyNode):
metadata["prompt"] = cls.hidden.prompt
if len(metadata) > 0:
saved_metadata = metadata
file = f"{filename}_{counter:05}_.{Types.VideoContainer.get_extension(format)}"
extension = Types.VideoContainer.get_extension(resolved_format)
file = f"{filename}_{counter:05}_.{extension}"
video.save_to(
os.path.join(full_output_folder, file),
format=Types.VideoContainer(format),
codec=codec,
metadata=saved_metadata
format=resolved_format,
codec=resolved_codec,
metadata=saved_metadata,
quality=quality,
speed=speed,
profile=profile if profile != "auto" else None,
tune=tune if tune != "auto" else None,
row_mt=row_mt,
tile_columns=int(tile_columns) if tile_columns != "auto" else None,
)
return io.NodeOutput(ui=ui.PreviewVideo([ui.SavedResult(file, subfolder, io.FolderType.output)]))
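Judging from the .get calls above, the DynamicCombo delivers the selected option together with its per-codec widget values as a single dict. For the h264 branch it presumably looks something like this (illustrative; the values shown are the defaults declared in define_schema):

codec = {
    "codec": "h264",    # the selected DynamicCombo option
    "quality": 80,      # h264_quality
    "speed": "auto",    # h264_speed
    "profile": "auto",  # h264_profile (advanced)
    "tune": "auto",     # h264_tune (advanced)
}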

View File

@@ -1,3 +1,3 @@
# This file is automatically generated by the build process when version is
# updated in pyproject.toml.
__version__ = "0.8.2"
__version__ = "0.9.2"

View File

@@ -788,6 +788,7 @@ class VAELoader:
#TODO: scale factor?
def load_vae(self, vae_name):
metadata = None
if vae_name == "pixel_space":
sd = {}
sd["pixel_space_vae"] = torch.tensor(1.0)
@@ -798,8 +799,8 @@ class VAELoader:
vae_path = folder_paths.get_full_path_or_raise("vae_approx", vae_name)
else:
vae_path = folder_paths.get_full_path_or_raise("vae", vae_name)
sd = comfy.utils.load_torch_file(vae_path)
vae = comfy.sd.VAE(sd=sd)
sd, metadata = comfy.utils.load_torch_file(vae_path, return_metadata=True)
vae = comfy.sd.VAE(sd=sd, metadata=metadata)
vae.throw_exception_if_invalid()
return (vae,)
@@ -2132,12 +2133,6 @@ def get_module_name(module_path: str) -> str:
base_path = os.path.splitext(base_path)[0]
return base_path
def get_hacky_folder_paths():
hacky_folder_paths = []
for name, values in folder_paths.folder_names_and_paths.items():
if len(values) > 2:
hacky_folder_paths.append([name, values])
return hacky_folder_paths
async def load_custom_node(module_path: str, ignore=set(), module_parent="custom_nodes") -> bool:
module_name = get_module_name(module_path)
@@ -2247,7 +2242,6 @@ async def init_external_custom_nodes():
base_node_names = set(NODE_CLASS_MAPPINGS.keys())
node_paths = folder_paths.get_folder_paths("custom_nodes")
node_import_times = []
found_first_hacky_folder_path = False
for custom_node_path in node_paths:
possible_modules = os.listdir(os.path.realpath(custom_node_path))
if "__pycache__" in possible_modules:
@@ -2271,12 +2265,6 @@ async def init_external_custom_nodes():
time_before = time.perf_counter()
success = await load_custom_node(module_path, base_node_names, module_parent="custom_nodes")
node_import_times.append((time.perf_counter() - time_before, module_path, success))
if not found_first_hacky_folder_path:
hacky_folder_paths = get_hacky_folder_paths()
if len(hacky_folder_paths) > 0:
logging.warning(f"Found first custom node to have hacky folder paths: {module_path}")
logging.warning(f"Hacky folder paths: {hacky_folder_paths}")
found_first_hacky_folder_path = True
if len(node_import_times) > 0:
logging.info("\nImport times for custom nodes:")
@@ -2413,6 +2401,7 @@ async def init_builtin_api_nodes():
"nodes_sora.py",
"nodes_topaz.py",
"nodes_tripo.py",
"nodes_meshy.py",
"nodes_moonvalley.py",
"nodes_rodin.py",
"nodes_gemini.py",

View File

@@ -1,6 +1,6 @@
[project]
name = "ComfyUI"
version = "0.8.2"
version = "0.9.2"
readme = "README.md"
license = { file = "LICENSE" }
requires-python = ">=3.10"

View File

@@ -1,6 +1,6 @@
comfyui-frontend-package==1.36.13
comfyui-workflow-templates==0.7.69
comfyui-embedded-docs==0.3.1
comfyui-frontend-package==1.36.14
comfyui-workflow-templates==0.8.11
comfyui-embedded-docs==0.4.0
torch
torchsde
torchvision
@@ -21,7 +21,7 @@ psutil
alembic
SQLAlchemy
av>=14.2.0
comfy-kitchen>=0.2.5
comfy-kitchen>=0.2.6
#non essential dependencies:
kornia>=0.7.1

View File

@@ -686,7 +686,10 @@ class PromptServer():
@routes.get("/object_info")
async def get_object_info(request):
seed_assets(["models"])
try:
seed_assets(["models"])
except Exception as e:
logging.error(f"Failed to seed assets: {e}")
with folder_paths.cache_helper:
out = {}
for x in nodes.NODE_CLASS_MAPPINGS:

View File

@@ -6,7 +6,7 @@ import av
import io
from fractions import Fraction
from comfy_api.input_impl.video_types import VideoFromFile, VideoFromComponents
from comfy_api.util.video_types import VideoComponents
from comfy_api.util.video_types import VideoComponents, VideoSpeedPreset, quality_to_crf
from comfy_api.input.basic_types import AudioInput
from av.error import InvalidDataError
@@ -237,3 +237,71 @@ def test_duration_consistency(video_components):
manual_duration = float(components.images.shape[0] / components.frame_rate)
assert duration == pytest.approx(manual_duration)
class TestVideoSpeedPreset:
"""Tests for VideoSpeedPreset enum and its methods."""
def test_as_input_returns_all_values(self):
"""as_input() returns all preset values"""
values = VideoSpeedPreset.as_input()
assert values == ["auto", "Fastest", "Fast", "Balanced", "Quality", "Best"]
def test_to_ffmpeg_preset_h264(self):
"""H.264 presets map correctly"""
assert VideoSpeedPreset.FASTEST.to_ffmpeg_preset("h264") == "ultrafast"
assert VideoSpeedPreset.FAST.to_ffmpeg_preset("h264") == "veryfast"
assert VideoSpeedPreset.BALANCED.to_ffmpeg_preset("h264") == "medium"
assert VideoSpeedPreset.QUALITY.to_ffmpeg_preset("h264") == "slow"
assert VideoSpeedPreset.BEST.to_ffmpeg_preset("h264") == "veryslow"
assert VideoSpeedPreset.AUTO.to_ffmpeg_preset("h264") == "medium"
def test_to_ffmpeg_preset_vp9(self):
"""VP9 presets map correctly"""
assert VideoSpeedPreset.FASTEST.to_ffmpeg_preset("vp9") == "0"
assert VideoSpeedPreset.FAST.to_ffmpeg_preset("vp9") == "1"
assert VideoSpeedPreset.BALANCED.to_ffmpeg_preset("vp9") == "2"
assert VideoSpeedPreset.QUALITY.to_ffmpeg_preset("vp9") == "3"
assert VideoSpeedPreset.BEST.to_ffmpeg_preset("vp9") == "4"
assert VideoSpeedPreset.AUTO.to_ffmpeg_preset("vp9") == "2"
def test_to_ffmpeg_preset_libvpx_vp9(self):
"""libvpx-vp9 codec string also maps to VP9 presets"""
assert VideoSpeedPreset.BALANCED.to_ffmpeg_preset("libvpx-vp9") == "2"
def test_to_ffmpeg_preset_default_to_h264(self):
"""Unknown codecs default to H.264 mapping"""
assert VideoSpeedPreset.BALANCED.to_ffmpeg_preset("unknown") == "medium"
class TestQualityToCrf:
"""Tests for quality_to_crf helper function."""
def test_h264_quality_boundaries(self):
"""H.264 quality maps to correct CRF range (12-40)"""
assert quality_to_crf(100, "h264") == 12
assert quality_to_crf(0, "h264") == 40
assert quality_to_crf(50, "h264") == 26
def test_h264_libx264_alias(self):
"""libx264 codec string uses H.264 mapping"""
assert quality_to_crf(100, "libx264") == 12
def test_vp9_quality_boundaries(self):
"""VP9 quality maps to correct CRF range (15-50)"""
assert quality_to_crf(100, "vp9") == 15
assert quality_to_crf(0, "vp9") == 50
assert quality_to_crf(50, "vp9") == 32
def test_vp9_libvpx_alias(self):
"""libvpx-vp9 codec string uses VP9 mapping"""
assert quality_to_crf(100, "libvpx-vp9") == 15
def test_quality_clamping(self):
"""Quality values outside 0-100 are clamped"""
assert quality_to_crf(150, "h264") == 12
assert quality_to_crf(-50, "h264") == 40
def test_unknown_codec_fallback(self):
"""Unknown codecs return default CRF 23"""
assert quality_to_crf(50, "unknown_codec") == 23

View File

@@ -153,9 +153,9 @@ class TestMixedPrecisionOps(unittest.TestCase):
state_dict2 = model.state_dict()
# Verify layer1.weight is a QuantizedTensor with scale preserved
self.assertIsInstance(state_dict2["layer1.weight"], QuantizedTensor)
self.assertEqual(state_dict2["layer1.weight"]._params.scale.item(), 3.0)
self.assertEqual(state_dict2["layer1.weight"]._layout_cls, "TensorCoreFP8E4M3Layout")
self.assertTrue(torch.equal(state_dict2["layer1.weight"].view(torch.uint8), fp8_weight.view(torch.uint8)))
self.assertEqual(state_dict2["layer1.weight_scale"].item(), 3.0)
self.assertEqual(model.layer1.weight._layout_cls, "TensorCoreFP8E4M3Layout")
# Verify non-quantized layers are standard tensors
self.assertNotIsInstance(state_dict2["layer2.weight"], QuantizedTensor)