# Wan

Wan is a family of open-source video generation models from Alibaba's Tongyi Lab, spanning text-to-video, image-to-video, speech-to-video, motion control, and video editing. All models are released under the Apache 2.0 license.

## Model Variants

### Wan 2.1 T2V / I2V

- Text-to-video and image-to-video generation (see the Diffusers sketch below)
- Available in 1.3B and 14B parameter sizes
- Supports 480p and 720p output at variable aspect ratios
- Visual text generation in Chinese and English
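
As a minimal sketch of the Diffusers integration noted under Key Features, assuming a recent `diffusers` release that ships `WanPipeline`/`AutoencoderKLWan` and the `Wan-AI/Wan2.1-T2V-1.3B-Diffusers` checkpoint on Hugging Face; prompt and sampling values are illustrative:

```python
import torch
from diffusers import AutoencoderKLWan, WanPipeline
from diffusers.utils import export_to_video

model_id = "Wan-AI/Wan2.1-T2V-1.3B-Diffusers"
# Keep the VAE in fp32 for numerical stability; run the rest in bf16.
vae = AutoencoderKLWan.from_pretrained(model_id, subfolder="vae", torch_dtype=torch.float32)
pipe = WanPipeline.from_pretrained(model_id, vae=vae, torch_dtype=torch.bfloat16)
pipe.to("cuda")

frames = pipe(
    prompt="A red panda climbing a snowy pine tree, cinematic lighting",
    height=480,
    width=832,            # a common 480p-class size
    num_frames=81,        # typical clip length (see Key Parameters below)
    num_inference_steps=30,
    guidance_scale=5.0,   # cfg_scale in the 3-7 range noted below
).frames[0]

export_to_video(frames, "wan_t2v.mp4", fps=16)
```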

### Wan 2.1 Fun (Control / InPaint / Camera)

- Camera control with predefined or custom camera movements
- Video inpainting for targeted frame-level editing
- Depth, pose, and Canny edge control for guided generation

### Wan 2.1 VACE (Video Any-Condition Editing)

- All-in-one model for video creation and editing (ICCV 2025)
- Reference-to-video (R2V), video-to-video (V2V), and masked video-to-video editing (MV2V)
- Supports inpainting, outpainting, first-last-frame interpolation, and animate-anything
- Available in 1.3B and 14B sizes, built on the Wan 2.1 base models

### Wan 2.2 T2V / I2V / TI2V

- Mixture-of-Experts (MoE) architecture with separate high-noise and low-noise expert models
- T2V-A14B and I2V-A14B (14B MoE), plus TI2V-5B (hybrid text+image-to-video)
- Cinematic-level aesthetic control over lighting, composition, and color tone
- TI2V-5B uses a high-compression 16×16×4 VAE and runs 720p generation on consumer GPUs such as the RTX 4090 (see the latent-shape sketch below)
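
To make the 16×16×4 compression figure concrete, here is a back-of-the-envelope latent-shape calculation; the (4n + 1) temporal handling is an assumption based on the common Wan-style video VAE convention, not a quoted spec:

```python
# Rough latent shape for a VAE with 16x spatial and 4x temporal compression.
def latent_shape(height: int, width: int, num_frames: int) -> tuple[int, int, int]:
    t = (num_frames - 1) // 4 + 1           # temporal: 4x, assuming 4n + 1 frames
    return (t, height // 16, width // 16)   # spatial: 16x in each dimension

# A 121-frame 1280x704 clip (~5 s at 24 fps) collapses to a 31 x 44 x 80 latent,
# which is what lets TI2V-5B fit 720p generation on a single consumer GPU.
print(latent_shape(704, 1280, 121))  # -> (31, 44, 80)
```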

### Wan 2.2 S2V (Speech-to-Video)

- Audio-driven cinematic video generation from image + speech + text
- Supports lip sync, facial expressions, and pose-driven generation
- Generates variable-length videos matching the input audio duration (see the frame-budget sketch below)
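
Since clip length tracks the audio, the frame budget is simply duration times frame rate; a hedged helper (the 16 fps default is an assumption for illustration, not a documented S2V value):

```python
import math
import wave

def frames_for_audio(wav_path: str, fps: int = 16) -> int:
    """Frame budget so the generated clip covers the whole speech track."""
    with wave.open(wav_path, "rb") as wav:
        duration_s = wav.getnframes() / wav.getframerate()
    # Round up so the video never ends before the audio does.
    return math.ceil(duration_s * fps)
```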

### Wan 2.2 Animate

- Character animation and subject replacement from a video plus a reference image
- Animate mode: transfers motion from a reference video onto a still character image
- Replace mode: swaps the subject into the video while preserving background, lighting, and camera motion
- Includes a relighting LoRA for scene-matched lighting adaptation

### Wan Move

- Point-level motion control for image-to-video generation (NeurIPS 2025)
- Dense trajectory-based guidance for fine-grained object motion
- Latent trajectory propagation without extra motion modules
- 14B model generating 5-second 480p videos

## Key Features

- High temporal consistency and natural physics simulation
- Multiple aspect ratios (16:9, 9:16, 1:1) at 24 fps
- MoE architecture in 2.2 yields higher quality at the same per-step compute cost, since only one expert runs at each denoising step (sketched below)
- Bilingual prompt support (Chinese and English)
- ComfyUI and Diffusers integration across all variants
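
A schematic of the 2.2 expert routing: each denoising step runs exactly one expert, selected by noise level, which is why quality improves without extra per-step compute. The boundary timestep below is illustrative, not the released configuration:

```python
def pick_expert(timestep: int, boundary: int = 875) -> str:
    # The high-noise expert handles early, noisy steps (global layout, motion);
    # the low-noise expert refines detail late. Only one runs per step, so
    # per-step FLOPs match a single dense model of the same size.
    return "high_noise_expert" if timestep >= boundary else "low_noise_expert"

for t in (999, 900, 874, 400, 20):
    print(t, "->", pick_expert(t))
```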

## Hardware Requirements

- 1.3B models: 8 GB VRAM minimum
- 14B models: 24 GB+ VRAM recommended (80 GB for full precision; see the offloading sketch below)
- TI2V-5B: runs 720p generation on a consumer RTX 4090
- FP8 quantization available for lower-VRAM configurations
- Multi-GPU inference supported via FSDP + DeepSpeed Ulysses
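
For lower-VRAM setups, the standard `diffusers` memory hooks apply to the Wan pipelines; a sketch assuming the `Wan-AI/Wan2.1-T2V-14B-Diffusers` checkpoint and a recent `diffusers` release:

```python
import torch
from diffusers import WanPipeline

pipe = WanPipeline.from_pretrained(
    "Wan-AI/Wan2.1-T2V-14B-Diffusers",
    torch_dtype=torch.bfloat16,   # halves weight memory vs. fp32
)
# Keep weights in CPU RAM; move each submodule to the GPU only while it runs.
pipe.enable_model_cpu_offload()
# Tile the VAE decode to cap peak activation memory during frame decoding.
pipe.vae.enable_tiling()
```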

## Common Use Cases

- Social media and short-form video content
- Character animation and motion transfer
- Video inpainting and scene editing
- Product animation and marketing videos
- Speech-driven talking-head generation
- Storyboard-to-video conversion

## Key Parameters

- `frames`: number of output frames (typically 81, i.e. ~3.4 s at 24 fps; see the duration helper below)
- `steps`: inference steps (20-50 recommended)
- `cfg_scale`: guidance scale for prompt adherence (3-7 typical)
- `size`: output resolution (480p or 720p)
- `model_name`: selects the variant (e.g., vace-14B, ti2v-5B, s2v-14B)
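
A small helper tying `frames` to clip duration; the 4n + 1 frame constraint is carried over from the temporal VAE compression and is an assumption here, not a documented parameter rule:

```python
def clip_seconds(frames: int, fps: int = 24) -> float:
    return frames / fps

def nearest_valid_frames(target_seconds: float, fps: int = 24) -> int:
    # The temporal VAE compresses 4 frames per latent step, so frame counts
    # of the form 4n + 1 line up with the latent grid (assumption, see above).
    n = round((target_seconds * fps - 1) / 4)
    return 4 * max(n, 0) + 1

print(clip_seconds(81))           # 3.375 -> the ~3.4 s figure above
print(nearest_valid_frames(5.0))  # 121
```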

## Blog References