OmniGen2

OmniGen2 is a multimodal generation model from VectorSpaceLab with dual decoding pathways for text and image, built on the Qwen2.5-VL foundation.

Model Variants

OmniGen2

  • 3B vision-language encoder (Qwen2.5-VL) + 4B image decoder
  • Dual decoding with unshared parameters for text and image
  • Decoupled image tokenizer
  • Apache 2.0 license

OmniGen v1

  • Earlier single-pathway architecture
  • Fewer capabilities than OmniGen2
  • Superseded by OmniGen2

Key Features

  • Text-to-image generation with high fidelity and aesthetics
  • Instruction-guided image editing (state-of-the-art among open-source models)
  • In-context generation combining multiple reference inputs (humans, objects, scenes)
  • Visual understanding inherited from Qwen2.5-VL
  • CPU offload support reduces VRAM usage by nearly 50%
  • Sequential CPU offload allows inference in under 3GB of VRAM at the cost of speed (see the loading sketch after this list)
  • Supports negative prompts and configurable guidance scales
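
A minimal loading sketch for the offload modes above, assuming the released OmniGen2 weights expose a diffusers-style pipeline (the repo id, import, and dtype below are illustrative, not confirmed from the source):

```python
import torch
from diffusers import DiffusionPipeline  # assumes a diffusers-compatible OmniGen2 pipeline

# Illustrative repo id; check the VectorSpaceLab release for the exact name.
pipe = DiffusionPipeline.from_pretrained(
    "OmniGen2/OmniGen2",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)

# Default: full-GPU inference (~17GB VRAM, RTX 3090-class card).
pipe.to("cuda")

# Option 1: module-level CPU offload, cuts VRAM use by roughly half (~9GB).
# pipe.enable_model_cpu_offload()

# Option 2: sequential CPU offload, runs in under 3GB VRAM but is much slower.
# pipe.enable_sequential_cpu_offload()
```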

Hardware Requirements

  • Minimum: NVIDIA RTX 3090 or equivalent (~17GB VRAM)
  • With CPU offload: ~9GB VRAM
  • With sequential CPU offload: under 3GB VRAM (significantly slower; see the selection sketch after this list)
  • Flash Attention optional but recommended for best performance
  • CUDA 12.4+ recommended
  • Default output resolution: 1024x1024
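
These VRAM tiers translate into a simple selection rule at load time. A hedged sketch, assuming a `pipe` object loaded as in the previous example (the thresholds below just restate the figures above):

```python
import torch

def apply_offload_strategy(pipe) -> str:
    """Pick an offload mode based on free VRAM; thresholds mirror the tiers listed above."""
    if not torch.cuda.is_available():
        raise RuntimeError("OmniGen2 inference expects a CUDA GPU")

    free_bytes, _total_bytes = torch.cuda.mem_get_info()
    free_gb = free_bytes / 1024**3

    if free_gb >= 17:                        # full-GPU inference (RTX 3090-class or better)
        pipe.to("cuda")
        return "none"
    if free_gb >= 9:                         # module-level offload: ~9GB VRAM
        pipe.enable_model_cpu_offload()
        return "model"
    pipe.enable_sequential_cpu_offload()     # under 3GB VRAM, significantly slower
    return "sequential"
```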

Common Use Cases

  • Text-to-image generation
  • Instruction-based photo editing
  • Subject-driven image generation from reference photos
  • Multi-image composition and in-context editing

Key Parameters

  • text_guidance_scale: Controls adherence to text prompt (CFG)
  • image_guidance_scale: Controls similarity to reference image (1.2-2.0 for editing, 2.5-3.0 for in-context)
  • num_inference_step: Diffusion steps (default 50)
  • max_pixels: Maximum total pixel count for input images (default 1024x1024, i.e. roughly one megapixel)
  • negative_prompt: Text describing undesired qualities (e.g., "blurry, low quality, watermark")
  • scheduler: ODE solver choice (euler or dpmsolver++)
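
A hedged example tying these parameters together for an instruction-guided edit. It assumes the pipeline's call signature mirrors the parameter names above; the exact keyword names (e.g. how reference images are passed, or num_inference_step vs. num_inference_steps) may differ in the released pipeline:

```python
from PIL import Image

reference = Image.open("portrait.png")  # illustrative input image

result = pipe(
    prompt="Replace the background with a sunlit beach, keep the subject unchanged",
    input_images=[reference],          # reference image(s); kwarg name is an assumption
    text_guidance_scale=5.0,           # CFG strength toward the text prompt
    image_guidance_scale=1.8,          # 1.2-2.0 for editing, 2.5-3.0 for in-context generation
    num_inference_step=50,             # diffusion steps (default 50)
    max_pixels=1024 * 1024,            # cap on total input-image pixels
    negative_prompt="blurry, low quality, watermark",
    scheduler="euler",                 # or "dpmsolver++"
)

# Assumes a diffusers-style output object with an .images list.
result.images[0].save("edited.png")
```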