mirror of https://github.com/Comfy-Org/ComfyUI_frontend.git synced 2026-04-20 14:30:41 +00:00

dante01yoon bbd0a6b201 feat: migrate workflow template site as apps/hub

Migrate workflow_templates/site into the frontend monorepo as apps/hub
so the hub can use @comfyorg/design-system and shared packages.

Changes to existing files:
- pnpm-workspace.yaml: add @astrojs/sitemap, @astrojs/vercel, lucide-vue-next
- eslint.config.ts: add hub ignores and i18n/import rule overrides
- .oxlintrc.json: add hub scripts to ignore patterns
- knip.config.ts: add hub workspace config

apps/hub adaptations from source:
- Replace local cn() with @comfyorg/tailwind-utils (19 files)
- Integrate @comfyorg/design-system/css/base.css in global.css
- Make TEMPLATES_DIR configurable via HUB_TEMPLATES_DIR env var
- Add HUB_SKIP_SYNC flag for builds without template data
- Remove Vite 8-incompatible rollupOptions.output.manualChunks
- Fix stylelint violations (modern color notation, number precision)
- Gitignore generated content (thumbnails, synced templates, AI cache)

2026-04-06 20:53:13 +09:00

2.9 KiB

Stable Diffusion

Stable Diffusion is Stability AI's family of openly released image and video generation models, spanning architectures from U-Net latent diffusion to diffusion transformers.

Model Variants

SDXL (Stable Diffusion XL)

Stability AI's flagship text-to-image model (about 2.6B-parameter U-Net; the often-quoted 6.6B figure refers to the full base + refiner pipeline)
Native 1024x1024 resolution with flexible aspect ratios around 1MP
Two text encoders (CLIP ViT-L + OpenCLIP ViT-bigG)
Optional refiner model for second-stage detail enhancement
Turbo and Lightning distilled variants for 1-4 step generation
Largest ecosystem of LoRAs, fine-tunes, and community models
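
The "flexible aspect ratios around 1MP" point above can be sketched as a small helper that picks a width and height for a given aspect ratio while keeping the pixel count near 1024x1024. This is a minimal sketch, not part of any SDXL tooling: the function name `sdxl_resolution` is hypothetical, and snapping to multiples of 64 is an assumption based on common practice.

```python
import math


def sdxl_resolution(aspect_ratio: float,
                    target_pixels: int = 1024 * 1024,
                    multiple: int = 64) -> tuple[int, int]:
    """Return (width, height) close to target_pixels for a width/height ratio.

    Hypothetical helper: the multiple-of-64 snapping is an assumption.
    """
    # Solve width * height = target_pixels with width = aspect_ratio * height.
    height = math.sqrt(target_pixels / aspect_ratio)
    width = height * aspect_ratio

    def snap(value: float) -> int:
        # Round to the nearest allowed multiple, never below one multiple.
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)


if __name__ == "__main__":
    for label, ratio in [("1:1", 1.0), ("16:9", 16 / 9), ("2:3", 2 / 3)]:
        print(label, sdxl_resolution(ratio))
```

Because each side is rounded independently, the resulting pixel count lands near, not exactly at, 1MP.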

SD3.5 (Stable Diffusion 3.5)

Diffusion transformer (DiT) architecture, successor to SDXL
Three text encoders (CLIP ViT-L, OpenCLIP ViT-bigG, T5-XXL) for stronger prompt following
Available in Large (about 8B) and Medium (about 2.5B) parameter sizes
Improved text rendering and compositional accuracy over SDXL
4 workflow templates available

SD1.5 (Stable Diffusion 1.5)

The classic 512x512 latent diffusion model
Single CLIP ViT-L text encoder, 860M parameter U-Net
Still widely used for its massive LoRA and checkpoint ecosystem
Lower VRAM requirements make it accessible on consumer hardware
2 workflow templates available

SVD (Stable Video Diffusion)

Image-to-video generation model based on Stable Diffusion
Generates short video clips from a single image: 14 frames (SVD) or 25 frames (SVD-XT)
Companion to the image models: adds motion to a static input image rather than generating from text

Stability API Products

Reimagine: Stability's API-based image variation and transformation service

Key Features

Excellent composition, layout, and photorealism (SDXL/SD3.5)
Large open-source ecosystem with thousands of community fine-tunes
Flexible aspect ratios and multi-resolution support
Dual (SDXL) or triple (SD3.5) text encoders for nuanced prompt interpretation
Strong text rendering in SD3.5 via T5-XXL encoder

Hardware Requirements

SD1.5: 4-6GB VRAM (fp16), runs on most consumer GPUs
SDXL Base: 8GB VRAM minimum (fp16), 12GB recommended
SDXL Base + Refiner: 16GB+ VRAM
SD3.5 Medium: 8-12GB VRAM
SD3.5 Large: 16-24GB VRAM (fp16), quantized versions for 12GB cards
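
As a rough sanity check on the figures above, the memory taken by model weights alone can be estimated as parameter count times bytes per parameter. The sketch below is a back-of-the-envelope calculation under stated assumptions: it covers only the denoiser weights (not the text encoders, the VAE, or activations), it uses the approximate parameter counts quoted in this document, and the function name `weight_gib` is hypothetical.

```python
def weight_gib(params: float, bytes_per_param: int = 2) -> float:
    """Approximate weight memory in GiB; fp16 uses 2 bytes per parameter."""
    return params * bytes_per_param / 2**30


if __name__ == "__main__":
    # Approximate denoiser sizes taken from the model descriptions above.
    denoisers = {
        "SD1.5 U-Net (860M)": 860e6,
        "SD3.5 Large (roughly 8B)": 8e9,
    }
    for name, params in denoisers.items():
        print(f"{name}: about {weight_gib(params):.1f} GiB in fp16")
```

The gap between these weight-only numbers and the listed VRAM ranges is taken up by the text encoders, the VAE, and intermediate activations, which is why the recommended figures are higher than the weights alone would suggest.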

Common Use Cases

Photorealistic image generation
Artistic illustrations and concept art
Product photography and design
Character and portrait generation
LoRA-based custom style and subject training
Image-to-video with SVD

Key Parameters

steps: 20-40 for SDXL base, 15-25 for refiner, 28+ for SD3.5
cfg_scale: 5-10 (7 default for SDXL), 3.5-7 for SD3.5
sampler: DPM++ 2M Karras and Euler are popular for SDXL; Euler for SD3.5
resolution: 1024x1024 native for SDXL/SD3.5, 512x512 for SD1.5
clip_skip: Often set to 1-2; important for SD1.5 LoRA compatibility
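denoise_strength: with the SDXL refiner, the base model typically handles the first 70-80% of denoising before handing off; when the refiner runs as a separate img2img pass, a much lower denoise (around 0.2-0.3) is typical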
negative_prompt: Supported in SDXL/SD1.5; not used in SD3.5 by default
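
The step and CFG ranges listed above can be captured in a small lookup table and used to flag settings that fall outside them. This is only an illustrative sketch: the names `Range`, `SUGGESTED`, and `out_of_range` are hypothetical and do not correspond to any ComfyUI or Stability API, and the ranges are simply the ones stated in this section.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Range:
    low: float
    high: Optional[float] = None  # None means no upper bound is listed

    def contains(self, value: float) -> bool:
        return value >= self.low and (self.high is None or value <= self.high)


# Ranges copied from the parameter list above (base-model values only).
SUGGESTED = {
    "sdxl": {"steps": Range(20, 40), "cfg_scale": Range(5, 10)},
    "sd3.5": {"steps": Range(28), "cfg_scale": Range(3.5, 7)},
}


def out_of_range(model: str, **settings: float) -> list[str]:
    """Return the names of settings that fall outside the suggested range."""
    ranges = SUGGESTED[model]
    return [
        name
        for name, value in settings.items()
        if name in ranges and not ranges[name].contains(value)
    ]


if __name__ == "__main__":
    print(out_of_range("sdxl", steps=30, cfg_scale=7))
    print(out_of_range("sd3.5", steps=20, cfg_scale=9))
```

Settings flagged this way are not necessarily wrong; the ranges are starting points, and distilled variants such as Turbo or Lightning intentionally use far fewer steps.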
