- Add AssetSeeder singleton class with thread management and cancellation - Support IDLE/RUNNING/CANCELLING state machine with thread-safe access - Emit WebSocket events for scan progress (started, progress, completed, cancelled, error) - Update main.py to use non-blocking asset_seeder.start() at startup - Add shutdown() call in finally block for graceful cleanup - Update POST /api/assets/seed to return 202 Accepted, support ?wait=true - Add GET /api/assets/seed/status and POST /api/assets/seed/cancel endpoints - Update test helper to use ?wait=true for synchronous behavior - Add 17 unit tests covering state transitions, cancellation, and thread safety - Log scan configuration (models directory, input/output paths) at scan start Amp-Thread-ID: https://ampcode.com/threads/T-019c2b45-e6e8-740a-b38b-b11daea8d094 Co-authored-by: Amp <amp@ampcode.com>
8.7 KiB
Background Asset Seeder Design Document
Problem Statement
The seed_assets function in app/assets/scanner.py scans filesystem directories and imports assets into the database. Currently it runs synchronously, which causes two problems:
-
Startup blocking: When ComfyUI starts,
setup_database()callsseed_assets(["models"])synchronously. If the models directory contains thousands of files, startup is delayed significantly before the UI becomes available. -
API request blocking: The
POST /api/assets/seedendpoint runs synchronously, blocking the HTTP request until scanning completes. For large directories, this causes request timeouts and a poor user experience.
Goals
- Move asset scanning to a background thread so startup and API requests return immediately
- Provide visibility into scan progress via API and WebSocket events
- Support graceful cancellation of in-progress scans
- Maintain backward compatibility for tests that rely on synchronous behavior
- Ensure thread safety when accessing shared state
Non-Goals
- Targeted/priority scanning of specific paths (addressed separately via synchronous
scan_paths()) - Parallel scanning across multiple threads
- Persistent scan state across restarts
Architecture Overview
Component: AssetSeeder Singleton
A new AssetSeeder class in app/assets/seeder.py manages background scanning with the following responsibilities:
- Owns a single background
threading.Threadfor scanning work - Tracks scan state and progress in a thread-safe manner
- Provides cancellation support via
threading.Event - Emits WebSocket events for UI progress updates
State Machine
IDLE ──start()──► RUNNING ──(completes)──► IDLE
│
cancel()
│y
▼
CANCELLING ──(thread exits)──► IDLE
Integration Points
| Component | Change |
|---|---|
main.py |
Call asset_seeder.start() (non-blocking) instead of seed_assets() |
main.py |
Call asset_seeder.shutdown() in finally block alongside cleanup_temp() |
| API routes | New endpoints for status and cancellation |
| WebSocket | Emit progress events during scanning |
| Test helper | Use ?wait=true query param for synchronous behavior |
Tasks
Task 1: Create AssetSeeder Class
Description: Implement the core AssetSeeder singleton class with thread management, state tracking, and cancellation support.
Acceptance Criteria:
AssetSeederis a singleton accessible via module-level instanceStateenum with values:IDLE,RUNNING,CANCELLINGstart(roots)method spawns a daemon thread and returns immediatelystart()is idempotent—calling while already running is a no-op and returnsFalsecancel()method signals the thread to stop gracefullywait(timeout)method blocks until thread completes or timeout expiresget_status()returns current state and progress information- All state access is protected by
threading.Lock - Thread creates its own database sessions (no session sharing across threads)
- Progress tuple tracks:
(scanned, total, created, skipped) - Errors during scanning are captured and available via
get_status()
Task 2: Add Cancellation Checkpoints
Description: Modify the scanning logic to check for cancellation between batches, allowing graceful early termination.
Acceptance Criteria:
threading.Eventis checked between batch operations- When cancellation is requested, current batch completes before stopping
- Partial progress is committed (assets already scanned remain in database)
- State transitions to
IDLEafter cancellation completes - Cancellation is logged with partial progress statistics
Task 3: Update Startup Integration
Description: Modify main.py to use non-blocking asset seeding at startup.
Acceptance Criteria:
setup_database()callsasset_seeder.start(roots=["models"])instead ofseed_assets()- Startup proceeds immediately without waiting for scan completion
asset_seeder.shutdown()called infinallyblock alongsidecleanup_temp()--disable-assets-autoscanflag continues to skip seeding entirely- Startup logs indicate background scan was initiated (not completed)
Task 4: Create API Endpoints
Description: Add REST endpoints for triggering, monitoring, and cancelling background scans.
Endpoints:
POST /api/assets/seed
Trigger a background scan for specified roots.
Acceptance Criteria:
- Accepts
{"roots": ["models", "input", "output"]}in request body - Returns
202 Acceptedwith{"status": "started"}when scan begins - Returns
409 Conflictwith{"status": "already_running"}if scan in progress - Supports
?wait=truequery param for synchronous behavior (blocks until complete) - With
?wait=true, returns200 OKwith final statistics on completion
GET /api/assets/seed/status
Get current scan status and progress.
Acceptance Criteria:
- Returns
{"state": "IDLE|RUNNING|CANCELLING", "progress": {...}, "errors": [...]} - Progress object includes:
scanned,total,created,skipped - When idle, progress reflects last completed scan (or null if never run)
- Errors array contains messages from any failures during last/current scan
POST /api/assets/seed/cancel
Request cancellation of in-progress scan.
Acceptance Criteria:
- Returns
200 OKwith{"status": "cancelling"}if scan was running - Returns
200 OKwith{"status": "idle"}if no scan was running - Cancellation is graceful—does not corrupt database state
Task 5: Add WebSocket Progress Events
Description: Emit WebSocket events during scanning so the UI can display progress.
Acceptance Criteria:
- Event type:
assets.seed.startedwith{"roots": [...], "total": N} - Event type:
assets.seed.progresswith{"scanned": N, "total": M, "created": C} - Event type:
assets.seed.completedwith final statistics - Event type:
assets.seed.cancelledif scan was cancelled - Event type:
assets.seed.errorif scan failed with error message - Progress events emitted at reasonable intervals (not every file, ~every 100 files or 1 second)
Task 6: Update Test Helper
Description: Modify the test helper to use synchronous behavior via query parameter.
Acceptance Criteria:
trigger_sync_seed_assets()uses?wait=truequery parameter- Tests continue to pass with synchronous blocking behavior
- Remove artificial
time.sleep(0.2)delay (no longer needed withwait=true)
Task 7: Unit Tests for AssetSeeder
Description: Add unit tests covering the seeder state machine and thread safety.
Acceptance Criteria:
- Test:
start()transitions state from IDLE to RUNNING - Test:
start()while RUNNING returns False (idempotent) - Test:
cancel()transitions state from RUNNING to CANCELLING - Test:
wait()blocks until thread completes - Test:
wait(timeout)returns False if timeout expires - Test:
get_status()returns correct progress during scan - Test: Concurrent
start()calls are safe (only one thread spawned) - Test: Scan commits partial progress on cancellation
- Test: Database errors are captured in status, don't crash thread
Thread Safety Considerations
| Shared Resource | Protection Strategy |
|---|---|
_state enum |
Protected by threading.Lock |
_progress tuple |
Protected by threading.Lock |
_errors list |
Protected by threading.Lock |
_thread reference |
Protected by threading.Lock |
_cancel_event |
threading.Event (inherently thread-safe) |
| Database sessions | Created per-operation inside thread (no sharing) |
Error Handling
- Database connection failures: Log error, set state to IDLE, populate errors list
- Individual file scan failures: Log warning, continue with next file, increment error count
- Thread crashes: Caught by outer try/except, state reset to IDLE, error captured
Future Considerations
- Priority queue: If targeted scans need to be non-blocking in the future, the seeder could be extended with a priority queue
- Persistent state: Scan progress could be persisted to allow resume after restart
- Parallel scanning: Multiple threads could scan different roots concurrently (requires careful session management)
- Throttling: If scanning competes with generation (e.g., disk I/O contention when hashing large files), add configurable sleep between batches. Currently considered low risk since scanning is I/O-bound and generation is GPU-bound.