- Add is_missing column to AssetCacheState for soft-delete
- Replace hard-delete pruning with mark_cache_states_missing_outside_prefixes
- Auto-restore missing cache states when files are re-scanned
- Filter out missing cache states from queries by default
- Rename functions for clarity:
  - mark_cache_states_missing_outside_prefixes (was delete_cache_states_outside_prefixes)
  - get_unreferenced_unhashed_asset_ids (was get_orphaned_seed_asset_ids)
  - mark_assets_missing_outside_prefixes (was prune_orphaned_assets)
  - mark_missing_outside_prefixes_safely (was prune_orphans_safely)
- Add restore_cache_states_by_paths for explicit restoration
- Add cleanup_unreferenced_assets for explicit hard-delete when needed
- Update API endpoint /api/assets/prune to use new soft-delete behavior
This preserves user metadata (tags, etc.) when base directories change,
allowing assets to be restored when the original paths become available again.
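
For illustration, the soft-delete flow described above can be sketched in memory roughly as follows. This is not the actual implementation: the CacheState dataclass, the function signatures, and the visible() helper are assumptions made for the example, and the real code operates on database rows rather than Python lists.

```python
import os
from dataclasses import dataclass


@dataclass
class CacheState:
    """Hypothetical stand-in for an AssetCacheState row."""
    file_path: str
    is_missing: bool = False


def _is_under(path: str, prefixes: list[str]) -> bool:
    # A path is "inside" a prefix if it equals it or lies below it.
    ap = os.path.abspath(path)
    for prefix in prefixes:
        base = os.path.abspath(prefix)
        if ap == base or ap.startswith(base + os.sep):
            return True
    return False


def mark_missing_outside_prefixes(states: list[CacheState], prefixes: list[str]) -> int:
    """Soft-delete: flag states outside all prefixes instead of removing them."""
    marked = 0
    for state in states:
        if not state.is_missing and not _is_under(state.file_path, prefixes):
            state.is_missing = True
            marked += 1
    return marked


def restore_by_paths(states: list[CacheState], paths: list[str]) -> int:
    """Clear the flag for states whose file was seen again."""
    wanted = {os.path.abspath(p) for p in paths}
    restored = 0
    for state in states:
        if state.is_missing and os.path.abspath(state.file_path) in wanted:
            state.is_missing = False
            restored += 1
    return restored


def visible(states: list[CacheState], include_missing: bool = False) -> list[CacheState]:
    """Queries hide missing states unless explicitly asked for them."""
    return [s for s in states if include_missing or not s.is_missing]
```

Because the row is only flagged, anything attached to the asset (tags, etc.) survives until the path shows up again or an explicit cleanup removes it.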
Amp-Thread-ID: https://ampcode.com/threads/T-019c3114-bf28-73a9-a4d2-85b208fd5462
Co-authored-by: Amp <amp@ampcode.com>
Rename _sync_root_safely, _prune_orphans_safely, _collect_paths_for_roots,
_build_asset_specs, and _insert_asset_specs to remove underscore prefix
since they are used by seeder.py as part of the public API.
Amp-Thread-ID: https://ampcode.com/threads/T-019c3037-df32-7138-99d8-b4b824d896b3
Co-authored-by: Amp <amp@ampcode.com>
- Remove automatic pruning from scan loop to prevent partial scans from
  deleting assets belonging to other roots
- Add get_all_known_prefixes() helper to get prefixes for all root types
- Add prune_orphans() method to AssetSeeder for explicit pruning
- Add prune_first parameter to start() for optional pre-scan pruning
- Add POST /api/assets/prune endpoint for explicit pruning via API
- Update main.py startup to use prune_first=True for full startup scans
- Add tests for new prune_orphans functionality
Fixes issue where a models-only scan would delete all input/output assets.
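
A rough sketch of the bug and the fix, for illustration only. SeederSketch, prune_paths, the root-to-prefix mapping, and the example paths are all hypothetical; only the names get_all_known_prefixes, prune_orphans, start, and prune_first come from this change, and their real signatures may differ.

```python
def get_all_known_prefixes(root_prefixes: dict[str, list[str]]) -> list[str]:
    """Collect the prefixes of every root type, not only the scanned ones."""
    return [p for prefixes in root_prefixes.values() for p in prefixes]


def prune_paths(known_paths: set[str], prefixes: list[str]) -> set[str]:
    """Return the paths that survive pruning: those under at least one prefix."""
    def under(path: str) -> bool:
        return any(
            path == p or path.startswith(p.rstrip("/") + "/") for p in prefixes
        )
    return {p for p in known_paths if under(p)}


class SeederSketch:
    """Hypothetical stand-in for AssetSeeder, tracking paths in a set."""

    def __init__(self, root_prefixes: dict[str, list[str]], known_paths: set[str]):
        self.root_prefixes = root_prefixes
        self.known_paths = set(known_paths)

    def prune_orphans(self) -> int:
        # Explicit pruning always uses the prefixes of ALL roots.
        before = len(self.known_paths)
        self.known_paths = prune_paths(
            self.known_paths, get_all_known_prefixes(self.root_prefixes)
        )
        return before - len(self.known_paths)

    def start(self, roots: list[str], prune_first: bool = False) -> list[str]:
        if prune_first:
            self.prune_orphans()
        # Scanning of the requested roots would happen here; it never prunes.
        return sorted(roots)
```

Pruning with only the scanned root's prefixes, as in prune_paths(paths, root_prefixes["models"]), drops every input/output path, which is the failure this change avoids.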
Amp-Thread-ID: https://ampcode.com/threads/T-019c2ba0-e004-7229-81bf-452b2f7f57a1
Co-authored-by: Amp <amp@ampcode.com>
- Create file_utils.py with shared file utilities:
  - get_mtime_ns() - extract mtime in nanoseconds from stat
  - get_size_and_mtime_ns() - get both size and mtime
  - verify_file_unchanged() - check file matches DB mtime/size
  - list_files_recursively() - recursive directory listing
- Create bulk_ingest.py for bulk operations:
  - BulkInsertResult dataclass
  - batch_insert_seed_assets() - batch insert with conflict handling
  - prune_orphaned_assets() - clean up orphaned assets
- Update scanner.py to use new service modules instead of
  calling database queries directly
- Update ingest.py to use shared get_size_and_mtime_ns()
- Export new functions from services/__init__.py
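
As a rough idea of what the file_utils.py helpers might look like (the function names come from the list above; the parameters, return types, and error handling are assumptions, not the actual code):

```python
import os


def get_mtime_ns(st: os.stat_result) -> int:
    """Extract mtime in nanoseconds, falling back to the float field."""
    return getattr(st, "st_mtime_ns", int(st.st_mtime * 1_000_000_000))


def get_size_and_mtime_ns(path: str, follow_symlinks: bool = True) -> tuple[int, int]:
    """Stat the file once and return (size in bytes, mtime in nanoseconds)."""
    st = os.stat(path, follow_symlinks=follow_symlinks)
    return st.st_size, get_mtime_ns(st)


def verify_file_unchanged(path: str, db_mtime_ns: int, db_size: int) -> bool:
    """True if the file on disk still matches the size/mtime stored in the DB."""
    try:
        size, mtime_ns = get_size_and_mtime_ns(path)
    except OSError:
        # Assumption: a file that cannot be stat'ed counts as changed.
        return False
    return size == db_size and mtime_ns == db_mtime_ns


def list_files_recursively(base_dir: str) -> list[str]:
    """Return the absolute paths of all files below base_dir, sorted."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(base_dir):
        for name in filenames:
            found.append(os.path.abspath(os.path.join(dirpath, name)))
    return sorted(found)
```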
Amp-Thread-ID: https://ampcode.com/threads/T-019c2ae7-f701-716a-a0dd-1feb988732fb
Co-authored-by: Amp <amp@ampcode.com>
Scanner is used externally by main.py and server.py for startup/maintenance,
not as part of the regular service layer. Moving it to app/assets/scanner.py
makes the public API clearer.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* Brought over minimal elements from PR 10045 to reproduce seed_assets and register_assets_system without adding anything to the DB or server routes yet. Everything is sync for now; async can be introduced once everything is cleaned up and brought over
* Added DB script to insert asset-related data, cleaned up some code; assets (models) now get added/rescanned
* Added support for 5 HTTP endpoints for assets
* Replaced Optional with | None in schemas_in.py and schemas_out.py
* Remove two routes that will not be relevant yet in this PR: HEAD /api/assets/hash/<hash> and PUT /api/assets/<id>/preview
* Remove some functions the two deleted endpoints were using
* Don't show assets scan message upon calling /object_info endpoint
* Removed unused import to satisfy ruff
* Simplified hashing function type hint and _hash_file_obj
* Satisfied ruff