Commit Graph

6 Commits

Author SHA1 Message Date
Luke Mino-Altherr
6708c02446 Fix is_missing state updates for asset cache states on startup
- Add bulk_update_is_missing() to efficiently update is_missing flag
- Update sync_cache_states_with_filesystem() to mark non-existent files as is_missing=True
- Call restore_cache_states_by_paths() in batch_insert_seed_assets() to restore
  previously-missing states when files reappear during scanning

Amp-Thread-ID: https://ampcode.com/threads/T-019c3177-e591-7666-ac6b-7e05c71c8ebf
Co-authored-by: Amp <amp@ampcode.com>
2026-02-06 10:09:19 -08:00
Luke Mino-Altherr
32011c403b refactor(bulk_ingest): improve variable naming and add typed dicts
- Rename shorthand variables to explicit names (sp -> spec, aid -> asset_id, etc.)
- Move imports to top of file
- Add TypedDict definitions for AssetRow, CacheStateRow, AssetInfoRow, TagRow, MetadataRow
- Replace bare dict types with typed alternatives

Amp-Thread-ID: https://ampcode.com/threads/T-019c316d-13f7-77f8-b92b-ea7276c3e09c
Co-authored-by: Amp <amp@ampcode.com>
2026-02-05 21:32:01 -08:00
Luke Mino-Altherr
9222ff6d81 feat: non-destructive asset pruning with is_missing flag
- Add is_missing column to AssetCacheState for soft-delete
- Replace hard-delete pruning with mark_cache_states_missing_outside_prefixes
- Auto-restore missing cache states when files are re-scanned
- Filter out missing cache states from queries by default
- Rename functions for clarity:
  - mark_cache_states_missing_outside_prefixes (was delete_cache_states_outside_prefixes)
  - get_unreferenced_unhashed_asset_ids (was get_orphaned_seed_asset_ids)
  - mark_assets_missing_outside_prefixes (was prune_orphaned_assets)
  - mark_missing_outside_prefixes_safely (was prune_orphans_safely)
- Add restore_cache_states_by_paths for explicit restoration
- Add cleanup_unreferenced_assets for explicit hard-delete when needed
- Update API endpoint /api/assets/prune to use new soft-delete behavior

This preserves user metadata (tags, etc.) when base directories change,
allowing assets to be restored when the original paths become available again.

Amp-Thread-ID: https://ampcode.com/threads/T-019c3114-bf28-73a9-a4d2-85b208fd5462
Co-authored-by: Amp <amp@ampcode.com>
2026-02-05 21:21:46 -08:00
Luke Mino-Altherr
28c4b58dd6 Make ingest_file_from_path and register_existing_asset private
Amp-Thread-ID: https://ampcode.com/threads/T-019c2fe5-a3de-71cc-a6e5-67fe944a101e
Co-authored-by: Amp <amp@ampcode.com>
2026-02-05 14:26:36 -08:00
Luke Mino-Altherr
fa3749ced7 Add TypedDict types to scanner and bulk_ingest
Amp-Thread-ID: https://ampcode.com/threads/T-019c2af9-4d41-73e9-b38d-78d06bc28a3f
Co-authored-by: Amp <amp@ampcode.com>
2026-02-04 15:32:26 -08:00
Luke Mino-Altherr
abeec3072b refactor(assets): extract scanner logic into service modules
- Create file_utils.py with shared file utilities:
  - get_mtime_ns() - extract mtime in nanoseconds from stat
  - get_size_and_mtime_ns() - get both size and mtime
  - verify_file_unchanged() - check file matches DB mtime/size
  - list_files_recursively() - recursive directory listing

- Create bulk_ingest.py for bulk operations:
  - BulkInsertResult dataclass
  - batch_insert_seed_assets() - batch insert with conflict handling
  - prune_orphaned_assets() - clean up orphaned assets

- Update scanner.py to use new service modules instead of
  calling database queries directly

- Update ingest.py to use shared get_size_and_mtime_ns()

- Export new functions from services/__init__.py

Amp-Thread-ID: https://ampcode.com/threads/T-019c2ae7-f701-716a-a0dd-1feb988732fb
Co-authored-by: Amp <amp@ampcode.com>
2026-02-04 15:17:31 -08:00