- Add is_missing column to AssetCacheState for soft-delete
- Replace hard-delete pruning with mark_cache_states_missing_outside_prefixes
- Auto-restore missing cache states when files are re-scanned
- Filter out missing cache states from queries by default
- Rename functions for clarity:
- mark_cache_states_missing_outside_prefixes (was delete_cache_states_outside_prefixes)
- get_unreferenced_unhashed_asset_ids (was get_orphaned_seed_asset_ids)
- mark_assets_missing_outside_prefixes (was prune_orphaned_assets)
- mark_missing_outside_prefixes_safely (was prune_orphans_safely)
- Add restore_cache_states_by_paths for explicit restoration
- Add cleanup_unreferenced_assets for explicit hard-delete when needed
- Update API endpoint /api/assets/prune to use new soft-delete behavior
This preserves user metadata (tags, etc.) when base directories change,
allowing assets to be restored when the original paths become available again.
Amp-Thread-ID: https://ampcode.com/threads/T-019c3114-bf28-73a9-a4d2-85b208fd5462
Co-authored-by: Amp <amp@ampcode.com>
Rename _sync_root_safely, _prune_orphans_safely, _collect_paths_for_roots,
_build_asset_specs, and _insert_asset_specs to remove underscore prefix
since they are used by seeder.py as part of the public API.
Amp-Thread-ID: https://ampcode.com/threads/T-019c3037-df32-7138-99d8-b4b824d896b3
Co-authored-by: Amp <amp@ampcode.com>
- Remove automatic pruning from scan loop to prevent partial scans from
deleting assets belonging to other roots
- Add get_all_known_prefixes() helper to get prefixes for all root types
- Add prune_orphans() method to AssetSeeder for explicit pruning
- Add prune_first parameter to start() for optional pre-scan pruning
- Add POST /api/assets/prune endpoint for explicit pruning via API
- Update main.py startup to use prune_first=True for full startup scans
- Add tests for new prune_orphans functionality
Fixes issue where a models-only scan would delete all input/output assets.
Amp-Thread-ID: https://ampcode.com/threads/T-019c2ba0-e004-7229-81bf-452b2f7f57a1
Co-authored-by: Amp <amp@ampcode.com>
- Add AssetSeeder singleton class with thread management and cancellation
- Support IDLE/RUNNING/CANCELLING state machine with thread-safe access
- Emit WebSocket events for scan progress (started, progress, completed, cancelled, error)
- Update main.py to use non-blocking asset_seeder.start() at startup
- Add shutdown() call in finally block for graceful cleanup
- Update POST /api/assets/seed to return 202 Accepted, support ?wait=true
- Add GET /api/assets/seed/status and POST /api/assets/seed/cancel endpoints
- Update test helper to use ?wait=true for synchronous behavior
- Add 17 unit tests covering state transitions, cancellation, and thread safety
- Log scan configuration (models directory, input/output paths) at scan start
Amp-Thread-ID: https://ampcode.com/threads/T-019c2b45-e6e8-740a-b38b-b11daea8d094
Co-authored-by: Amp <amp@ampcode.com>
- Create file_utils.py with shared file utilities:
- get_mtime_ns() - extract mtime in nanoseconds from stat
- get_size_and_mtime_ns() - get both size and mtime
- verify_file_unchanged() - check file matches DB mtime/size
- list_files_recursively() - recursive directory listing
- Create bulk_ingest.py for bulk operations:
- BulkInsertResult dataclass
- batch_insert_seed_assets() - batch insert with conflict handling
- prune_orphaned_assets() - clean up orphaned assets
- Update scanner.py to use new service modules instead of
calling database queries directly
- Update ingest.py to use shared get_size_and_mtime_ns()
- Export new functions from services/__init__.py
Amp-Thread-ID: https://ampcode.com/threads/T-019c2ae7-f701-716a-a0dd-1feb988732fb
Co-authored-by: Amp <amp@ampcode.com>
- Delete app/assets/manager.py
- Move upload logic (upload_from_temp_path, create_from_hash) to ingest service
- Add HashMismatchError and DependencyMissingError to ingest service
- Add UploadResult schema for upload responses
- Update routes.py to import services directly and do schema conversion inline
- Add asset lookup/listing service functions to asset_management.py
Routes now call the service layer directly, removing an unnecessary
layer of indirection. The manager was only converting between service
dataclasses and Pydantic response schemas.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add typed result dataclasses: IngestResult, AddTagsResult,
RemoveTagsResult, SetTagsResult, TagUsage
- Add UserMetadata type alias for user_metadata parameters
- Type helper functions with Session parameters
- Use TypedDicts at query layer to avoid circular imports
- Update manager.py and tests to use attribute access
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Scanner is used externally by main.py and server.py for startup/maintenance,
not as part of the regular service layer. Moving it to app/assets/scanner.py
makes the public API clearer.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- test_metadata.py: consolidate 7 filter type classes into parametrized tests
- test_asset.py: parametrize exists, get, and upsert test cases
- test_cache_state.py: parametrize upsert and delete scenarios
- test_crud.py: consolidate error response tests into single parametrized test
- test_list_filter.py: consolidate invalid query tests
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The helper functions are already imported directly from helpers.py
by all test files, so the backwards compatibility re-export is dead code.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add update_asset_info_name and update_asset_info_updated_at query functions
and update asset_management.py to use them instead of modifying ORM objects
directly. This ensures the service layer only uses explicit operations from
the queries package.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace dict/ORM object returns with explicit dataclasses to fix
DetachedInstanceError when accessing ORM attributes after session closes.
- Add app/assets/services/schemas.py with AssetData, AssetInfoData,
AssetDetailResult, and RegisterAssetResult dataclasses
- Update asset_management.py and ingest.py to return dataclasses
- Update manager.py to use attribute access on dataclasses
- Fix created_new to be False in create_asset_from_hash (content exists)
- Add DependencyMissingError for better blake3 missing error handling
- Update tests to use attribute access instead of dict subscripting
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Extract helper functions from conftest.py to a dedicated helpers.py module
to fix import resolution issues when pytest processes subdirectories.
Rename test_tags.py to test_tags_api.py to avoid module name collision
with queries/test_tags.py.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Make blake3 an optional import that fails gracefully at import time,
with a clear error message when hashing functions are actually called.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Extract focused helper functions to eliminate the try-finally block that
wrapped ~50 lines just for logging. The new helpers (_collect_paths_for_roots,
_build_asset_specs, _insert_asset_specs) make seed_assets a simple linear flow.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Extract helper functions to eliminate nested try-except blocks in scanner.py
and remove duplicated type-checking logic in asset_info.py. Simplify nested
conditionals in asset_management.py for clearer control flow.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Remove unused os imports in conftest.py and test_ingest.py
- Remove unused Tag import in test_asset_management.py
- Remove unused ensure_tags_exist import in test_ingest.py
- Fix unused info2 variable in test_asset_management.py
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Architecture changes:
- API Routes -> manager.py (thin adapter) -> services/ (business logic) -> queries/ (atomic DB ops)
- Services own session lifecycle via create_session()
- Queries accept Session as parameter, do single-table atomic operations
New app/assets/services/ layer:
- __init__.py - exports all service functions
- ingest.py - ingest_file_from_path(), register_existing_asset()
- asset_management.py - get_asset_detail(), update_asset_metadata(), delete_asset_reference(), set_asset_preview()
- tagging.py - apply_tags(), remove_tags(), list_tags()
Removed from queries/asset_info.py:
- ingest_fs_asset (moved to services/ingest.py as ingest_file_from_path)
- update_asset_info_full (moved to services/asset_management.py as update_asset_metadata)
- create_asset_info_for_existing_asset (moved to services/ingest.py as register_existing_asset)
Updated manager.py:
- Now a thin adapter that transforms API schemas to/from service calls
- Delegates all business logic to services layer
- No longer imports sqlalchemy.orm.Session or models directly
Test updates:
- Fixed test_cache_state.py import of pick_best_live_path (moved to helpers.py)
- Added comprehensive service layer tests (41 new tests)
- All 112 query + service tests pass
Amp-Thread-ID: https://ampcode.com/threads/T-019c24e2-7ae4-707f-ad19-c775ed8b82b5
Co-authored-by: Amp <amp@ampcode.com>
* model_management: lazy-cache aimdo_tensor
These tensors cosntructed from aimdo-allocations are CPU expensive to
make on the pytorch side. Add a cache version that will be valid with
signature match to fast path past whatever torch is doing.
* dynamic_vram: Minimize fast path CPU work
Move as much as possible inside the not resident if block and cache
the formed weight and bias rather than the flat intermediates. In
extreme layer weight rates this adds up.
* Fix bypass dtype/device moving
* Force offloading mode for training
* training context var
* offloading implementation in training node
* fix wrong input type
* Support bypass load lora model, correct adapter/offloading handling