Luke Mino-Altherr
8ff4d38ad1
refactor(assets): merge AssetInfo and AssetCacheState into AssetReference
...
This change solves the basename collision bug by using UNIQUE(file_path) on the
unified asset_references table. Key changes:
Database:
- Migration 0005 merges asset_cache_states and asset_infos into asset_references
- AssetReference now contains: cache state fields (file_path, mtime_ns, needs_verify,
  is_missing, enrichment_level) plus info fields (name, owner_id, preview_id, etc.)
- AssetReferenceMeta replaces AssetInfoMeta
- AssetReferenceTag replaces AssetInfoTag
- UNIQUE constraint on file_path prevents duplicate entries for same file
Code:
- New unified query module: asset_reference.py (replaces asset_info.py, cache_state.py)
- Updated scanner, seeder, and services to use AssetReference
- Updated API routes to use reference_id instead of asset_info_id
Tests:
- All 175 unit tests updated and passing
- Integration tests require server environment (not run here)
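A minimal sketch of why UNIQUE(file_path) avoids the basename collision, using an in-memory SQLite table. The column subset and the insert_reference helper are illustrative assumptions, not the schema created by migration 0005.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed, reduced column set; the real asset_references table has more fields.
conn.execute(
    """
    CREATE TABLE asset_references (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        file_path TEXT UNIQUE,
        mtime_ns INTEGER
    )
    """
)


def insert_reference(name, file_path, mtime_ns):
    """Hypothetical helper: insert one row, return False if file_path exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO asset_references (name, file_path, mtime_ns) "
                "VALUES (?, ?, ?)",
                (name, file_path, mtime_ns),
            )
        return True
    except sqlite3.IntegrityError:
        return False


# Same basename in two directories: both rows are accepted.
first = insert_reference("model.safetensors", "/models/a/model.safetensors", 1)
second = insert_reference("model.safetensors", "/models/b/model.safetensors", 2)
# Same full path again: rejected by the UNIQUE constraint.
duplicate = insert_reference("model.safetensors", "/models/a/model.safetensors", 3)
row_count = conn.execute("SELECT COUNT(*) FROM asset_references").fetchone()[0]
```

Keying uniqueness on the full path rather than the name lets two files share a basename while still blocking a second row for the same file.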
Amp-Thread-ID: https://ampcode.com/threads/T-019c4fe8-9dcb-75ce-bea8-ea786343a581
Co-authored-by: Amp <amp@ampcode.com>
2026-02-11 20:03:10 -08:00
Luke Mino-Altherr
7519a556df
Add optional blake3 hashing during asset scanning
...
- Make blake3 import lazy in hashing.py (only imported when needed)
- Add compute_hashes parameter to AssetSeeder.start(), build_asset_specs(), and seed_assets()
- Fix missing tag clearing: include is_missing states in sync when update_missing_tags=True
- Clear is_missing flag on cache states when files are restored with matching mtime/size
- Fix validation error serialization in routes.py (use json.loads(ve.json()))
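The lazy import and the compute_hashes switch could look roughly like the sketch below. hash_file and build_asset_spec are hypothetical stand-ins (the real hashing.py and build_asset_specs() signatures are not shown here), and the dict layout is assumed.

```python
def hash_file(path):
    """Hypothetical helper: hash a file with blake3, importing it on demand."""
    # Deferred import: the blake3 package is only required when a hash
    # is actually computed, not when this module is loaded.
    import blake3

    hasher = blake3.blake3()
    with open(path, "rb") as f:
        # Read in 1 MiB blocks so large model files are not loaded whole.
        for block in iter(lambda: f.read(1 << 20), b""):
            hasher.update(block)
    return hasher.hexdigest()


def build_asset_spec(path, compute_hashes=False):
    """Hypothetical single-file variant of build_asset_specs()."""
    return {
        "file_path": path,
        "hash": hash_file(path) if compute_hashes else None,
    }


# With the default, no file is read and blake3 is never imported.
spec = build_asset_spec("/models/example.safetensors")
```

With compute_hashes left at False, scanning works on installs that lack the blake3 package.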
Amp-Thread-ID: https://ampcode.com/threads/T-019c3614-56d4-74a8-a717-19922d6dbbee
Co-authored-by: Amp <amp@ampcode.com>
2026-02-11 17:41:38 -08:00
Luke Mino-Altherr
e2b8200a29
Fix type annotation: use Callable[[str], bool] instead of callable
...
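For illustration, the difference matters because the builtin callable is a function, not a type, so type checkers cannot verify arguments or return values against it. The function and parameter names below are hypothetical, not the ones touched by this commit.

```python
from typing import Callable


def filter_paths(paths: list[str], predicate: Callable[[str], bool]) -> list[str]:
    """Keep the paths for which predicate(path) is true.

    Callable[[str], bool] states the expected signature: one str argument,
    bool result. Annotating with the builtin `callable` would not.
    """
    return [p for p in paths if predicate(p)]


def is_safetensors(path: str) -> bool:
    return path.endswith(".safetensors")


kept = filter_paths(["a.safetensors", "b.txt"], is_safetensors)
```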
Amp-Thread-ID: https://ampcode.com/threads/T-019c354d-d627-7233-864d-1e6f7a4b8caa
Co-authored-by: Amp <amp@ampcode.com>
2026-02-11 17:41:38 -08:00
Luke Mino-Altherr
24e07008a1
Consolidate duplicate delete_temp_file_if_exists function
...
- Remove duplicate from routes.py
- Import from upload.py instead
- Rename to public API (remove leading underscore)
Amp-Thread-ID: https://ampcode.com/threads/T-019c3549-c245-7628-950c-dd6826185394
Co-authored-by: Amp <amp@ampcode.com>
2026-02-11 17:41:38 -08:00
Luke Mino-Altherr
8c4eb9a659
refactor(assets): consolidate duplicated query utilities and remove unused code
...
- Extract shared helpers to database/queries/common.py:
  - MAX_BIND_PARAMS, calculate_rows_per_statement, iter_chunks, iter_row_chunks
  - build_visible_owner_clause
- Remove duplicate _compute_filename_for_asset, consolidate in path_utils.py
- Remove unused get_asset_info_with_tags (duplicated get_asset_detail)
- Remove redundant __all__ from cache_state.py
- Make internal helpers private (_check_is_scalar)
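A rough sketch of what chunking helpers like these typically do: split bulk inserts so no single statement exceeds the database's bind-parameter limit. The helper names come from the list above, but the bodies, signatures, and the MAX_BIND_PARAMS value are assumptions.

```python
from typing import Iterator, Sequence

MAX_BIND_PARAMS = 800  # assumed value; the real constant may differ


def calculate_rows_per_statement(columns_per_row: int) -> int:
    """How many rows fit in one INSERT without exceeding MAX_BIND_PARAMS."""
    return max(1, MAX_BIND_PARAMS // columns_per_row)


def iter_chunks(items: Sequence, size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def iter_row_chunks(rows: Sequence[dict], columns_per_row: int) -> Iterator[Sequence[dict]]:
    """Yield row batches sized so each batch stays under the bind limit."""
    yield from iter_chunks(rows, calculate_rows_per_statement(columns_per_row))


rows = [{"id": i, "file_path": f"/p/{i}", "mtime_ns": i} for i in range(1000)]
chunk_sizes = [len(chunk) for chunk in iter_row_chunks(rows, columns_per_row=3)]
```

Each batch binds at most 266 rows of 3 columns, which is 798 parameters and under the assumed limit of 800.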
Amp-Thread-ID: https://ampcode.com/threads/T-019c2ad9-9432-7451-94a8-79287dbbb19e
Co-authored-by: Amp <amp@ampcode.com>
2026-02-11 17:41:38 -08:00
Luke Mino-Altherr
4695694263
chore: remove obvious/self-documenting comments from assets package
...
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-11 17:41:38 -08:00
Luke Mino-Altherr
a6a8d3ad74
chore: remove module-level comments and docstrings from assets package
...
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-11 17:41:38 -08:00
Luke Mino-Altherr
915d21afcb
refactor: improve function naming for clarity and consistency
...
Rename functions to use clearer verb-based names:
- pick_best_live_path → select_best_live_path
- escape_like_prefix → escape_sql_like_string
- list_tree → list_files_recursively
- check_asset_file_fast → verify_asset_file_unchanged
- _seed_from_paths_batch → _batch_insert_assets_from_paths
- reconcile_cache_states_for_root → sync_cache_states_with_filesystem
- touch_asset_info_by_id → update_asset_info_access_time
- replace_asset_info_metadata_projection → set_asset_info_metadata
- expand_metadata_to_rows → convert_metadata_to_rows
- _rows_per_stmt → _calculate_rows_per_statement
- ensure_within_base → validate_path_within_base
- _cleanup_temp → _delete_temp_file_if_exists
- validate_hash_format → normalize_and_validate_hash
- get_relative_to_root_category_path_of_asset → get_asset_category_and_relative_path
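As an example of what one of the renamed helpers might do, here is a sketch of escape_sql_like_string. Only the name comes from the list above; the signature, the "!" escape character, and the table used to exercise it are assumptions.

```python
import sqlite3


def escape_sql_like_string(value: str, escape: str = "!") -> str:
    """Escape LIKE wildcards so `value` is matched literally.

    The escape character itself is doubled first, so that escapes added
    for % and _ are not escaped again.
    """
    value = value.replace(escape, escape + escape)
    value = value.replace("%", escape + "%")
    return value.replace("_", escape + "_")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (path TEXT)")
conn.executemany(
    "INSERT INTO files VALUES (?)",
    [("/m/a_b/x.bin",), ("/m/aXb/y.bin",)],
)

# Without escaping, the "_" in the prefix would also match "/m/aXb/".
pattern = escape_sql_like_string("/m/a_b/") + "%"
matches = [
    row[0]
    for row in conn.execute(
        "SELECT path FROM files WHERE path LIKE ? ESCAPE '!'", (pattern,)
    )
]
```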
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-11 17:41:37 -08:00
Luke Mino-Altherr
c546e9315b
fix: ruff linting errors and add comprehensive test coverage for asset queries
...
- Fix unused imports in routes.py, asset.py, manager.py, asset_management.py, ingest.py
- Fix whitespace issues in upload.py, asset_info.py, ingest.py
- Fix typo in manager.py (stray character after result["asset"])
- Fix broken import in test_metadata.py (project_kv moved to asset_info.py)
- Add fixture override in queries/conftest.py for unit test isolation
Add 48 new tests covering all previously untested query functions:
- asset.py: upsert_asset, bulk_insert_assets
- cache_state.py: upsert_cache_state, delete_cache_states_outside_prefixes,
  get_orphaned_seed_asset_ids, delete_assets_by_ids, get_cache_states_for_prefixes,
  bulk_set_needs_verify, delete_cache_states_by_ids, delete_orphaned_seed_asset,
  bulk_insert_cache_states_ignore_conflicts, get_cache_states_by_paths_and_asset_ids
- asset_info.py: insert_asset_info, get_or_create_asset_info,
  update_asset_info_timestamps, replace_asset_info_metadata_projection,
  bulk_insert_asset_infos_ignore_conflicts, get_asset_info_ids_by_ids
- tags.py: bulk_insert_tags_and_meta
Total: 119 tests pass (up from 71)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-11 17:41:37 -08:00
Luke Mino-Altherr
7c854e5ca0
refactor: extract multipart upload parsing from routes
...
- Add app/assets/api/upload.py with parse_multipart_upload() for HTTP parsing
- Add ParsedUpload dataclass to schemas_in.py
- Add domain exceptions (AssetValidationError, AssetNotFoundError, HashMismatchError)
- Add manager.process_upload() with domain exceptions (no HTTP status codes)
- Routes map domain exceptions to HTTP responses
- Slim down upload_asset route to ~20 lines (was ~150)
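The split between domain exceptions and HTTP mapping might be sketched as follows, without any web framework. The exception names come from the list above; the status codes, the dict-based payload, and both function bodies are assumptions.

```python
class AssetValidationError(Exception):
    pass


class AssetNotFoundError(Exception):
    pass


class HashMismatchError(Exception):
    pass


# Assumed mapping; the real routes may choose different status codes.
STATUS_BY_EXCEPTION = {
    AssetValidationError: 400,
    AssetNotFoundError: 404,
    HashMismatchError: 400,
}


def process_upload(parsed: dict) -> dict:
    """Domain layer stand-in: raises domain errors, knows nothing about HTTP."""
    if not parsed.get("name"):
        raise AssetValidationError("missing name")
    if parsed.get("expected_hash") not in (None, parsed.get("actual_hash")):
        raise HashMismatchError("uploaded content does not match expected hash")
    return {"name": parsed["name"]}


def upload_route(parsed: dict) -> tuple[int, dict]:
    """Route layer stand-in: the only place that picks a status code."""
    try:
        return 201, process_upload(parsed)
    except tuple(STATUS_BY_EXCEPTION) as exc:
        return STATUS_BY_EXCEPTION[type(exc)], {"error": str(exc)}


ok = upload_route({"name": "a.bin"})
bad = upload_route({})
mismatch = upload_route({"name": "a.bin", "expected_hash": "x", "actual_hash": "y"})
```

Keeping status codes out of the manager means the same process_upload() can be reused from non-HTTP callers.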
Amp-Thread-ID: https://ampcode.com/threads/T-019c2519-abe1-738a-ad2e-29ece17c0e42
Co-authored-by: Amp <amp@ampcode.com>
2026-02-11 17:41:37 -08:00