fix: address code review feedback - round 2

- Reject path separators (/, \, os.sep) in tag components for defense-in-depth
- Add comment explaining double-relpath normalization trick
- Add _require_assets_feature_enabled decorator returning 503 when disabled
- Call asset_seeder.disable() when --enable-assets is not passed
- Add iter_chunks to bulk_update_needs_verify, bulk_update_is_missing,
  and delete_references_by_ids to respect SQLite bind param limits
- Fix CacheStateRow.size_bytes NULL coercion (0 -> None) to avoid
  false needs_verify flags on assets with unknown size
- Add PermissionError catch in delete_asset_tags route (403 vs 500)
- Add hash-is-None guard in delete_orphaned_seed_asset
- Validate from_asset_id in reassign_asset_references
- Initialize _prune_first in __init__, remove getattr workaround
- Cap error accumulation in _add_error to 200
- Remove confirmed dead code: seed_assets, compute_filename_for_asset,
  ALLOWED_ROOTS, AssetNotFoundError, SetTagsResult, update_enrichment_level,
  Asset.to_dict, AssetReference.to_dict, _AssetSeeder.enable

Amp-Thread-ID: https://ampcode.com/threads/T-019cb610-1b55-74b6-8dbb-381d73c387c0
Co-authored-by: Amp <amp@ampcode.com>
This commit is contained in:
Luke Mino-Altherr
2026-03-03 16:35:55 -08:00
parent 6294e3f44a
commit e136d2691a
12 changed files with 50 additions and 136 deletions

View File

@@ -360,50 +360,6 @@ def insert_asset_specs(specs: list[SeedAssetSpec], tag_pool: set[str]) -> int:
return result.inserted_refs
def seed_assets(
roots: tuple[RootType, ...],
enable_logging: bool = False,
compute_hashes: bool = False,
) -> None:
"""Scan the given roots and seed the assets into the database.
Args:
roots: Tuple of root types to scan (models, input, output)
enable_logging: If True, log progress and completion messages
compute_hashes: If True, compute blake3 hashes (slow for large files)
Note: This function does not mark missing assets.
Call mark_missing_outside_prefixes_safely separately if cleanup is needed.
"""
if not dependencies_available():
if enable_logging:
logging.warning("Database dependencies not available, skipping assets scan")
return
t_start = time.perf_counter()
existing_paths: set[str] = set()
for r in roots:
existing_paths.update(sync_root_safely(r))
paths = collect_paths_for_roots(roots)
specs, tag_pool, skipped_existing = build_asset_specs(
paths, existing_paths, compute_hashes=compute_hashes
)
created = insert_asset_specs(specs, tag_pool)
if enable_logging:
logging.info(
"Assets scan(roots=%s) completed in %.3fs "
"(created=%d, skipped_existing=%d, total_seen=%d)",
roots,
time.perf_counter() - t_start,
created,
skipped_existing,
len(paths),
)
# Enrichment level constants
ENRICHMENT_STUB = 0 # Fast scan: path, size, mtime only
ENRICHMENT_METADATA = 1 # Metadata extracted (safetensors header, mime type)