Caught during internal review. `http://[::ffff:127.0.0.1]/` bypassed
validate_webhook_url because getaddrinfo returns ::ffff:7f00:1, which
matches neither the IPv4 blocklist (127.0.0.0/8) nor the IPv6 blocklist
(::1/128).
Fix: added _expand_ip_candidates() helper that unwraps IPv4 from
IPv4-mapped (::ffff:X.Y.Z.W, via .ipv4_mapped) and IPv4-compatible
(::X.Y.Z.W, via the low 32 bits) IPv6 addresses. The blocklist now
checks both the original IP and the unwrapped IPv4 form.
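
The unwrapping step can be sketched with the standard `ipaddress` module. This is a minimal illustration, not the code in this commit: `expand_ip_candidates`, `is_blocked`, and the `BLOCKED_NETWORKS` list are hypothetical names, and the real blocklist presumably covers more ranges.

```python
import ipaddress

# Illustrative blocklist only (assumption); the project's list may differ.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("127.0.0.0/8"),     # IPv4 loopback
    ipaddress.ip_network("10.0.0.0/8"),      # RFC 1918
    ipaddress.ip_network("172.16.0.0/12"),   # RFC 1918
    ipaddress.ip_network("192.168.0.0/16"),  # RFC 1918
    ipaddress.ip_network("169.254.0.0/16"),  # link-local (cloud metadata)
    ipaddress.ip_network("::1/128"),         # IPv6 loopback
]


def expand_ip_candidates(ip):
    """Return the address itself plus any IPv4 address embedded in it."""
    candidates = [ip]
    if isinstance(ip, ipaddress.IPv6Address):
        if ip.ipv4_mapped is not None:
            # ::ffff:X.Y.Z.W
            candidates.append(ip.ipv4_mapped)
        elif 1 < int(ip) <= 0xFFFFFFFF:
            # ::X.Y.Z.W has its high 96 bits set to zero.
            # :: and ::1 are excluded so they keep their IPv6 meaning.
            candidates.append(ipaddress.IPv4Address(int(ip)))
    return candidates


def is_blocked(address: str) -> bool:
    """Check the original address and its unwrapped IPv4 form."""
    ip = ipaddress.ip_address(address)
    return any(
        candidate.version == net.version and candidate in net
        for candidate in expand_ip_candidates(ip)
        for net in BLOCKED_NETWORKS
    )
```

With this sketch, `is_blocked("::ffff:127.0.0.1")` and `is_blocked("::127.0.0.1")` both return True because the embedded 127.0.0.1 is checked against 127.0.0.0/8.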
Added 6 new TestIPv6MappedBypass tests covering:
- Loopback, RFC 1918, link-local (cloud metadata) via ::ffff: mapping
- IPv4-compatible variant (::127.0.0.1)
- Regression test that plain ::1 is still blocked
Also updated a stale test assertion in test_eval_security_adversarial:
hasattr, type, and __build_class__ were removed from hook builtins in
batch 2, but the test still expected hasattr to remain.
DO NOT PUSH until release day.
Reported by secsys_codex (2026-04-18): /md, /crawl, /llm endpoints
pass user URLs to crawler.arun() with no private IP validation.
- Add validate_url_destination() to utils.py with opt-out via
CRAWL4AI_ALLOW_INTERNAL_URLS=true env var for users who need
to crawl internal services.
- Integrate into validate_url_scheme() (covers all server.py endpoints).
- Add validation at all 4 URL entry points in api.py (handle_llm_qa,
handle_markdown_request, create_new_task, handle_crawl_request).
- raw: URLs skip the check (inline HTML, no network fetch).
- 16 adversarial + source coverage tests added.
- secsys_codex added to SECURITY-CREDITS.md.
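
A rough sketch of the destination check described above, assuming the host is resolved with `socket.getaddrinfo` and classified with `ipaddress`. The function names `is_internal_destination` and `url_is_permitted` are hypothetical, and failing closed on resolution errors is a choice made here, not something this commit states. The sketch does not unwrap IPv4-mapped IPv6 addresses.

```python
import ipaddress
import os
import socket
from urllib.parse import urlsplit


def is_internal_destination(url: str) -> bool:
    """Return True if the URL's host resolves to a non-public address."""
    host = urlsplit(url).hostname
    if not host:
        return True  # no usable host: treat as unsafe
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return True  # fail closed when resolution fails (assumption)
    for info in infos:
        # Drop any IPv6 scope id ("fe80::1%eth0") before parsing.
        ip = ipaddress.ip_address(info[4][0].split("%")[0])
        if ip.is_private or ip.is_loopback or ip.is_link_local:
            return True
    return False


def url_is_permitted(url: str) -> bool:
    """Apply the opt-out env var and the raw: exemption, then check the host."""
    if os.environ.get("CRAWL4AI_ALLOW_INTERNAL_URLS", "").lower() == "true":
        return True  # operator explicitly allows internal targets
    if url.startswith("raw:"):
        return True  # inline HTML, nothing is fetched
    return not is_internal_destination(url)
```

Every resolved address is checked, not only the first, since a hostname can resolve to a mix of public and private addresses.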
DO NOT PUSH until release day.
- Replace raw eval() in _compute_field() with AST-validated
_safe_eval_expression() that blocks __import__, dunder attribute
access, and import statements while preserving safe transforms
- Add ALLOWED_DESERIALIZE_TYPES allowlist to from_serializable_dict()
preventing arbitrary class instantiation from API input
- Update security contact email and add v0.8.1 security fixes to
SECURITY.md with researcher acknowledgment
- Add 17 security tests covering both fixes
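
One way to picture the AST validation step is shown below. It is a sketch under assumptions, not the project's `_safe_eval_expression()`: the names `find_violation`, `safe_eval_expression`, and `FORBIDDEN_NAMES` are hypothetical, and the real rule set may be stricter.

```python
import ast
from typing import Optional

# Names rejected outright (illustrative, assumption).
FORBIDDEN_NAMES = {"__import__", "eval", "exec", "compile", "open"}


def find_violation(expression: str) -> Optional[str]:
    """Return a reason string if the expression is unsafe, else None."""
    try:
        # mode="eval" accepts a single expression only, so statements
        # such as "import os" fail to parse.
        tree = ast.parse(expression, mode="eval")
    except SyntaxError:
        return "not a single expression"
    for node in ast.walk(tree):
        if isinstance(node, ast.Attribute) and node.attr.startswith("__"):
            return f"dunder attribute access: {node.attr}"
        if isinstance(node, ast.Name) and (
            node.id in FORBIDDEN_NAMES or node.id.startswith("__")
        ):
            return f"forbidden name: {node.id}"
    return None


def safe_eval_expression(expression: str, variables: dict):
    """Evaluate the expression only after the AST check passes."""
    reason = find_violation(expression)
    if reason is not None:
        raise ValueError(f"unsafe expression ({reason})")
    code = compile(expression, "<expression>", "eval")
    # Empty builtins: only the supplied variables are visible.
    return eval(code, {"__builtins__": {}}, dict(variables))
```

Ordinary transforms such as `name.upper()` still work because regular attribute access is allowed; only names and attributes starting with a double underscore are rejected.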
Security fixes for vulnerabilities reported by ProjectDiscovery:
1. Remote Code Execution via Hooks (CVE pending)
   - Remove __import__ from allowed_builtins in hook_manager.py
   - Prevents arbitrary module imports (os, subprocess, etc.)
   - Hooks now disabled by default via CRAWL4AI_HOOKS_ENABLED env var
2. Local File Inclusion via file:// URLs (CVE pending)
   - Add URL scheme validation to /execute_js, /screenshot, /pdf, /html
   - Block file://, javascript:, data: and other dangerous schemes
   - Only allow http://, https://, and raw: (where appropriate)
3. Security hardening
   - Add CRAWL4AI_HOOKS_ENABLED=false as default (opt-in for hooks)
   - Add security warning comments in config.yml
   - Add validate_url_scheme() helper for consistent validation
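
The scheme allowlist can be illustrated as follows. This is a sketch, not the actual `validate_url_scheme()` helper: the function name `is_allowed_url`, its `allow_raw` parameter, and the `ALLOWED_SCHEMES` set are assumptions made for the example.

```python
from urllib.parse import urlsplit

# Allowlist rather than blocklist: anything not listed is rejected.
ALLOWED_SCHEMES = {"http", "https"}


def is_allowed_url(url: str, allow_raw: bool = False) -> bool:
    """Accept http(s) URLs, plus raw: content where the endpoint permits it."""
    if allow_raw and url.startswith("raw:"):
        return True
    # urlsplit lowercases the scheme; a URL without one yields "".
    return urlsplit(url).scheme in ALLOWED_SCHEMES
```

An allowlist means file://, javascript:, data:, and any scheme not anticipated here are all rejected by the same check.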
Testing:
- Add 16 unit tests (test_security_fixes.py)
- Add integration tests (run_security_tests.py) for live server
Affected endpoints:
- POST /crawl (hooks disabled by default)
- POST /crawl/stream (hooks disabled by default)
- POST /execute_js (URL validation added)
- POST /screenshot (URL validation added)
- POST /pdf (URL validation added)
- POST /html (URL validation added)
Breaking changes:
- Hooks require CRAWL4AI_HOOKS_ENABLED=true to function
- file:// URLs no longer work on API endpoints (use library directly)
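
To illustrate the opt-in behaviour and the effect of dropping `__import__` from the builtins, here is a small sketch. The names `hooks_enabled`, `run_hook`, and `ALLOWED_BUILTINS` are hypothetical, the builtin subset is illustrative, and a restricted builtins dict alone should not be read as a complete sandbox.

```python
import os

# Illustrative subset (assumption); note that __import__ is absent.
ALLOWED_BUILTINS = {"len": len, "str": str, "int": int, "print": print}


def hooks_enabled() -> bool:
    """Hooks stay off unless CRAWL4AI_HOOKS_ENABLED is set to true."""
    return os.environ.get("CRAWL4AI_HOOKS_ENABLED", "false").lower() == "true"


def run_hook(source: str) -> dict:
    """Execute hook source with restricted builtins and return its namespace."""
    if not hooks_enabled():
        raise PermissionError(
            "hooks are disabled; set CRAWL4AI_HOOKS_ENABLED=true to enable"
        )
    namespace = {"__builtins__": dict(ALLOWED_BUILTINS)}
    # Without __import__ in the builtins, an import statement inside
    # the hook raises ImportError.
    exec(source, namespace)
    return namespace
```

With the variable unset, `run_hook` refuses to execute anything; with it set to true, a hook containing `import os` fails with ImportError.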
- Introduced a demo script (`demo_monitor_dashboard.py`) to showcase various monitoring features through simulated activity.
- Implemented a test script (`test_monitor_demo.py`) to generate dashboard activity and verify monitor health and endpoint statistics.
- Added a logo image to the static assets for branding purposes.