mirror of
https://github.com/unclecode/crawl4ai.git
synced 2026-06-09 23:47:15 +00:00
0.8.8's SSRF check validated the crawl target URL but not the proxy address, so an unauthenticated /crawl, /crawl/stream, or /crawl/job could route the browser through a proxy pointing at an internal IP and reach internal services / cloud metadata. Reported by Geo (geo-chen). Fix (backward compatible): validate every proxy destination with the same not-is_global check used for crawl URLs, before the browser is built - browser_config.proxy, browser_config.proxy_config.server, crawler_config.proxy_config.server - and strip proxy/DNS-redirecting flags (--proxy-server / --proxy-pac-url / --proxy-bypass-list / --host-resolver-rules) from extra_args. A legitimate public proxy still works; configure proxies via proxy_config (validated), not raw extra_args flags. _enforce_proxy_safety is called in both crawl handlers (and covers /crawl/job transitively); HTTPException passthrough added so the 400 is not masked as a 500. Bump 0.8.8 -> 0.8.9 (__version__ + Dockerfile). 20 new tests; full security suite 161 pass. Changelog, release blog, README, SECURITY-CREDITS updated. This vector was already fixed in the upcoming secure-by-default release; 0.8.9 brings it forward because it is an unauthenticated SSRF.