Commit Graph

  • d22825eea4 Fix: add cdp_cleanup_on_close to from_kwargs unclecode 2025-12-13 06:33:26 +00:00
  • 66941a59e8 Add cdp_cleanup_on_close flag to prevent memory leaks in cloud/server scenarios unclecode 2025-12-13 06:25:25 +00:00
  • 8ae908bede Add browser_context_id and target_id parameters to BrowserConfig unclecode 2025-12-13 02:42:48 +00:00
  • 306ddcbf3d Merge branch 'main' into develop ntohidi 2025-12-11 11:18:30 +01:00
  • a87e8c1c9e Release/v0.7.8 (#1662) Nasrin 2025-12-11 18:04:52 +08:00
  • 61be862ab0 fix: add disk cleanup step to Docker workflow docker-rebuild-v0.7.8 release/v0.7.8 ntohidi 2025-12-11 10:28:15 +01:00
  • 835e3c56fe Add disk cleanup step in Docker release workflow UncleCode 2025-12-11 09:49:27 +01:00
  • 220a2246d3 When using --deep-crawl, output all pages, not just the first one. Christian Oudard 2025-12-10 10:12:01 -07:00
  • b0b2b2761c fix: Make JsonCssExtractionStrategy.generate_schema resilient to markdown tags generated by LLMs https://github.com/unclecode/crawl4ai/issues/1663 patch/generate_schema Aravind Karnam 2025-12-09 15:23:56 +05:30
  • 9672afded2 docs: add section for Crawl4AI Cloud API closed beta with application link ntohidi 2025-12-09 10:27:15 +01:00
  • 60d6173914 Merge pull request #1661 from unclecode/waitlist v0.7.8 Nasrin 2025-12-09 16:44:15 +08:00
  • 48c31c4cb9 Release v0.7.8: Stability & Bug Fix Release ntohidi 2025-12-08 15:42:29 +01:00
  • 48b6283e71 announcement: add application form for cloud API closed beta Aravind Karnam 2025-12-08 14:00:57 +05:30
  • 5a8fb57795 Merge pull request #1648 from christopher-w-murphy/fix/content-relevance-filter Nasrin 2025-12-03 18:36:07 +08:00
  • df4d87ed78 refactor: replace PyPDF2 with pypdf across the codebase. ref #1412 ntohidi 2025-12-03 10:59:18 +01:00
  • f32cfc6db0 Merge pull request #1645 from unclecode/fix/configurable-backoff Nasrin 2025-12-02 21:07:49 +08:00
  • d06c39e8ab Merge pull request #1641 from unclecode/fix/serialize-proxy-config Nasrin 2025-12-02 21:06:02 +08:00
  • afc31e144a Merge branch 'develop' of https://github.com/unclecode/crawl4ai into develop ntohidi 2025-12-02 13:01:11 +01:00
  • 07ccf13be6 Fix: capture current page URL to reflect JavaScript navigation and add test for delayed redirects. ref #1268 ntohidi 2025-12-02 13:00:54 +01:00
  • 3a07c5962c Sponsors/new (#1643) Aravind 2025-12-02 05:19:39 +05:30
  • 6893094f58 parameterized tests Chris Murphy 2025-12-01 16:19:19 -05:00
  • 3a8f8298d3 import modules from enhanceable deserialization Chris Murphy 2025-12-01 16:18:59 -05:00
  • e95e8e1a97 generalized query in ContentRelevanceFilter to be a str or list Chris Murphy 2025-12-01 16:16:31 -05:00
  • eb76df2c0d added missing deep crawling objects to init Chris Murphy 2025-12-01 16:15:58 -05:00
  • 6ec6bc4d8a pass timeout parameter to docker client request Chris Murphy 2025-12-01 16:15:27 -05:00
  • 33a3cc3933 reproduced AttributeError from #1642 Chris Murphy 2025-12-01 11:31:07 -05:00
  • 7a133e22cc feat: make LLM backoff configurable end-to-end fix/configurable-backoff Soham Kukreti 2025-11-28 18:50:04 +05:30
  • dcb77c94bf Merge pull request #1623 from unclecode/fix/deprecated_pydantic Nasrin 2025-11-27 20:05:42 +08:00
  • 6695a21a41 Fix: enhance fallback scoring for failed head extraction in LinkPreview. ref #1638 fix/linkPreviewScoring ntohidi 2025-11-27 12:14:08 +01:00
  • a0c5f0f79a fix: ensure BrowserConfig.to_dict serializes proxy_config fix/serialize-proxy-config Soham Kukreti 2025-11-26 17:44:06 +05:30
  • 6eb3baed50 feat: Add ConfigHealthMonitor for automated crawler configuration health monitoring feature/configHealthMonitor Soham Kukreti 2025-11-25 23:49:15 +05:30
  • b36c6daa5c Fix: permission issues with .cache/url_seeder and other runtime cache dirs. ref #1638 ntohidi 2025-11-25 11:51:59 +01:00
  • 94c8a833bf Merge pull request #1447 from rbushri/fix/wrong_url_raw Nasrin 2025-11-25 17:49:44 +08:00
  • 84bfea8bd1 Fix EmbeddingStrategy: Uncomment response handling for the variations and clean up mock data. ref #1621 ntohidi 2025-11-25 10:46:00 +01:00
  • 0024c82cdc Sponsors/new (#1637) Aravind 2025-11-24 17:59:33 +05:30
  • 7771ed3894 Merge branch 'develop' into fix/wrong_url_raw Rachel Bushrian 2025-11-24 13:54:07 +02:00
  • af77800a6b Implement CORS handling with --disable-web-security in BrowserManager and add corresponding tests fix-cors-disable-web-security AHMET YILMAZ 2025-11-18 16:18:49 +08:00
  • eca04b0368 Refactor Pydantic model configuration to use ConfigDict for arbitrary types fix/deprecated_pydantic AHMET YILMAZ 2025-11-18 15:40:17 +08:00
  • 43a2088eb0 Fix redirect target verification in AsyncUrlSeeder and enhance tests fix-async-url-seeder-redirect-verification AHMET YILMAZ 2025-11-18 11:43:47 +08:00
  • c2c4d42be4 Fix #1181: Preserve whitespace in code blocks during HTML scraping ntohidi 2025-11-17 12:21:23 +01:00
  • f68e7531e3 Sponsors/scrapeless (#1619) Aravind 2025-11-17 12:14:52 +05:30
  • cb637fb5c4 Merge pull request #1613 from unclecode/release/v0.7.7 UncleCode 2025-11-16 12:26:54 +01:00
  • 6244f56f36 Release v0.7.7 v0.7.7 docker-rebuild-v0.7.7 release/v0.7.7 ntohidi 2025-11-14 10:23:31 +01:00
  • 2c973b1183 Merge branch 'develop' into release/v0.7.7 ntohidi 2025-11-13 14:54:05 +01:00
  • f3146de969 Merge pull request #1609 from unclecode/fix/update-config-documentation Nasrin 2025-11-13 21:52:53 +08:00
  • d6b6d11a2d docs: update browser and crawler run config documentation to match async_configs.py implementation Soham Kukreti 2025-11-13 14:54:16 +05:30
  • b58579548c Bump version to 0.7.7 for stable release ntohidi 2025-11-13 09:52:18 +01:00
  • 466be69e72 Merge pull request #1607 from unclecode/fix/dfs_deep_crawling Nasrin 2025-11-13 16:43:47 +08:00
  • ceade853c3 Enhance DFSDeepCrawlStrategy documentation for clarity and detail fix/dfs_deep_crawling AHMET YILMAZ 2025-11-13 16:39:08 +08:00
  • 998c809e08 Rename folder name for NSTProxy integration examples for crawl4ai ntohidi 2025-11-13 09:36:39 +01:00
  • d0fb53540d Update proxy-security documentation ntohidi 2025-11-13 09:23:44 +01:00
  • 8116b15b63 Merge pull request #1596 from unclecode/docs-proxy-security Nasrin 2025-11-13 16:22:28 +08:00
  • fe353c4e27 Refactor proxy configuration documentation for clarity and consistency docs-proxy-security AHMET YILMAZ 2025-11-13 11:20:24 +08:00
  • 89cc29fe44 Merge branch 'fix/docker' into develop ntohidi 2025-11-12 17:06:31 +01:00
  • cdcb8836b7 Merge pull request #1605 from Nstproxy/feat/nstproxy Nasrin 2025-11-12 23:56:14 +08:00
  • b207ae2848 Merge pull request #1528 from unclecode/fix/managed-browser-cdp-timing Nasrin 2025-11-12 23:53:57 +08:00
  • be00fc3a42 Merge pull request #1598 from unclecode/fix/sitemap_seeder Nasrin 2025-11-12 18:09:34 +08:00
  • 124ac583bb Merge pull request #1599 from unclecode/docs-llm-strategies-update Nasrin 2025-11-12 17:54:26 +08:00
  • 1bd3de6a47 #1510: Add DFS deep crawler demonstration script and enhance DFS strategy with seen URL tracking AHMET YILMAZ 2025-11-12 17:44:43 +08:00
  • 80452166c8 feat: Add Nstproxy Proxies nstproxy 2025-11-12 16:25:39 +08:00
  • a99cd37c0e Merge pull request #1597 from unclecode/sponsors/capsolver UncleCode 2025-11-11 14:50:44 +08:00
  • 2e8f8c9b49 #1551: Fix casing and variable name consistency for LLMConfig in documentation docs-llm-strategies-update AHMET YILMAZ 2025-11-10 15:38:14 +08:00
  • 80745bceb9 #1559: Add tests for sitemap parsing and URL normalization in AsyncUrlSeeder fix/sitemap_seeder AHMET YILMAZ 2025-11-10 14:15:54 +08:00
  • 4bee230c37 docs: Add a tip for captcha solving usecases using a third party integration Aravind Karnam 2025-11-10 11:20:48 +05:30
  • 006e29f308 Merge pull request #1589 from capsolver/main Aravind 2025-11-10 10:45:16 +05:30
  • 263ac890fd #1591: Enhance proxy configuration documentation with security features, SSL analysis, and improved examples AHMET YILMAZ 2025-11-10 11:42:07 +08:00
  • 78120df47e chore: update .gitignore from main feature/agent-oai unclecode 2025-11-09 19:19:52 +08:00
  • 1a22fb4d4f docs: rename Docker deployment to self-hosting guide with comprehensive monitoring documentation fix/docker unclecode 2025-11-09 13:31:52 +08:00
  • 81b5312629 Update gitignore unclecode 2025-11-09 10:49:42 +08:00
  • c003cb6e4f fix #1563 (cdp): resolve page leaks and race conditions in concurrent crawling bugfix/arun-many-cdp-managed-browser AHMET YILMAZ 2025-11-07 15:42:37 +08:00
  • d56b0eb9a9 Merge pull request #1495 from unclecode/fix/viewport_in_managed_browser Nasrin 2025-11-06 18:42:45 +08:00
  • 66175e132b Merge pull request #1590 from unclecode/fix/async-llm-extraction-arunMany Nasrin 2025-11-06 18:40:42 +08:00
  • a30548a98f This commit resolves issue #1055 where LLM extraction was blocking async execution, causing URLs to be processed sequentially instead of in parallel. fix/async-llm-extraction-arunMany ntohidi 2025-11-06 11:22:45 +01:00
  • c1c5dfc49b Add smoke test and comprehensive documentation copilot/modify-page-creation-and-logging copilot-swe-agent[bot] 2025-11-06 08:20:39 +00:00
  • 2507720cc7 Refactor imports for PEP 8 compliance and clarity copilot-swe-agent[bot] 2025-11-06 08:18:48 +00:00
  • 7037021496 Implement CDP concurrency fixes and improve logging copilot-swe-agent[bot] 2025-11-06 08:11:15 +00:00
  • 7c751837ef Initial plan copilot-swe-agent[bot] 2025-11-06 08:02:54 +00:00
  • 2ae9899eac Clarify CapSolver integration instructions CapSolver 2025-11-06 15:49:30 +08:00
  • 57aeb70f00 Add CapSolver Captcha Solver CapSolver 2025-11-06 15:37:31 +08:00
  • 2c918155aa Merge pull request #1529 from unclecode/fix/remove_overlay_elements Nasrin 2025-11-06 00:10:32 +08:00
  • 854694ef33 Merge pull request #1537 from unclecode/fix/docker-compose-llm-env Nasrin 2025-11-06 00:07:51 +08:00
  • 6534ece026 Merge pull request #1532 from unclecode/fix/update-documentation Nasrin 2025-11-05 23:37:05 +08:00
  • 61a18e01dc #1563 fix(browser): ensure new pages are created for managed browser concurrency fix/cdp AHMET YILMAZ 2025-10-29 17:45:41 +08:00
  • 977f7156aa fix(browser): ensure new pages are created for managed browser concurrency AHMET YILMAZ 2025-10-29 17:45:41 +08:00
  • 89e28d4eee Merge pull request #1558 from unclecode/claude/fix-update-pyopenssl-security-011CUPexU25DkNvoxfu5ZrnB Nasrin 2025-10-28 17:09:11 +08:00
  • 05ec0535cd #1564: fix issue with _sig when using proxyConfig docker/fix_sig AHMET YILMAZ 2025-10-28 14:05:26 +08:00
  • 83aeb565ee refactor(crawler_pool): enhance signature generation with adapter support and improve error handling AHMET YILMAZ 2025-10-27 17:02:26 +08:00
  • a3b02be5c3 #1564 fix: Improve error handling in browser configuration serialization and cleanup logic docker/add_features AHMET YILMAZ 2025-10-27 17:02:26 +08:00
  • c0f1865287 feat(api): update marketplace version and build date in root endpoint response ntohidi 2025-10-26 11:35:39 +01:00
  • 46ef1116c4 fix(app-detail): enhance tab functionality, hide documentation and support tabs in marketplace ntohidi 2025-10-26 11:21:29 +01:00
  • 0c95411aef Merge branch 'develop' into feature/docker-cluster feature/docker-cluster unclecode 2025-10-24 12:33:45 +08:00
  • 6114b9c3f4 Update gitignore unclecode 2025-10-24 12:30:33 +08:00
  • 4df83893ac Merge pull request #1560 from unclecode/fix/marketplace Nasrin 2025-10-23 22:17:06 +08:00
  • 13e116610d fix(marketplace): improve app detail page content rendering and UX fix/marketplace ntohidi 2025-10-23 16:12:30 +02:00
  • 613097d121 test: add verification tests for pyOpenSSL security update claude/fix-update-pyopenssl-security-011CUPexU25DkNvoxfu5ZrnB Claude 2025-10-23 06:57:25 +00:00
  • 44ef0682b0 fix: update pyOpenSSL to >=25.3.0 to address security vulnerability Claude 2025-10-23 06:51:25 +00:00
  • 589339a336 docs: add AI-optimized architecture map and quick start cheat sheet unclecode 2025-10-23 12:20:07 +08:00
  • 40173eeb73 Update Docker hooks and Webhook documents (#1557) Nasrin 2025-10-22 22:34:19 +08:00
  • b74524fdfb docs: update docker_hooks_examples.py with comprehensive examples and improved structure ntohidi 2025-10-22 16:29:19 +02:00
  • bcac486921 docs: enhance README and docker-deployment documentation with Job Queue and Webhook API details ntohidi 2025-10-22 16:19:30 +02:00