Commit Graph

  • b61b2ee676 feat(browser-profiler): implement cross-platform keyboard listeners and improve quit handling AHMET YILMAZ 2025-08-08 11:18:34 +08:00
  • 0541b61405 feat(browser-profiler): implement cross-platform keyboard listeners and improve quit handling fix/exit_with_q AHMET YILMAZ 2025-08-08 11:18:34 +08:00
  • 66925eb1d6 fix(deep_crawling): fix priority queue ordering and link truncation in BestFirstCrawlingStrategy - ref #1253 fix/deep-crawl-scoring-priority ntohidi 2025-08-07 15:28:43 +08:00
  • 89cf5aba2b #1057 : enhance ProxyConfig initialization to support dict and string formats bug/proxy_config AHMET YILMAZ 2025-08-06 18:34:23 +08:00
  • 6b0b5301ba Release v0.7.3: ntohidi 2025-08-06 17:52:01 +08:00
  • 7a8190ecb6 Fix examples in README.md Nezar Ali 2025-08-06 11:58:29 +03:00
  • 64f37792a7 Merge pull request #1170 from prokopis3/fix/create-profile Nasrin 2025-08-06 16:29:14 +08:00
  • 6735c68288 Merge pull request #1170 from prokopis3/fix/create-profile Nasrin 2025-08-06 16:29:14 +08:00
  • a5bcac4c9d feat(docs): enhance table data access example with a real url ntohidi 2025-08-06 15:19:37 +08:00
  • 45d8327d23 Merge pull request #1366 from unclecode/fix/update-tables-documentation Nasrin 2025-08-06 15:15:24 +08:00
  • 437395e490 Merge branch 'feat/undetected-browser' into develop-future ntohidi 2025-08-06 15:03:30 +08:00
  • fddae303fb docs: Update README.md and modify Media and Tables Documentation.(#1271) - Update Table-to-DataFrame Extraction example in README.md - Replace old method of accessing tables via result.media directly with result.tables in the documentation - Remove tables section from links & media page. - Add tables section to crawler result page. Soham Kukreti 2025-08-05 23:23:17 +05:30
  • 660d7011b9 In obtaining cleaned_html, the tag "script" needs to be processed separately. lizhuxiong 2025-08-05 16:27:03 +08:00
  • 6d3444ba17 In obtaining cleaned_html, the tag "script" needs to be processed separately. lizhuxiong 2025-08-05 16:18:34 +08:00
  • ff6ea41ac3 feat(docker): add flexible LLM provider configuration ntohidi 2025-08-05 14:09:54 +08:00
  • 31a435fb0e Merge branch 'develop' of https://github.com/unclecode/crawl4ai into develop ntohidi 2025-08-04 19:12:19 +08:00
  • 5de6a28055 Merge pull request #1361 from unclecode/fix/crawler-result-docs Nasrin 2025-08-04 19:12:09 +08:00
  • de1561ad14 Merge branch 'develop' of https://github.com/unclecode/crawl4ai into develop ntohidi 2025-08-04 19:04:50 +08:00
  • 337b588732 Merge pull request #1358 from shonenada/patch-1 Nasrin 2025-08-04 19:04:42 +08:00
  • 7a6ad547f0 Squashed commit of the following: ntohidi 2025-08-04 19:02:01 +08:00
  • e6692b987d docs: Update CrawlResult documentation with missing fields. - Add missing fields: fit_html, js_execution_result, redirected_url, network_requests, console_messages, tables Soham Kukreti 2025-08-04 15:40:33 +05:30
  • 307fe28b32 fix: Correct URL matcher fallback behavior and improve memory monitoring ntohidi 2025-08-03 16:50:54 +08:00
  • 438a103b17 Fix typos in examples.md Yaoda Liu 2025-08-03 14:33:10 +08:00
  • a03e68fa2f feat: Add URL-specific crawler configurations for multi-URL crawling ntohidi 2025-08-02 19:10:36 +08:00
  • 864d87afb2 Merge pull request #1339 from charlaie/fix-sitemap-redirect Nasrin 2025-07-31 15:21:03 +08:00
  • 508b6fc233 fix: Enable following redirects in sitemap fetching for seeder Charlie C 2025-07-25 15:57:09 +08:00
  • 8a906fcad0 fix(dependencies): Update and clean up package versions in pyproject.toml, the bundle size will be much smaller. next UncleCode 2025-07-29 19:56:27 +08:00
  • 54ae10d957 feat(extraction_strategy): Enhance schema generation with improved validation and task description handling UncleCode 2025-07-29 19:33:36 +08:00
  • 8e3c411a3e Merge branch 'main' into main Emmanuel Ferdman 2025-07-29 14:05:35 +03:00
  • e3281935bc fix: Add write permissions for GitHub release creation UncleCode 2025-07-25 18:22:45 +08:00
  • 48647300b4 chore: Bump version to 0.7.2 v0.7.2 release/v0.7.2 UncleCode 2025-07-25 17:42:48 +08:00
  • 9f9ea3bb3b chore: Clean up test artifacts and disable test workflow release/v0.7.1 UncleCode 2025-07-25 17:31:52 +08:00
  • d58b93c207 fix: Re-enable multi-platform Docker builds for ARM64 support UncleCode 2025-07-25 16:38:11 +08:00
  • e2b4705010 fix: Use hardcoded Docker repository name to avoid masking issues UncleCode 2025-07-25 15:52:26 +08:00
  • 4a1abd5086 fix: Handle existing version on Test PyPI gracefully UncleCode 2025-07-25 15:41:16 +08:00
  • 04258cd4f2 fix: Speed up Docker test builds by using single platform and caching UncleCode 2025-07-25 15:37:44 +08:00
  • 84e462d9f8 Merge remote-tracking branch 'origin/develop' UncleCode 2025-07-25 15:35:53 +08:00
  • 9546773a07 fix: Move sentence-transformers to optional dependencies UncleCode 2025-07-24 21:24:40 +08:00
  • 66a979ad11 fix: Install dependencies before version check in workflows UncleCode 2025-07-24 21:01:36 +08:00
  • 0c31e91b53 feat: Add CI/CD workflows for automated PyPI and Docker releases UncleCode 2025-07-24 20:58:43 +08:00
  • 843457a9cb Refactor adaptive crawling state management UncleCode 2025-07-24 20:11:43 +08:00
  • 1b6a31f88f fix: encode PDF results to base64 in /crawl endpoint. ref #1301 ntohidi 2025-07-23 13:52:18 +02:00
  • b8c261780f Merge pull request #1319 from volumetric/fix_for_bug_#1310 Nasrin 2025-07-23 12:45:12 +02:00
  • db6ad7a79d fix: update links in README and C4A-Script documentation for accuracy ntohidi 2025-07-23 09:47:18 +02:00
  • 004d514f33 Merge pull request #1265 from unclecode/feature/nasrin-cli-deep-crawl Nasrin 2025-07-23 09:40:33 +02:00
  • d1de82a332 feat(crawl4ai): Implement SMART cache mode UncleCode 2025-07-21 21:19:37 +08:00
  • 8a04351406 feat(crawl4ai): Update to version 0.7.1 with improvements and new tests UncleCode 2025-07-18 16:27:19 +08:00
  • 3a9e2c716e Removed the incorrect reference in browser_config variable Vinit Agrawal 2025-07-18 10:01:00 +05:30
  • 0163bd797c Merge branch 'release/v0.7.1' v0.7.1 unclecode 2025-07-17 17:42:04 +08:00
  • 26bad799e4 chore: update version to 0.7.1 ntohidi 2025-07-17 11:37:41 +02:00
  • cf8badfe27 feat: cleanup unused code and enhance documentation for v0.7.1 ntohidi 2025-07-17 11:35:16 +02:00
  • 805c498adf docs: add simple anti-bot examples feat/undetected-browser unclecode 2025-07-17 17:05:35 +08:00
  • 6a728cbe5b feat: add stealth mode and enhance undetected browser support unclecode 2025-07-17 16:59:10 +08:00
  • ccbe3c105c refactor: improve link scoring output format in release notes ntohidi 2025-07-17 09:13:20 +02:00
  • 761c19d54b Merge pull request #1307 from unclecode/fix/json-infinity-serialization Nasrin 2025-07-16 13:34:25 +02:00
  • 14b0ecb137 Merge pull request #1305 from unclecode/fix/release-notes-demo-code Nasrin 2025-07-16 13:33:53 +02:00
  • 65902a4773 feat: Enhance stealth compatibility with new and legacy APIs, add configuration support fix/playwright-stealth AHMET YILMAZ 2025-07-16 17:41:47 +08:00
  • 0eaa9f9895 fix: handle infinity values in JSON serialization for API responses fix/json-infinity-serialization ntohidi 2025-07-15 13:49:07 +02:00
  • 1d1970ae69 docs: Update release notes and docs for v0.7.0 with the correct parameters and explanations fix/release-notes-demo-code ntohidi 2025-07-15 11:32:04 +02:00
  • 205df1e330 docs: Fix virtual scroll configuration ntohidi 2025-07-15 10:29:47 +02:00
  • 2640dc73a5 docs: Enhance session management example for dynamic content crawling with improved JavaScript handling and extraction schema. ref #226 ntohidi 2025-07-15 10:19:29 +02:00
  • 58024755c5 docs: Update adaptive crawling parameters and examples in README and release notes ntohidi 2025-07-15 10:15:05 +02:00
  • 5c13baf574 feat: Add stealth option to BrowserConfig for enhanced browser behavior AHMET YILMAZ 2025-07-15 15:48:23 +08:00
  • d2759824ef fix: Update playwright-stealth to v2.0.0+ compatibility AHMET YILMAZ 2025-07-15 15:09:53 +08:00
  • 5c33cbcca2 feat: add undetected browser support with adapter pattern unclecode 2025-07-14 17:29:50 +08:00
  • 83b323f13a fix VersionManager not using CRAWL4_AI_BASE_DIRECTORY Vladimir Mandic 2025-07-12 17:40:34 -04:00
  • dd5ee752cf docs: Add missing documentation pages to mkdocs.yml UncleCode 2025-07-12 19:58:26 +08:00
  • bde1bba6a2 docs: Add missing documentation pages to mkdocs.yml UncleCode 2025-07-12 19:56:33 +08:00
  • 7b80eb6b99 docs: Add missing documentation pages to mkdocs.yml UncleCode 2025-07-12 19:55:35 +08:00
  • 14f690d751 docs: Update documentation for v0.7.0 release UncleCode 2025-07-12 19:08:17 +08:00
  • 7b9ba3015f Merge branch 'release/v0.7.0' - The Adaptive Intelligence Update v0.7.0 UncleCode 2025-07-12 18:54:20 +08:00
  • 0c8bb742b7 Release v0.7.0-r1: The Adaptive Intelligence Update release/v0.7.0 UncleCode 2025-07-12 18:51:13 +08:00
  • ba2ed53ff1 test(releases): Add test cases for release 0.7.0 UncleCode 2025-07-11 22:27:18 +08:00
  • a93efcb650 Merge PR #1285: 2025 APR, MAY, and JUN bug fixes UncleCode 2025-07-11 21:22:34 +08:00
  • 8794852a26 Merge PR #1285: 2025 APR, MAY, and JUN bug fixes UncleCode 2025-07-11 21:22:03 +08:00
  • fb25a4a769 docs(examples): update crawl4ai showcase script UncleCode 2025-07-11 20:55:37 +08:00
  • afe852935e fix: show /llm API response in playground. ref #1288 next-MAY ntohidi 2025-07-09 16:59:17 +02:00
  • 0ebce590f8 Merge branch '2025-JUN-1' into next-MAY ntohidi 2025-07-09 09:41:03 +02:00
  • 026e96a2df feat: Add social media and community links to README and index documentation ntohidi 2025-07-08 15:48:40 +02:00
  • 36429a63de fix: Improve comments for article metadata extraction in extract_metadata functions. ref #1105 ntohidi 2025-07-08 12:54:33 +02:00
  • a3d41c7951 fix: Clarify description of 'use_stemming' parameter in markdown generation documentation ref #1086 ntohidi 2025-07-08 12:24:33 +02:00
  • 1d6efb622d Fix proxy authentication ERR_INVALID_AUTH_CREDENTIALS Gary 2025-07-08 17:55:28 +08:00
  • fee4c5c783 fix: Consolidate import statements in local-files.md for clarity ntohidi 2025-07-08 11:46:24 +02:00
  • 0f210f6e02 Merge branch '2025-MAY-2' into next-MAY ntohidi 2025-07-08 11:46:13 +02:00
  • 1a73fb60db feat(crawl4ai): Implement adaptive crawling feature next-JUN UncleCode 2025-07-04 15:16:53 +08:00
  • 74705c1f67 Move release scripts to private .scripts folder UncleCode 2025-07-04 15:02:25 +08:00
  • 048d9b0f5b feat: Implement nightly build script and update version handling UncleCode 2025-07-03 20:53:03 +08:00
  • ee25c771d8 feat(cli): add deep crawling options with configurable strategies and max pages. ref #874 feature/nasrin-cli-deep-crawl ntohidi 2025-07-02 14:07:23 +02:00
  • a353515271 feat: Add virtual scroll support for modern web scraping UncleCode 2025-06-29 20:41:37 +08:00
  • 539a324cf6 refactor(link_extractor): remove link_extractor and rename to link_preview UncleCode 2025-06-27 21:54:22 +08:00
  • 5c9c305dbf feat: Add advanced link head extraction with three-layer scoring system (#1) UncleCode 2025-06-27 20:06:04 +08:00
  • 02f3127ded Track Stargazers (#1249) Aravind 2025-06-25 19:56:19 +05:30
  • e528086341 test(async_assistant): add new tests for extract pipeline UncleCode 2025-06-23 10:44:27 +08:00
  • 414f16e975 fix: Update pdf and screenshot usage documentation. ref #1230 ntohidi 2025-06-18 19:05:44 +02:00
  • b7a6e02236 fix: Update pdf and screenshot usage documentation. ref #1230 ntohidi 2025-06-18 19:04:32 +02:00
  • 9332326457 feat: Add PDF parsing documentation and navigation entry 2025-JUN-1 AHMET YILMAZ 2025-06-16 18:18:32 +08:00
  • 6cd34b3157 Merge branch '2025-MAY-2' of https://github.com/unclecode/crawl4ai into 2025-MAY-2 ntohidi 2025-06-13 11:26:17 +02:00
  • 871d4f1158 fix(extraction_strategy): rename response variable to content for clarity in LLMExtractionStrategy. ref #1146 ntohidi 2025-06-13 11:26:05 +02:00
  • c4d625fb3c chore(profile-test): fix filename typo ( test_crteate_profile.py → test_create_profile.py ) prokopis3 2025-06-12 14:38:32 +03:00
  • ef722766f0 fix(browser_profiler): improve keyboard input handling prokopis3 2025-06-12 14:33:12 +03:00