crawl4ai

mirror of https://github.com/unclecode/crawl4ai.git synced 2026-06-10 15:58:15 +00:00

Files

UncleCode a2061bf31e feat(crawler): add MHTML capture functionality

Add the ability to capture web pages in MHTML format, which bundles all page resources
into a single file. This enables complete page archival and offline viewing.

- Add capture_mhtml parameter to CrawlerRunConfig
- Implement MHTML capture using CDP in AsyncPlaywrightCrawlerStrategy
- Add mhtml field to CrawlResult and AsyncCrawlResponse models
- Add comprehensive tests for MHTML capture functionality
- Update documentation with MHTML capture details
- Add exclude_all_images option for better memory management

Breaking changes: None

2025-04-09 15:39:04 +08:00
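
Since the commit above describes MHTML as a single file holding every page resource, a short sketch of inspecting such a file may help. MHTML is a MIME `multipart/related` document, so Python's standard `email` package can walk its parts. The helper name `list_mhtml_parts` and the `SAMPLE_MHTML` document are hypothetical and exist only for illustration. In practice the input would presumably be the string stored in the `mhtml` field of `CrawlResult`, which this sketch does not fetch.

```python
from email import message_from_string

# Hypothetical, hand-written MHTML document used only for illustration.
# A real capture would contain the page HTML plus its images, CSS, and so on.
SAMPLE_MHTML = "\n".join([
    "From: <Saved by example>",
    "Subject: Example",
    "MIME-Version: 1.0",
    'Content-Type: multipart/related; type="text/html"; boundary="----Boundary"',
    "",
    "------Boundary",
    "Content-Type: text/html",
    "Content-Location: https://example.com/",
    "",
    "<html><body>Hello</body></html>",
    "------Boundary",
    "Content-Type: text/css",
    "Content-Location: https://example.com/style.css",
    "",
    "body { color: black; }",
    "------Boundary--",
    "",
])


def list_mhtml_parts(mhtml_text: str) -> list[tuple[str, str]]:
    """Return (content type, Content-Location) for each resource in an MHTML string."""
    root = message_from_string(mhtml_text)
    parts = []
    for part in root.walk():
        # Skip the multipart container itself; only leaf parts are resources.
        if part.is_multipart():
            continue
        parts.append((part.get_content_type(), part.get("Content-Location", "")))
    return parts


if __name__ == "__main__":
    for content_type, location in list_mhtml_parts(SAMPLE_MHTML):
        print(content_type, location)
```

Listing the parts this way is a quick check that a capture actually contains the expected resources before it is archived.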

| Name | Last commit | Date |
|---|---|---|
| assets | feat(core): release version 0.5.0 with deep crawling and CLI | 2025-02-21 19:55:02 +08:00 |
| deprecated | docs: update README badges and Docker section, reorganize documentation structure | 2024-12-31 19:45:02 +08:00 |
| examples | Merge branch 'main' into next | 2025-04-08 17:43:42 +08:00 |
| md_v2 | feat(crawler): add MHTML capture functionality | 2025-04-09 15:39:04 +08:00 |
| releases_review | feat(proxy): add proxy rotation strategy | 2025-02-09 18:49:10 +08:00 |
| snippets/deep_crawl | refactor(proxy): consolidate proxy configuration handling | 2025-03-07 23:14:11 +08:00 |