mirror of https://github.com/unclecode/crawl4ai.git synced 2026-06-10 15:58:15 +00:00

Files

unclecode 3a75dd3f4c fix: batch fix for 10 open issues (#1520, #1489, #1374, #1424, #1183, #1354, #880, #1031, #1251, #1758)

- #1520: Preserve trailing slashes in URL normalization (RFC 3986 compliance)
- #1489: Preserve query parameter key casing in normalize_url
- #1374: Close NamedTemporaryFile handle before reopening (Windows fix)
- #1424: Fix CosineStrategy returning empty results (delimiter fallback + at_least_k >= 1)
- #1183: Fix extract_xml_data regex matching tag names in prose text
- #1354: Make import_knowledge_base async (fix asyncio.run in running loop)
- #880: Fix 404 sample_ecommerce.html gist URL in docs (6 occurrences)
- #1031: Make Docker playground code editor resizable with overflow-auto
- #1251: Add DEFAULT_CONFIG with deep-merge in load_config to prevent KeyError crashes
- #1758: Change screenshot stitching format from BMP to PNG

2026-03-07 09:47:38 +00:00

| File | Last commit | Date |
| --- | --- | --- |
| advanced_configuration.py | feat(crawl4ai): Implement adaptive crawling feature | 2025-07-04 15:16:53 +08:00 |
| basic_usage.py | feat(crawl4ai): Implement adaptive crawling feature | 2025-07-04 15:16:53 +08:00 |
| custom_strategies.py | feat(crawl4ai): Implement adaptive crawling feature | 2025-07-04 15:16:53 +08:00 |
| embedding_configuration.py | feat(crawl4ai): Implement adaptive crawling feature | 2025-07-04 15:16:53 +08:00 |
| embedding_strategy.py | feat(crawl4ai): Implement adaptive crawling feature | 2025-07-04 15:16:53 +08:00 |
| embedding_vs_statistical.py | feat(crawl4ai): Implement adaptive crawling feature | 2025-07-04 15:16:53 +08:00 |
| export_import_kb.py | fix: batch fix for 10 open issues (#1520, #1489, #1374, #1424, #1183, #1354, #880, #1031, #1251, #1758) | 2026-03-07 09:47:38 +00:00 |
| llm_config_example.py | Release/v0.7.6 (#1556) | 2025-10-22 20:41:06 +08:00 |
| README.md | feat(crawl4ai): Implement adaptive crawling feature | 2025-07-04 15:16:53 +08:00 |

README.md

Adaptive Crawling Examples

This directory contains examples demonstrating various aspects of Crawl4AI's Adaptive Crawling feature.

Examples Overview

1. `basic_usage.py`

- Simple introduction to adaptive crawling
- Uses default statistical strategy
- Shows how to get crawl statistics and relevant content

2. `embedding_strategy.py` ⭐ NEW

- Demonstrates the embedding-based strategy for semantic understanding
- Shows query expansion and irrelevance detection
- Includes configuration for both local and API-based embeddings

3. `embedding_vs_statistical.py` ⭐ NEW

- Direct comparison between statistical and embedding strategies
- Helps you choose the right strategy for your use case
- Shows performance and accuracy trade-offs
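To make the trade-off concrete, here is a minimal sketch contrasting a term-overlap score (a stand-in for a statistical signal) with cosine similarity over vectors (a stand-in for an embedding signal). It does not use Crawl4AI at all: `term_overlap`, `cosine`, and the hand-made `TOY_VECTORS` are hypothetical names invented for illustration, and real embeddings would come from a model rather than a lookup table.

```python
import math


def term_overlap(query: str, text: str) -> float:
    """Fraction of query words that appear verbatim in the text."""
    query_terms = set(query.lower().split())
    text_terms = set(text.lower().split())
    return len(query_terms & text_terms) / len(query_terms) if query_terms else 0.0


def cosine(a, b) -> float:
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hand-made 2-d vectors standing in for real embeddings (illustrative only).
TOY_VECTORS = {
    "car": (1.0, 0.1),
    "automobile": (0.9, 0.2),
    "banana": (0.0, 1.0),
}

if __name__ == "__main__":
    # Exact-word matching misses the synonym entirely...
    print(term_overlap("car", "automobile dealership"))
    # ...while nearby vectors still score as similar.
    print(cosine(TOY_VECTORS["car"], TOY_VECTORS["automobile"]))
    print(cosine(TOY_VECTORS["car"], TOY_VECTORS["banana"]))
```

The overlap score is cheap and needs no model, but it only rewards literal matches; the vector score can recognize related wording at the cost of computing embeddings first.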

4. `embedding_configuration.py` ⭐ NEW

- Advanced configuration options for embedding strategy
- Parameter tuning guide for different scenarios
- Examples for research, exploration, and quality-focused crawling

5. `advanced_configuration.py`

- Shows various configuration options for both strategies
- Demonstrates threshold tuning and performance optimization
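As a rough illustration of what threshold tuning controls, the sketch below stops collecting pages once a confidence score reaches a threshold or a page budget is exhausted. It is a plain-Python stand-in, not Crawl4AI code: the function `crawl_until_confident`, its parameters `confidence_threshold` and `max_pages`, and the `score` callback are all assumptions made for this example.

```python
def crawl_until_confident(pages, score, confidence_threshold=0.8, max_pages=20):
    """Collect pages until the score is high enough or the budget runs out.

    pages: iterable of page contents, in the order they would be fetched.
    score: callable taking the collected pages and returning a float in [0, 1].
    """
    collected = []
    confidence = 0.0
    for page in pages:
        # Stop before fetching more if either limit has been reached.
        if len(collected) >= max_pages or confidence >= confidence_threshold:
            break
        collected.append(page)
        confidence = score(collected)
    return collected, confidence


if __name__ == "__main__":
    # Toy scoring rule: each page adds 0.25 confidence, capped at 1.0.
    toy_score = lambda pages: min(1.0, 0.25 * len(pages))
    pages = [f"page {i}" for i in range(10)]
    print(crawl_until_confident(pages, toy_score, confidence_threshold=0.7))
    print(crawl_until_confident(pages, toy_score, max_pages=2))
```

A lower threshold ends the crawl sooner with less coverage; a higher one trades extra page fetches for more confidence, with the page budget acting as a hard cap.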

6. `custom_strategies.py`

- How to implement your own crawling strategy
- Extends the base CrawlStrategy class
- Advanced use case for specialized requirements
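The general shape of "extend a base strategy class" can be sketched as follows. The `CrawlStrategy` defined here is a stand-in written for this example, not the Crawl4AI class of the same name: its methods `confidence` and `rank_links`, and the subclass `KeywordStrategy`, are hypothetical, so check `custom_strategies.py` for the interface the library actually expects.

```python
from abc import ABC, abstractmethod


class CrawlStrategy(ABC):
    """Stand-in base class for this sketch; NOT the Crawl4AI definition."""

    @abstractmethod
    def confidence(self, query: str, pages: list[str]) -> float:
        """Return a score in [0, 1] for how well the pages cover the query."""

    @abstractmethod
    def rank_links(self, query: str, links: list[str]) -> list[str]:
        """Order candidate links from most to least promising."""


class KeywordStrategy(CrawlStrategy):
    """Hypothetical strategy that relies on literal keyword matches."""

    def confidence(self, query, pages):
        terms = set(query.lower().split())
        if not terms:
            return 0.0
        text = " ".join(pages).lower()
        # Share of query terms found anywhere in the collected text.
        return sum(1 for term in terms if term in text) / len(terms)

    def rank_links(self, query, links):
        terms = query.lower().split()
        # sorted() is stable, so links with equal counts keep their order.
        return sorted(
            links,
            key=lambda link: -sum(term in link.lower() for term in terms),
        )


if __name__ == "__main__":
    strategy = KeywordStrategy()
    print(strategy.confidence("async context", ["Async functions", "nothing"]))
    print(strategy.rank_links("async io", ["/docs/threads", "/docs/async-io"]))
```

Keeping scoring and link ranking behind one interface lets the crawl loop stay unchanged while the strategy is swapped out.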

7. `export_import_kb.py`

- Export crawled knowledge base to JSONL
- Import and continue crawling from saved state
- Useful for building persistent knowledge bases
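For readers unfamiliar with JSONL, the round trip described above can be sketched with the standard library alone. This is not the Crawl4AI export code: the helpers `export_kb`, `import_kb`, and `resume`, and the `url`/`content` record fields, are assumptions chosen for illustration.

```python
import json
import tempfile
from pathlib import Path


def export_kb(documents, path):
    """Write one JSON object per line (the JSONL convention)."""
    with open(path, "w", encoding="utf-8") as fh:
        for doc in documents:
            fh.write(json.dumps(doc, ensure_ascii=False) + "\n")


def import_kb(path):
    """Read a JSONL file back into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


def resume(saved, new_documents):
    """Continue from saved state: append only documents whose URL is new."""
    seen = {doc["url"] for doc in saved}
    return saved + [doc for doc in new_documents if doc["url"] not in seen]


if __name__ == "__main__":
    docs = [
        {"url": "https://example.com/a", "content": "alpha"},
        {"url": "https://example.com/b", "content": "beta"},
    ]
    with tempfile.TemporaryDirectory() as tmp:
        kb_path = Path(tmp) / "kb.jsonl"
        export_kb(docs, kb_path)
        restored = import_kb(kb_path)
    # A later session adds one page seen before and one new page.
    later = [
        {"url": "https://example.com/a", "content": "alpha"},
        {"url": "https://example.com/c", "content": "gamma"},
    ]
    print(len(resume(restored, later)))
```

One record per line means a saved file can be appended to or streamed without loading everything into memory, which suits a knowledge base that grows across sessions.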

Quick Start

For your first adaptive crawling experience, run:

python basic_usage.py

To try the new embedding strategy with semantic understanding:

python embedding_strategy.py

To compare strategies and see which works best for your use case:

python embedding_vs_statistical.py
