mirror of
https://github.com/unclecode/crawl4ai.git
synced 2026-06-10 15:58:15 +00:00
Commit Graph
Select branches
0.3.5
0.3.6
0.3.7
0.3.72
0.3.73
0.3.74
0.3.742
0.3.743
0.3.744
0.3.745
0.3.75
0.4.0
0.4.1
0.4.2
2025-JUN-1
add-claude-github-actions-1759553116682
bug/proxy_config
bugfix/arun-many-cdp-managed-browser
claude/fix-update-pyopenssl-security-011CUPexU25DkNvoxfu5ZrnB
claude/implement-webhook-crawl-feature-011CULZY1Jy8N5MUkZqXkRVp
coderabbitai/docstrings/14vTVzYa3bH06l5wYNY9jTghrrj9FxxWL
codex/add-httpx-and-https-http2]-packages
codex/add-memory_wait_timeout-parameter-to-memoryadaptivedispatche
codex/add-use_stemming-parameter-to-bm25contentfiler
codex/add-vnc-streaming-endpoint-to-docker-server
codex/find-and-fix-a-bug
codex/fix-indexerror-in-browser-manager-py-with-use-managed-browse
copilot/modify-page-creation-and-logging
deploy
develop
devin/1748137705-fix-bm25contentfilter-docs
docker-test
docker/add_features
docker/base_config_overrides
docker/fix_sig
docs
docs-llm-strategies-update
docs-proxy-security
extract-media
feat/ahmed_dev
feat/follow-frameset
feat/undetected-browser
feature/agent-oai
feature/async-llm-extaction
feature/c4a-script
feature/configHealthMonitor
feature/content-filter
feature/content-filter-nasrin-1
feature/docker-cluster
feature/docker-hooks
feature/docker-llm-parameters
feature/marketplace-sponsor-logo
feature/nasrin-cli-deep-crawl
feature/scraper
feature/scraping-strategy
feature/telemetry
fix-async-url-seeder-redirect-verification
fix-cors-disable-web-security
fix/adaptive-crawler-llm-config
fix/arun-return-type-1898
fix/async-llm-extraction-arunMany
fix/batch-easy-issues-10
fix/bedrock-provider-prefix
fix/case_senstive_params
fix/cdp
fix/configurable-backoff
fix/deep-crawl-scoring
fix/deep-crawl-scoring-priority
fix/deep-crawl-stream-docker
fix/deep-crawl-streaming-contextvar-1917
fix/deprecated_pydantic
fix/deserialize-schema-type-false-positive
fix/dfs_deep_crawling
fix/docker
fix/docker-filter
fix/docker-jwt
fix/docker-llmEnvFile
fix/exit_with_q
fix/https-reditrect
fix/issue-1748-screenshot-scroll-delay
fix/issue-1776-adaptive-external-filter
fix/json-infinity-serialization
fix/linkPreviewScoring
fix/marketplace
fix/mcp-crawler-config-passthrough
fix/mcp-ensure-ascii-cjk-encoding
fix/n-playwright-stealth
fix/nlp-sentence-chunking-1909
fix/playwright-stealth
fix/preserve-tail-text-1938
fix/proxy_deprecation
fix/rate-limiter-burst-and-headers-1095
fix/relative_url
fix/release-notes-demo-code
fix/request-crawl-stream
fix/sandbox-escape-allowlist-attrs
fix/serialize-proxy-config
fix/sitemap_seeder
fix/timeline-deadlock-shared-lock-1754
fix/viewport_in_managed_browser
format-inline-tags
hooks
image-description
image-filterizer
implement-webhook-crawl-feature-011CULZY1Jy8N5MUkZqXkRVp
integrate-verified-prs
main
main-0.3.7
main-1
main-75
main-img-captionify
main-v0.2.72
merge-pr971
new-release-0.0.2
new-release-0.0.2-no-spacy
next
next-2-batch-crawl
next-JUN
next-MAY
next-alpine-docker
next-browser-farm
patch/generate_schema
pdf_processing
proxy-support
pull-84
release/v0.7.0
release/v0.7.1
release/v0.7.2
release/v0.7.3
release/v0.7.4
release/v0.7.5
release/v0.7.6
release/v0.7.7
release/v0.7.8
release/v0.8.0
release/v0.8.5
release/v0.8.7
release/v0.8.8
release/v0.8.9
run-many-deep-crawling
scraper-uc
scrapper
sponsors/thor_data
ssh-server
staging
unclecode-patch-1
unclecode-patch-2
unclecode-patch-3
unclecode-patch-4
unclecode-patch-5
unclecode-patch-6
unclecode-patch-7
unclecode-patch-8
unclecode/issue157
unclecode/issue167
v0.2.74
v0.2.76
v0.4.24
v0.4.241
v0.4.242
v0.4.243
v0.5.5
vr0.4.244
vr0.4.245
vr0.4.246
vr0.4.267
vr0.4.3b1
vr0.4.3b2
vr0.4.3b3
vr0.5.0.post1
vr0.5.0.post5
#1004
#1030
#1054
#1058
#1059
#1060
#1062
#1065
#1068
#1073
#1074
#1077
#1078
#108
#1081
#1083
#1085
#1085
#109
#1090
#1093
#1094
#1098
#1100
#1102
#1104
#1106
#1107
#1108
#1110
#1113
#1122
#1123
#1124
#1124
#1133
#1137
#1140
#1145
#1152
#1155
#1155
#1156
#1157
#1159
#1161
#1170
#1175
#1179
#1180
#1184
#1186
#119
#1192
#1193
#1195
#1200
#1207
#1208
#1209
#1210
#1211
#1212
#1214
#1220
#1223
#1225
#1232
#1234
#1238
#1239
#1245
#1249
#125
#1255
#1263
#1265
#1266
#1267
#1272
#1274
#128
#1281
#1282
#1285
#1289
#1289
#129
#1290
#1296
#13
#1303
#1304
#1305
#1307
#1308
#1313
#1319
#1334
#1334
#1336
#1337
#1339
#134
#135
#1351
#1356
#1358
#1361
#1364
#1366
#1368
#1369
#1371
#1372
#1373
#1376
#1378
#1381
#1383
#1384
#1386
#1387
#1388
#1389
#139
#1390
#1393
#1395
#1398
#1399
#14
#1402
#1408
#1413
#1416
#1417
#1420
#1422
#1425
#1426
#1432
#1433
#1435
#1436
#1440
#1441
#1444
#1447
#1448
#1450
#1451
#1454
#1463
#1464
#1465
#1467
#1469
#1470
#1471
#1478
#1482
#1483
#1486
#1488
#149
#1494
#1495
#1496
#1497
#1501
#1508
#1513
#1514
#1518
#1519
#1525
#1527
#1528
#1529
#1530
#1531
#1532
#1533
#1533
#1535
#1536
#1537
#1539
#1546
#1547
#1548
#1550
#1554
#1555
#1556
#1557
#1558
#1560
#1565
#1568
#1569
#1570
#1572
#1576
#158
#1580
#1588
#1589
#1590
#1592
#1595
#1596
#1597
#1598
#1599
#1600
#1605
#1607
#1609
#1612
#1613
#1617
#1617
#1619
#1620
#1622
#1623
#1624
#1628
#1630
#1633
#1637
#1640
#1641
#1643
#1645
#1648
#1650
#1653
#1655
#1661
#1662
#1667
#1668
#1674
#1676
#1677
#1681
#1683
#1685
#1689
#169
#1694
#1696
#1697
#1698
#1700
#1702
#1703
#1706
#1707
#1710
#1712
#1713
#1714
#1715
#1716
#1717
#1718
#1719
#172
#1720
#1721
#1722
#1723
#1724
#1729
#1730
#1733
#1734
#1744
#1746
#1752
#1755
#1756
#1756
#1759
#176
#1760
#1761
#1763
#1764
#1765
#1766
#1768
#1770
#1771
#1772
#1773
#1774
#1775
#1777
#1778
#1782
#1783
#1784
#1785
#1786
#1787
#1788
#1789
#1790
#1791
#1792
#1793
#1794
#1795
#1796
#1798
#1803
#1804
#1805
#1806
#1807
#1807
#1808
#1808
#1809
#1809
#1810
#1810
#1811
#1811
#1812
#1812
#1813
#1814
#1814
#1816
#1816
#1822
#1822
#1823
#1824
#1826
#1827
#1828
#1829
#1830
#1831
#1832
#1833
#1834
#1835
#1835
#1836
#1838
#1838
#1840
#1840
#1844
#1845
#1846
#1847
#1847
#1849
#1851
#1852
#1853
#1853
#1854
#1854
#1855
#1856
#1856
#1857
#1857
#1858
#1858
#1859
#1859
#1860
#1860
#1861
#1861
#1862
#1862
#1866
#1866
#1868
#1868
#1869
#1869
#1870
#1870
#1871
#1871
#1873
#1873
#1874
#1874
#1875
#1875
#1876
#1876
#1877
#1879
#1881
#1881
#1882
#1884
#1884
#1885
#1886
#1887
#1887
#1891
#1891
#1892
#1892
#1893
#1893
#1895
#1895
#1896
#1896
#1897
#1899
#1899
#1901
#1902
#1902
#1904
#1904
#1906
#1906
#1907
#1908
#1908
#1910
#1911
#1913
#1914
#1915
#1915
#1922
#1923
#1923
#1925
#1929
#1931
#1932
#1932
#1933
#1934
#1935
#1935
#1936
#1937
#1939
#194
#1940
#1941
#1941
#1943
#1944
#1944
#1946
#1946
#1947
#1951
#1952
#1953
#1955
#1955
#1957
#1957
#1960
#1965
#1965
#1967
#1969
#1970
#1970
#1971
#1975
#1976
#1977
#1977
#1978
#1979
#1981
#1983
#1983
#1984
#1984
#1985
#1985
#1986
#1986
#1987
#1987
#1988
#1988
#1989
#1990
#1991
#1991
#1993
#1993
#1994
#1994
#1995
#1995
#1997
#1997
#200
#2001
#2001
#2003
#2003
#2004
#2004
#2005
#2005
#2008
#2008
#2009
#2009
#215
#218
#229
#232
#234
#24
#249
#255
#269
#271
#279
#286
#288
#293
#294
#298
#299
#3
#300
#304
#312
#313
#314
#324
#33
#332
#335
#337
#34
#357
#358
#369
#37
#379
#387
#389
#390
#394
#403
#410
#411
#416
#419
#419
#427
#440
#444
#445
#458
#462
#465
#472
#475
#496
#510
#562
#581
#60
#605
#606
#609
#612
#617
#618
#622
#64
#640
#65
#657
#658
#66
#662
#671
#679
#680
#681
#685
#687
#706
#708
#723
#724
#729
#734
#741
#749
#75
#752
#754
#775
#776
#777
#788
#792
#799
#80
#800
#806
#808
#821
#84
#84
#846
#85
#864
#865
#868
#891
#899
#901
#903
#914
#915
#916
#918
#929
#93
#931
#945
#948
#95
#961
#967
#969
#970
#971
#973
#977
#983
#988
#988
#990
#994
#999
0.3.4
checkpoint-pre-antibot-fallback
docker-rebuild-v0.7.5
docker-rebuild-v0.7.6
docker-rebuild-v0.7.7
docker-rebuild-v0.7.8
docker-rebuild-v0.8.0
docker-rebuild-v0.8.5
docker-rebuild-v0.8.6
docker-rebuild-v0.8.7
docker-rebuild-v0.8.8
docker-rebuild-v0.8.9
v.3.72
v0.0.75
v0.1.0
v0.2.0
v0.2.1
v0.2.2
v0.2.4
v0.2.6
v0.2.7
v0.2.71
v0.2.72
v0.2.73
v0.2.74
v0.2.77
v0.3.0
v0.3.3
v0.3.6
v0.3.745
v0.3.746
v0.4.24
v0.4.243
v0.5.0.post1
v0.6.3
v0.7.0
v0.7.1
v0.7.2
v0.7.3
v0.7.4
v0.7.5
v0.7.6
v0.7.7
v0.7.8
v0.8.0
v0.8.5
v0.8.6
v0.8.7
v0.8.8
v0.8.9
vr0.6.0
vr0.6.0rc1
vr0.6.3
-
84b311760f
Commit Message: Enhance Crawl4AI with CLI and documentation updates - Implemented Command-Line Interface (CLI) in crawl4ai/cli.py - Added chunking strategies and their documentation in llm.txt
UncleCode
2024-12-21 14:26:56 +08:00 -
8fbc2e0463
Refactor deployment configuration and enhance browser debugging options
UncleCode
2024-12-20 20:35:28 +08:00 -
849765712f
Enhance Crawl4AI with new features and documentation
UncleCode
2024-12-19 21:02:29 +08:00 -
7a5f83b76f
fix: Added browser config and crawler run config from 0.4.22
Aravind Karnam
2024-12-18 10:33:09 +05:30 -
393bb911c0
Enhance crawler strategies with new features - ReImplemented JsonXPathExtractionStrategy for enhanced JSON data extraction. - Updated existing extraction strategies for better performance. - Improved handling of response status codes during crawls.
UncleCode
2024-12-17 22:40:10 +08:00 -
7c0fa269a6
Merge pull request #9 from aravindkarnam/main
aravind
2024-12-17 18:43:36 +05:30 -
4a5f1aebee
Bump version to 0.4.23
UncleCode
2024-12-16 18:53:11 +08:00 -
a11d9646e3
Enhance crawler features and improve documentation
UncleCode
2024-12-16 18:52:51 +08:00 -
ed7bc1909c
Bump version to 0.4.22
UncleCode
2024-12-15 19:49:38 +08:00 -
e9e5b5642d
Fix js_snipprt issue 0.4.21 bump to 0.4.22
UncleCode
2024-12-15 19:49:30 +08:00 -
7524aa7b5e
Feature: Add Markdown generation to CrawlerRunConfig
UncleCode
2024-12-13 21:51:38 +08:00 -
b1ac4fe023
Merge branch 'main' into ssh-server
ssh-server
Unclecode
2024-12-12 12:25:26 +00:00 -
a3c92141a1
Merge branch 'main' of https://github.com/unclecode/crawl4ai
Unclecode
2024-12-12 12:25:01 +00:00 -
3fd777dd6f
remove crawl endpoints
Unclecode
2024-12-12 12:24:13 +00:00 -
7af1d32ef6
Update README for version 0.4.2: Reflect new features and enhancements
0.4.2
UncleCode
2024-12-12 20:18:44 +08:00 -
399af801a1
Merge branch 'next'
UncleCode
2024-12-12 20:17:27 +08:00 -
4a72c5ea6e
Add release notes and documentation for version 0.4.2: Configurable Crawlers, Session Management, and Enhanced Screenshot/PDF features
UncleCode
2024-12-12 20:15:50 +08:00 -
20d6f5fdf4
Merge branch 'main' of https://github.com/unclecode/crawl4ai
UncleCode
2024-12-12 19:58:01 +08:00 -
3d69715dba
chore: Update .gitignore to include new files and directories
UncleCode
2024-12-12 19:57:59 +08:00 -
de1766d565
Bump version to 0.4.2
UncleCode
2024-12-12 19:35:30 +08:00 -
0982c639ae
Enhance AsyncWebCrawler and related configurations
UncleCode
2024-12-12 19:35:09 +08:00 -
5188b7a6a0
Add full-page screenshot and PDF export features - Introduced a new approach for capturing full-page screenshots by exporting them as PDFs first, enhancing reliability and performance. - Added documentation for the feature in docs/examples/full_page_screenshot_and_pdf_export.md. - Refactored perform_completion_with_backoff in crawl4ai/utils.py to include necessary extra parameters. - Updated quickstart_async.py to utilize LLM extraction with refined arguments.
UncleCode
2024-12-10 20:59:31 +08:00 -
759164831d
Update async_webcrawler.py (#337)
lvzhengri
2024-12-10 20:56:52 +08:00 -
5431fa2d0c
Add PDF & screenshot functionality, new tutorial
UncleCode
2024-12-10 20:10:39 +08:00 -
e130fd8db9
Implement new async crawler features and stability updates
UncleCode
2024-12-10 17:55:29 +08:00 -
ded554d334
Fixed typo (#324)
Mohammed
2024-12-09 07:17:43 -05:00 -
aadbcb3481
fix: Improve image loading handling by adding timeout for wait_for_function in AsyncPlaywrightCrawlerStrategy
0.4.1
UncleCode
2024-12-09 20:06:29 +08:00 -
2d31915f0a
Commit Message: Enhance Async Crawler with storage state handling - Updated Async Crawler to support storage state management. - Added error handling for URL validation in Async Web Crawler. - Modified README logo and improved .gitignore entries. - Fixed issues in multiple files for better code robustness.
UncleCode
2024-12-09 20:04:59 +08:00 -
ba3e808802
fix: The extract method logs output only when self.verbose is set to True. (#314)
lu4nx
2024-12-09 17:19:26 +08:00 -
e3488da194
fixing Readmen tap (#313)
Olavo Henrique Marques Peixoto
2024-12-09 03:34:52 -03:00 -
d7200138a0
Merge branch 'main' of https://github.com/unclecode/crawl4ai
Unclecode
2024-12-08 12:06:53 +00:00 -
740214e021
Merge branch 'next'
UncleCode
2024-12-08 20:06:36 +08:00 -
c51e901f68
feat: Enhance AsyncPlaywrightCrawlerStrategy with text-only and light modes, dynamic viewport adjustment, and session management
UncleCode
2024-12-08 20:04:44 +08:00 -
8c611dcb4b
Refactored web scraping components
UncleCode
2024-12-05 22:33:47 +08:00 -
be37abe05a
Merge branch 'main' of https://github.com/unclecode/crawl4ai
Unclecode
2024-12-04 12:31:45 +00:00 -
90ba51b52f
fix(mkdocs): correct typo in Docker Deployment navigation entry
Unclecode
2024-12-04 12:31:41 +00:00 -
a45b8b1eb1
Merge issues with 0.4.0 is over
UncleCode
2024-12-04 20:29:25 +08:00 -
56f82f3e7f
Merge branch 'next'
UncleCode
2024-12-04 20:27:35 +08:00 -
486db3a771
Updated to version 0.4.0 with new features - Enhanced error handling in async crawler. - Added flexible options in Markdown generation. - Updated user agent settings for improved reliability. - Reflected changes in documentation and examples.
0.4.0
UncleCode
2024-12-04 20:26:39 +08:00 -
b02544bc0b
docs: update README and blog for version 0.4.0 release, highlighting new features and improvements
UncleCode
2024-12-03 21:28:52 +08:00 -
e9639ad189
refactor: improve error handling in DataProcessor and optimize data parsing logic
UncleCode
2024-12-03 19:44:38 +08:00 -
95a4f74d2a
fix: pass logger to WebScrapingStrategy and update score computation in PruningContentFilter
UncleCode
2024-12-02 20:37:28 +08:00 -
293f299c08
Add PruningContentFilter with unit tests and update documentation
unclecode
2024-12-01 19:17:33 +08:00 -
80d58ad24c
bump version to 0.3.747
UncleCode
2024-11-30 22:00:15 +08:00 -
3e83893b3f
Enhance User-Agent Handling
UncleCode
2024-11-30 18:13:12 +08:00 -
8c76a8c7dc
docs: add contributor entry for dvschuyl regarding AsyncPlaywrightCrawlerStrategy issue
v0.3.746
UncleCode
2024-11-29 21:14:49 +08:00 -
0780db55e1
fix: handle errors during image dimension updates in AsyncPlaywrightCrawlerStrategy
UncleCode
2024-11-29 21:12:19 +08:00 -
1ed7c15118
🩹 Page-evaluate navigation destroyed error (#304)
dvschuyl
2024-11-29 14:06:04 +01:00 -
569bdb6073
Merge branch 'next'
UncleCode
2024-11-29 20:54:28 +08:00 -
1def53b7fe
docs: update Raspberry Pi section to indicate upcoming support
UncleCode
2024-11-29 20:53:43 +08:00 -
f9c98a377d
Enhance Docker support and improve installation process - Added new Docker commands for platform-specific builds. - Updated README with comprehensive installation and setup instructions. - Introduced post_install method in setup script for automation. - Refined migration processes with enhanced error logging. - Bump version to 0.3.746 and updated dependencies.
UncleCode
2024-11-29 20:52:51 +08:00 -
93bf3e8a1f
Refactor Dockerfile and clean up main.py - Enhanced Dockerfile for platform-specific installations - Added ARG for TARGETPLATFORM and BUILDPLATFORM - Improved GPU support conditional on TARGETPLATFORM - Removed static pages mounting in main.py - Streamlined code structure to improve maintainability
UncleCode
2024-11-29 20:08:09 +08:00 -
d202f3539b
Enhance installation and migration processes - Added a post-installation setup script for initialization. - Updated README with installation notes for Playwright setup. - Enhanced migration logging for better error visibility. - Added 'pydantic' to requirements. - Bumped version to 0.3.746.
UncleCode
2024-11-29 18:48:44 +08:00 -
12e73d4898
refactor: remove legacy build hooks and setup files, migrate to setup.cfg and pyproject.toml
UncleCode
2024-11-29 16:01:19 +08:00 -
449dd7cc0b
Migrating from the classic setup.py to a using PyProject approach.
unclecode
2024-11-29 14:45:04 +08:00 -
b0419edda6
Update README.md (#300)
UncleCode
2024-11-29 02:31:17 +08:00 -
c0e87abaee
fix: update package versions in requirements.txt for compatibility
0.3.745
UncleCode
2024-11-28 21:43:08 +08:00 -
c8485776fe
docs: update README to reflect latest version v0.3.745
v0.3.745
UncleCode
2024-11-28 20:04:16 +08:00 -
aa3e2d0fe6
Merge branch 'main' of https://github.com/unclecode/crawl4ai
UncleCode
2024-11-28 20:03:43 +08:00 -
98c64f9d5f
Merge branch 'next'
UncleCode
2024-11-28 20:03:11 +08:00 -
7d81c17cca
fix: improve handling of CRAWL4_AI_BASE_DIRECTORY environment variable in setup.py
UncleCode
2024-11-28 20:02:39 +08:00 -
652d396a81
chore: update version to 0.3.745
UncleCode
2024-11-28 20:00:29 +08:00 -
1d83c493af
Enhance setup process and update contributors list - Acknowledge contributor paulokuong for fixing RAWL4_AI_BASE_DIRECTORY issue - Refine base directory handling in setup.py - Clarify Playwright installation instructions and improve error handling
UncleCode
2024-11-28 19:58:40 +08:00 -
cf35cbe59e
CRAWL4_AI_BASE_DIRECTORY should be Path object instead of string (#298)
Paulo Kuong
2024-11-28 06:46:36 -05:00 -
9221c08418
docs: fix link formatting for recent updates section in README
UncleCode
2024-11-28 19:33:36 +08:00 -
48d43c14b1
docs: fix link formatting for recent updates section in README
UncleCode
2024-11-28 19:33:02 +08:00 -
776efa74a4
docs: fix link formatting for recent updates section in README
UncleCode
2024-11-28 19:32:32 +08:00 -
b14e83f499
docs: fix link formatting for recent updates section in README
UncleCode
2024-11-28 19:31:09 +08:00 -
a9b6b65238
chore: update version to 0.3.744 and add publish.sh to .gitignore
0.3.744
UncleCode
2024-11-28 19:26:50 +08:00 -
a036b7f122
feat: implement create_box_message utility for formatted error messages and enhance error logging in AsyncWebCrawler
UncleCode
2024-11-28 19:24:07 +08:00 -
0bccf23db3
docs: update quickstart_async.py to enable example function calls for better demonstration
UncleCode
2024-11-28 18:19:42 +08:00 -
0cbd594512
Merge branch 'next' - Update README, and quickstart examples
UncleCode
2024-11-28 16:43:16 +08:00 -
efe93a5f57
docs: enhance README with development TODOs and refine mission statement for clarity
UncleCode
2024-11-28 16:41:11 +08:00 -
3fda66b85b
docs: refine README content for clarity and conciseness, improving descriptions and formatting
UncleCode
2024-11-28 16:36:24 +08:00 -
ddfb6707b4
docs: update README to reflect new branding and improve section headings for clarity
UncleCode
2024-11-28 16:34:08 +08:00 -
a69f7a9531
fix: correct typo in function documentation for clarity and accuracy
UncleCode
2024-11-28 16:31:41 +08:00 -
d583aa43ca
refactor: update cache handling in quickstart_async example to use CacheMode enum
UncleCode
2024-11-28 15:53:25 +08:00 -
3abb573142
docs: update README for version 0.3.743 with improved formatting and contributor acknowledgments
UncleCode
2024-11-28 13:07:59 +08:00 -
d556dada9f
docs: update README to keep details open for extraction capabilities, browser integration, input/output flexibility, utility & debugging, security & accessibility, community & documentation, and cutting-edge features
UncleCode
2024-11-28 13:07:33 +08:00 -
ce7d49484f
docs: update README for version 0.3.743 with new features, enhancements, and contributor acknowledgments
UncleCode
2024-11-28 13:06:46 +08:00 -
e4acd18429
docs: update README for version 0.3.743 with new features, enhancements, and contributor acknowledgments
UncleCode
2024-11-28 13:06:30 +08:00 -
c2d4784810
fix: resolve merge conflict in DefaultMarkdownGenerator affecting fit_markdown generation
0.3.743
UncleCode
2024-11-28 12:56:31 +08:00 -
76bea6c577
Merge branch 'main' into 0.3.743
UncleCode
2024-11-28 12:53:30 +08:00 -
3ff0b0b2c4
feat: update changelog for version 0.3.743 with new features, improvements, and contributor acknowledgments
UncleCode
2024-11-28 12:48:07 +08:00 -
a1c7dc17ce
Merge branch 'next' of https://github.com/unclecode/crawl4ai into next
UncleCode
2024-11-28 12:45:57 +08:00 -
24723b2f10
Enhance features and documentation - Updated version to 0.3.743 - Improved ManagedBrowser configuration with dynamic host/port - Implemented fast HTML formatting in web crawler - Enhanced markdown generation with a new generator class - Improved sanitization and utility functions - Added contributor details and pull request acknowledgments - Updated documentation for clearer usage scenarios - Adjusted tests to reflect class name changes
UncleCode
2024-11-28 12:45:05 +08:00 -
f998e9e949
Fix: handled the cases where markdown_with_citations, references_markdown, and filtered_html might not be defined. (#293)
Hamza Farhan
2024-11-27 16:20:54 +05:00 -
73661f7d1f
docs: enhance development installation instructions (#286)
zhounan
2024-11-27 15:04:20 +08:00 -
b5d4db07d1
Merge branch 'main' of https://github.com/unclecode/crawl4ai
UncleCode
2024-11-27 14:55:58 +08:00 -
c6a022132b
docs: update CONTRIBUTORS.md to acknowledge aadityakanjolia4 for fixing 'CustomHTML2Text' bug
UncleCode
2024-11-27 14:55:56 +08:00 -
2f5e0598bb
updated definition of can_process_url to include dept as an argument, as it's needed to skip filters for start_url
Aravind Karnam
2024-11-26 18:26:57 +05:30 -
ff731e4ea1
fixed the final scraper_quickstart.py example
Aravind Karnam
2024-11-26 17:08:32 +05:30 -
9530ded83a
fixed the final scraper_quickstart.py example
Aravind Karnam
2024-11-26 17:05:54 +05:30 -
155c756238
<Future pending> issue fix was incorrect. Reverting
Aravind Karnam
2024-11-26 17:04:04 +05:30 -
a888c91790
Fix "Future attached to a different loop" error by ensuring tasks are created in the correct event loop
Aravind Karnam
2024-11-26 14:05:02 +05:30 -
a98d51a62c
Remove the can_process_url check from _process_links since it's already being checked in process_url
Aravind Karnam
2024-11-26 11:11:49 +05:30 -
ee3001b1f7
fix: moved depth as a param to can_process_url and applying filter chain only when depth is not zero. This way filter chain is skipped but other validations are in place even for start URL
Aravind Karnam
2024-11-26 10:22:14 +05:30 -
b13fd71040
chore: 1. Expose process_external_links as a param 2. Removed a few unused imports 3. Removed URL normalisation for external links separately as that won't be necessary
Aravind Karnam
2024-11-26 10:07:11 +05:30 -
195c0ccf8a
chore: remove deprecated Docker Compose configurations for crawl4ai service
unclecode
2024-11-24 19:40:27 +08:00 -
b09a86c0c1
chore: remove deprecated Docker Compose configurations for crawl4ai service
unclecode
2024-11-24 19:40:10 +08:00