mirror of
https://github.com/unclecode/crawl4ai.git
synced 2026-06-11 00:08:01 +00:00
Commit Graph
-
36e46be23d
chore: Add verbose option to ExtractionStrategy classes
unclecode
2024-05-17 18:06:10 +08:00 -
32c87f0388
chore: Update NlpSentenceChunking constructor parameters to None
unclecode
2024-05-17 17:00:43 +08:00 -
647cfda225
chore: Update Crawl4AI quickstart script in README.md
unclecode
2024-05-17 16:55:34 +08:00 -
1cc67df301
chore: Update pip installation command and requirements, add new dependencies
unclecode
2024-05-17 16:53:03 +08:00 -
d7b37e849d
chore: Update CrawlRequest model to use NoExtractionStrategy as default
unclecode
2024-05-17 16:50:38 +08:00 -
f52f526002
chore: Update web_crawler.py to use NoExtractionStrategy as default
unclecode
2024-05-17 16:03:35 +08:00 -
3593f017d7
chore: Update setup.py to exclude torch, transformers, and nltk dependencies
unclecode
2024-05-17 16:01:04 +08:00 -
e7bb76f19b
chore: Update torch dependency to version 2.3.0
unclecode
2024-05-17 15:52:39 +08:00 -
593b928967
Update requirements.txt to include latest versions of dependencies
unclecode
2024-05-17 15:48:14 +08:00 -
bb3d37face
chore: Update requirements.txt to include latest versions of dependencies
unclecode
2024-05-17 15:32:37 +08:00 -
3f8576f870
chore: Update model_loader.py to use pretrained models without resume_download
unclecode
2024-05-17 15:26:15 +08:00 -
bf3b040f10
chore: Update pip installation command and requirements, add new dependencies
unclecode
2024-05-17 15:21:45 +08:00 -
a317dc5e1d
Load CosineStrategy in the function
unclecode
2024-05-17 15:13:06 +08:00 -
a5f9d07dbf
Remove dependency on Spacy model.
unclecode
2024-05-17 15:08:03 +08:00 -
f85df91ca6
chore: Update README.md with Colab badge
new-release-0.0.2
unclecode
2024-05-17 00:21:16 +08:00 -
6fcaf26b4f
Update quickstart.py: Add counting items
UncleCode
2024-05-16 22:49:12 +08:00 -
5b4a586b2d
Update web_crawler.py
UncleCode
2024-05-16 22:28:24 +08:00 -
a856319499
Update web_crawler.py
UncleCode
2024-05-16 22:06:33 +08:00 -
5ce1dc1622
Update web_crawler.py
UncleCode
2024-05-16 21:58:11 +08:00 -
ea16dec587
Improve library loading
unclecode
2024-05-16 21:19:02 +08:00 -
d19488a821
chore: Update model_loader.py to create necessary folders in the home directory
unclecode
2024-05-16 21:05:24 +08:00 -
199c66114c
chore: Update pip installation command and requirements, add new dependencies
unclecode
2024-05-16 20:58:36 +08:00 -
45569d058d
chore: Update pip installation command and requirements for Crawl4AI
unclecode
2024-05-16 20:42:53 +08:00 -
5bb0b0b378
chore: Update pip installation command and requirements for Crawl4AI
unclecode
2024-05-16 20:36:29 +08:00 -
4006f5f4e2
chore: Update pip installation command to use sys.executable
unclecode
2024-05-16 20:24:48 +08:00 -
7e0682e0de
chore: Update dependencies and installation process
unclecode
2024-05-16 20:22:50 +08:00 -
8e28eb9efb
Add model loader, update requirements.txt
unclecode
2024-05-16 20:08:21 +08:00 -
c8589f8da3
Update: - Fix Spacy model issue - Update Readme and requirements.txt
unclecode
2024-05-16 19:50:20 +08:00 -
6a6365ae0a
Refactor code to exclude the extraction of semantical blocks of text from the HTML
unclecode
2024-05-16 18:10:55 +08:00 -
5b80be956d
Update: - Debug - Refactor code for new version
unclecode
2024-05-16 17:31:44 +08:00 -
4a2e17447b
Update README.md
UncleCode
2024-05-16 08:57:58 +08:00 -
f6e59157bf
- Test all methods - Update index.html - Update Readme - Resolve some bugs
unclecode
2024-05-14 21:27:41 +08:00 -
5fea6c064b
Improve libraries import
unclecode
2024-05-13 02:46:35 +08:00 -
11393183f7
Add Colab setup script.
unclecode
2024-05-13 00:39:06 +08:00 -
7679064521
Add model parameter for clustering.
unclecode
2024-05-13 00:06:16 +08:00 -
cf087cfa58
Replace embedding model with smaller one
unclecode
2024-05-12 23:55:57 +08:00 -
5693e324a4
Add time measurements.
unclecode
2024-05-12 23:35:27 +08:00 -
b38bf64490
Exclude spaCy from requirements.txt
unclecode
2024-05-12 22:59:26 +08:00 -
82706129f5
Update: - Text Categorization - Crawler, Extraction, and Chunking strategies - Clustering for semantic segmentation
unclecode
2024-05-12 22:37:21 +08:00 -
7039e3c1ee
- Issue Resolved: Every <pre> tag's HTML content is replaced with its inner text to address situations like syntax highlighters, where each character might be in a <span>. This avoids issues where the minimum word threshold might ignore them.
unclecode
2024-05-12 14:08:22 +08:00 -
8e536b9717
chore: Refactor README.md and project structure
unclecode
2024-05-12 12:41:42 +08:00 -
aac4e07389
chore: Update README.md and project structure
unclecode
2024-05-12 12:39:31 +08:00 -
e3960ace68
Update README.md
UncleCode
2024-05-11 22:11:16 +08:00 -
b0f97ab2b3
Update README.md
UncleCode
2024-05-11 08:56:19 +08:00 -
372c921429
Update: Fix bug, when user set extract_blocks to False
unclecode
2024-05-10 20:12:31 +08:00 -
aa126e436b
Add CORS middleware for allowing all origins to make requests
ntohidi
2024-05-10 12:27:40 +02:00 -
20ef255c7f
Update README
unclecode
2024-05-09 23:28:47 +08:00 -
da7748a780
Update README file
unclecode
2024-05-09 22:51:10 +08:00 -
f74f4e88c0
Update README file
unclecode
2024-05-09 22:48:42 +08:00 -
a8e7218769
chore: Update README.md and project structure
unclecode
2024-05-09 22:40:08 +08:00 -
84f093593a
Update README
unclecode
2024-05-09 22:37:45 +08:00 -
88643612e8
chore: Update environment variable usage in config files
unclecode
2024-05-09 22:37:01 +08:00 -
6f99bad6f0
Update web application URL in README.md
unclecode
2024-05-09 22:28:37 +08:00 -
50d7a7e45d
chore: Update forced flag for single page fetch to use default value
unclecode
2024-05-09 22:21:12 +08:00 -
c71dd9189b
chore: Update import statements to use crawl4ai package
unclecode
2024-05-09 22:17:15 +08:00 -
3ff1d15702
Change the project folder name from crawler to crawl4ai
unclecode
2024-05-09 22:16:28 +08:00 -
7ee8001b7d
Update README.md
UncleCode
2024-05-09 21:49:04 +08:00 -
b9d9d2bbd4
chore: Update URL for single page fetch to NBC News
unclecode
2024-05-09 20:05:59 +08:00 -
6320d07a93
chore: Update landing page URL and min words threshold
unclecode
2024-05-09 20:05:31 +08:00 -
181250cb93
chore: Add function to clear the database
unclecode
2024-05-09 19:42:43 +08:00 -
f7c031c097
chore: Remove unused code from test.py
unclecode
2024-05-09 19:26:37 +08:00 -
51095062d4
Update file names
unclecode
2024-05-09 19:26:16 +08:00 -
c71adb29ce
chore: Update .gitignore and README.md
unclecode
2024-05-09 19:25:25 +08:00 -
898ec30a18
chore: Update license information in README.md
unclecode
2024-05-09 19:14:48 +08:00 -
343c4477f8
Update Crawl4AI web application URL in README.md
unclecode
2024-05-09 19:13:20 +08:00 -
99e0dd1ccd
chore: Update README.md with installation instructions for Crawl4AI library and local server
unclecode
2024-05-09 19:12:39 +08:00 -
b8e743cd8d
Initial Commit
unclecode
2024-05-09 19:10:25 +08:00