unclecode
9b571bb947
feat: HTTP strategy detects and saves file downloads (CSV, PDF, etc.)
The HTTP crawler strategy now checks Content-Type and Content-Disposition
headers to detect non-HTML file responses. When a file download is
detected, raw bytes are saved to disk and the path is returned via
downloaded_files. Text-based files (CSV, JSON, XML) also populate the
html field for backward compatibility. Binary files (PDF, images) set
html to an empty string; their content is available only via
downloaded_files.
Adds downloads_path to HTTPCrawlerConfig (defaults to ~/.crawl4ai/downloads/).
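
The header check described above can be sketched roughly as follows. This is an illustrative sketch only, not the code from the commit: the function names (`classify_response`, `html_field_for`) and the media-type sets are hypothetical, and the actual strategy may use different lists and rules.

```python
# Hypothetical media-type sets for illustration; the commit's real lists are not shown.
HTML_TYPES = {"text/html", "application/xhtml+xml"}
TEXT_FILE_TYPES = {"text/csv", "application/json", "application/xml", "text/xml"}


def parse_media_type(content_type: str) -> str:
    """Return the lowercased media type without parameters such as charset."""
    return content_type.split(";", 1)[0].strip().lower()


def classify_response(headers: dict) -> str:
    """Classify a response as 'html', 'text_file', or 'binary_file'."""
    # Header names are case-insensitive, so normalize the keys first.
    lowered = {k.lower(): v for k, v in headers.items()}
    media_type = parse_media_type(lowered.get("content-type", ""))
    disposition = lowered.get("content-disposition", "").strip().lower()
    is_attachment = disposition.startswith("attachment")

    # Assumption: a missing Content-Type without an attachment hint is treated as HTML.
    if not is_attachment and (media_type in HTML_TYPES or not media_type):
        return "html"
    if media_type in TEXT_FILE_TYPES or media_type.startswith("text/"):
        return "text_file"
    return "binary_file"


def html_field_for(kind: str, body: bytes) -> str:
    """Text-based responses populate html; binary files leave it empty."""
    if kind == "binary_file":
        return ""
    return body.decode("utf-8", errors="replace")
```

For example, `classify_response({"Content-Type": "application/pdf"})` yields `"binary_file"`, so `html_field_for` returns an empty string and the caller would rely on the saved file path instead.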
2026-03-16 14:03:43 +00:00