Firecrawl
Web scraping and crawling API that turns websites into clean, LLM-ready markdown and structured data.
Firecrawl is rebuilding web data around agents and a brutal token economy
◆ Recent moves
- 8d ago
Firecrawl Research Index
⚡ SPARK: Firecrawl extends from scraping into retrieval with a specialized arXiv index claiming 53.3% recall on arXivQA, ahead of the next provider, bundling 3M+ papers with their GitHub code refreshed daily. It is the clearest sign the platform is moving up the stack from fetching pages to serving grounded answers.
- 29d ago
Introducing /monitor
⚡ SPARK: /monitor turns Firecrawl into a change-detection service for agents: describe what to watch in plain English and it configures URLs, schema, and cadence, then webhooks the agent only when something changes. It fits the token-economy thesis by ingesting deltas rather than full pages.
- 1mo ago
v2.10 is live
The v2.10 roundup ships the /parse endpoint, Lockdown Mode, the Question and Highlights formats, and four new SDKs (Go, Ruby, PHP, .NET), consolidating the quarter's token-efficiency and parsing work into one release. Most pieces were announced individually; this is the packaging and reliability pass.
- 1mo ago
Highlights Format
Highlights returns only the exact sentences, code blocks, and table rows matching a query, verbatim with no rewriting, at up to 100x fewer tokens. It is another increment in the format line built to cut what agents pay to read a page.
- 1mo ago
Question Format
Question returns a grounded answer drawn strictly from a page instead of the page itself, at up to 100x fewer tokens. Paired with Highlights, it rounds out a format family aimed at collapsing the scrape-parse-prompt pipeline into a single call.
- 1mo ago
Lockdown Mode
Lockdown Mode serves /scrape results only from Firecrawl's index, with no outbound requests and zero data retention by default. It is a privacy and security option aimed at making the platform usable for sensitive enterprise agent workloads.
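
The deltas-over-full-pages idea in the /monitor entry can be illustrated locally. The Python below is a minimal sketch under stated assumptions, not Firecrawl's implementation or API: `ChangeWatcher`, `fingerprint`, and `diff_fields` are hypothetical names, and each watched page is assumed to have already been reduced to a flat dict of extracted fields. It stores a hash per URL and reports only the fields that changed, which is roughly what a webhook payload would need to carry.

```python
import hashlib
import json


def fingerprint(record: dict) -> str:
    """Stable hash of a record; key order does not affect the result."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def diff_fields(old: dict, new: dict) -> dict:
    """Return only the fields that were added, removed, or changed."""
    changed = {}
    for key in sorted(set(old) | set(new)):
        if key not in old or key not in new or old[key] != new[key]:
            changed[key] = {"old": old.get(key), "new": new.get(key)}
    return changed


class ChangeWatcher:
    """Hypothetical helper: remembers the last record seen per URL."""

    def __init__(self):
        self._last = {}  # url -> (fingerprint, record)

    def observe(self, url: str, record: dict):
        """Return a delta dict if the record changed, otherwise None."""
        fp = fingerprint(record)
        prev = self._last.get(url)
        self._last[url] = (fp, dict(record))
        if prev is None or prev[0] == fp:
            # First sighting is only a baseline; identical hashes mean no change.
            return None
        return {"url": url, "changes": diff_fields(prev[1], record)}
```

In this sketch an agent would call `observe` on each polling cycle and act only on non-`None` results, so an unchanged page adds no tokens to its context.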