Firecrawl v2.5 API for web scraping/crawling to LLM-ready markdown. Use for site extraction and dynamic content, or when encountering JavaScript rendering, bot detection, or content loading errors.

$ Install

git clone https://github.com/secondsky/claude-skills /tmp/claude-skills && cp -r /tmp/claude-skills/plugins/firecrawl-scraper/skills/firecrawl-scraper ~/.claude/skills/claude-skills

// tip: Run this command in your terminal to install the skill


name: firecrawl-scraper
description: Firecrawl v2.5 API for web scraping/crawling to LLM-ready markdown. Use for site extraction and dynamic content, or when encountering JavaScript rendering, bot detection, or content loading errors.

Keywords: firecrawl, firecrawl api, web scraping, web crawler, scrape website, crawl website, extract content, html to markdown, site crawler, content extraction, web automation, firecrawl-py, firecrawl-js, llm ready data, structured data extraction, bot bypass, javascript rendering, scraping api, crawling api, map urls, batch scraping

license: MIT

Firecrawl Web Scraper Skill

Status: Production Ready ✅
Last Updated: 2025-11-21
Official Docs: https://docs.firecrawl.dev
API Version: v2.5


What is Firecrawl?

Firecrawl is a Web Data API for AI that turns entire websites into LLM-ready markdown or structured data. It handles:

  • JavaScript rendering - Executes client-side JavaScript to capture dynamic content
  • Anti-bot bypass - Gets past CAPTCHA and bot detection systems
  • Format conversion - Outputs as markdown, JSON, or structured data
  • Screenshot capture - Saves visual representations of pages
  • Browser automation - Full headless browser capabilities

API Endpoints

1. /v2/scrape - Single Page Scraping

Scrapes a single webpage and returns clean, structured content.

Use Cases:

  • Extract article content
  • Get product details
  • Scrape specific pages
  • Convert HTML to markdown

Key Options:

  • formats: ["markdown", "html", "screenshot"]
  • onlyMainContent: true/false (removes nav, footer, ads)
  • waitFor: milliseconds to wait before scraping
  • actions: browser automation actions (click, scroll, etc.)
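As a rough sketch of how these options fit together, the snippet below assembles a JSON body for /v2/scrape and wraps it in an unsent request object. The endpoint URL and option names come from this document; the helper name `build_scrape_request` and the placeholder key are illustrative only, and the exact shape of each `actions` entry should be checked against the official docs.

```python
import json
import urllib.request

SCRAPE_ENDPOINT = "https://api.firecrawl.dev/v2/scrape"


def build_scrape_request(url, api_key, formats=("markdown",),
                         only_main_content=True, wait_for_ms=None, actions=None):
    """Build (but do not send) a POST request for the scrape endpoint.

    Hypothetical helper: the option names mirror the list above.
    """
    body = {
        "url": url,
        "formats": list(formats),
        "onlyMainContent": only_main_content,
    }
    # Optional settings are only included when the caller sets them.
    if wait_for_ms is not None:
        body["waitFor"] = wait_for_ms
    if actions:
        body["actions"] = list(actions)
    return urllib.request.Request(
        SCRAPE_ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# "fc-placeholder" is a dummy value, not a real key.
req = build_scrape_request("https://example.com", "fc-placeholder", wait_for_ms=2000)
print(req.get_method(), req.full_url)
print(json.loads(req.data))
```

Passing `req` to `urllib.request.urlopen` would perform the actual call; keeping construction separate from sending makes the payload easy to inspect first.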

2. /v2/crawl - Full Site Crawling

Crawls all accessible pages from a starting URL.

Use Cases:

  • Index entire documentation sites
  • Archive website content
  • Build knowledge bases
  • Scrape multi-page content

Key Options:

  • limit: max pages to crawl
  • maxDepth: how many links deep to follow
  • allowedDomains: restrict to specific domains
  • excludePaths: skip certain URL patterns
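A minimal sketch of a crawl request body using the options listed above. The field names are taken from this list as written and should be verified against the official API reference; the helper `build_crawl_body`, the example URL, and the `/blog/*` pattern syntax are assumptions made for illustration.

```python
import json


def build_crawl_body(start_url, limit=100, max_depth=None,
                     allowed_domains=None, exclude_paths=None):
    """Return a JSON-serializable body for a crawl request (hypothetical helper)."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    body = {"url": start_url, "limit": limit}
    # Only send the optional restrictions that were actually requested.
    if max_depth is not None:
        body["maxDepth"] = max_depth
    if allowed_domains:
        body["allowedDomains"] = list(allowed_domains)
    if exclude_paths:
        body["excludePaths"] = list(exclude_paths)
    return body


crawl_body = build_crawl_body(
    "https://docs.example.com",
    limit=500,
    max_depth=3,
    exclude_paths=["/blog/*"],  # pattern syntax is an assumption
)
print(json.dumps(crawl_body, indent=2))
```

Setting `limit` explicitly on every crawl keeps credit usage predictable, since a crawl without a cap can reach far more pages than intended.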

3. /v2/map - URL Discovery

Maps all URLs on a website without scraping content.

Use Cases:

  • Find sitemap
  • Discover all pages
  • Plan crawling strategy
  • Audit website structure
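One way to use discovered URLs for crawl planning is to count them per top-level section before deciding on `limit` and `excludePaths`. The sketch below runs on a hard-coded list standing in for a map result; the function name `summarize_sections` and the sample URLs are hypothetical, and the real response format is not shown here.

```python
from collections import Counter
from urllib.parse import urlparse


def summarize_sections(urls):
    """Count URLs per first path segment, e.g. '/docs' or '/blog'."""
    counts = Counter()
    for u in urls:
        path = urlparse(u).path.strip("/")
        # The site root has an empty path and is reported as "/".
        section = "/" + path.split("/")[0] if path else "/"
        counts[section] += 1
    return counts


# Hypothetical list of discovered URLs, hard-coded for illustration.
discovered = [
    "https://example.com/",
    "https://example.com/docs/intro",
    "https://example.com/docs/api/scrape",
    "https://example.com/blog/launch",
]
sections = summarize_sections(discovered)
for section, n in sections.most_common():
    print(section, n)
```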

4. /v2/extract - Structured Data Extraction

Uses AI to extract specific data fields from pages.

Use Cases:

  • Extract product prices and names
  • Parse contact information
  • Build structured datasets
  • Custom data schemas

Key Options:

  • schema: Zod or JSON schema defining desired structure
  • systemPrompt: guide AI extraction behavior

Authentication

Firecrawl requires an API key for all requests.

Get API Key

  1. Sign up at https://www.firecrawl.dev
  2. Go to dashboard → API Keys
  3. Copy your API key (starts with fc-)

Store Securely

NEVER hardcode API keys in code!

# .env file
FIRECRAWL_API_KEY=fc-your-api-key-here
# .env.local (for local development)
FIRECRAWL_API_KEY=fc-your-api-key-here
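A minimal sketch of reading the key at startup and failing fast when it is missing or malformed. It assumes the variable has already been exported into the process environment (Python does not read `.env` files on its own); the function name `load_api_key` is hypothetical.

```python
import os


def load_api_key(env=None):
    """Return the Firecrawl key from the environment or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get("FIRECRAWL_API_KEY", "").strip()
    if not key:
        raise RuntimeError("FIRECRAWL_API_KEY is not set")
    if not key.startswith("fc-"):
        raise RuntimeError("FIRECRAWL_API_KEY does not start with 'fc-'")
    return key


# A dict stands in for os.environ so the example runs without a real key.
print(load_api_key({"FIRECRAWL_API_KEY": "fc-your-api-key-here"}))
```

Validating once at startup turns a confusing "Invalid API Key" response later on into an immediate, local error.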

SDK Quick Start

Python

pip install firecrawl-py  # v4.5.0+

from firecrawl import FirecrawlApp
import os

app = FirecrawlApp(api_key=os.environ.get("FIRECRAWL_API_KEY"))
result = app.scrape_url("https://example.com", params={"formats": ["markdown"], "onlyMainContent": True})
print(result.get("markdown"))

TypeScript/Node.js

bun add @mendable/firecrawl-js  # v4.4.1+

import FirecrawlApp from '@mendable/firecrawl-js';

const app = new FirecrawlApp({ apiKey: process.env.FIRECRAWL_API_KEY });
const result = await app.scrapeUrl('https://example.com', { formats: ['markdown'], onlyMainContent: true });
console.log(result.markdown);

See: templates/ for crawl, extract, and advanced examples


Common Use Cases

| Use Case | Endpoint | Key Options |
| --- | --- | --- |
| Documentation scraping | crawl_url() | limit: 500, allowedDomains |
| Product data extraction | extract() | Zod schema + systemPrompt |
| News article scraping | scrape_url() | onlyMainContent: true, removeBase64Images |
| URL discovery | map() | Find all pages before crawling |

See: references/common-patterns.md for complete examples.


Error Handling

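# Python
try:
    result = app.scrape_url("https://example.com")
except Exception as e:  # narrow this to the SDK's own exception type where one is exported
    print(f"Firecrawl error: {e}")

// TypeScript
try {
  const result = await app.scrapeUrl('https://example.com');
} catch (error) {
  // `error` is typed `unknown` in strict mode, so narrow it before reading `.message`
  console.error('Error:', error instanceof Error ? error.message : error);
}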
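Transient failures such as timeouts are often worth retrying rather than only logging. The sketch below is a generic retry wrapper with exponential backoff; it is not part of the Firecrawl SDK, and `with_retries` and `flaky_scrape` are hypothetical names, with the latter standing in for a real scrape call.

```python
import time


def with_retries(func, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); after each failure wait base_delay * 2**n, then retry."""
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))


# Stand-in for a scrape call that fails twice, then succeeds.
calls = {"n": 0}


def flaky_scrape():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("simulated timeout")
    return {"markdown": "# Example"}


delays = []  # records the waits instead of actually sleeping
result = with_retries(flaky_scrape, sleep=delays.append)
print(result, delays)
```

Injecting `sleep` keeps the demo instant; in real use the default `time.sleep` applies the delays. Retrying only makes sense for transient errors, so an invalid key or exhausted credits should be surfaced immediately instead.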

Rate Limits & Best Practices

| Best Practice | Why |
| --- | --- |
| Use onlyMainContent: true | Reduces credits, cleaner output |
| Set reasonable limit | Avoid excessive costs |
| Use map endpoint first | Plan crawling strategy |
| Cache results | Avoid re-scraping |
| Batch extract calls | More efficient for multiple URLs |

Credits: Free tier = 500/month, paid tiers higher.
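Caching results can be sketched as a small in-memory cache with a time-to-live, so that repeated requests for the same URL do not spend credits again. The class name `ScrapeCache` and the `fake_fetch` stand-in are hypothetical; a real setup would pass in a function that performs the scrape and would likely use persistent storage.

```python
import time


class ScrapeCache:
    """In-memory cache keyed by URL; entries expire after ttl_seconds."""

    def __init__(self, fetch, ttl_seconds=3600, clock=time.time):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # url -> (stored_at, value)

    def get(self, url):
        now = self._clock()
        hit = self._entries.get(url)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # still fresh: no new request
        value = self._fetch(url)
        self._entries[url] = (now, value)
        return value


fetched = []  # records every URL that triggered a real fetch


def fake_fetch(url):
    fetched.append(url)
    return f"markdown for {url}"


cache = ScrapeCache(fake_fetch, ttl_seconds=60)
first = cache.get("https://example.com")
second = cache.get("https://example.com")
print(len(fetched))
```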


Cloudflare Workers Integration

⚠️ The SDK cannot run in Workers because it depends on Node.js APIs. Call the REST API directly:

const response = await fetch('https://api.firecrawl.dev/v2/scrape', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${env.FIRECRAWL_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({ url, formats: ['markdown'], onlyMainContent: true })
});

See: references/common-patterns.md for complete Workers example with caching.


When to Use This Skill

| ✅ Use Firecrawl | ❌ Don't Use |
| --- | --- |
| Modern JS-rendered sites | Simple static HTML (use cheerio) |
| Clean markdown for LLMs | Existing Puppeteer setup works |
| RAG/chatbot content | Direct API available |
| Structured data extraction | Budget constraints |
| Bot protection bypass | |

Common Issues

| Issue | Cause | Fix |
| --- | --- | --- |
| "Invalid API Key" | Key not set | Check $FIRECRAWL_API_KEY starts with fc- |
| "Rate limit exceeded" | Monthly credits used | Check dashboard, upgrade plan |
| "Timeout error" | Page slow to load | Add waitFor: 10000 |
| "Content is empty" | JS loads late | Add actions: [{type: "wait", milliseconds: 3000}] |

Advanced Features

| Feature | Usage |
| --- | --- |
| Browser actions | actions: [{type: "click", selector: "button"}] |
| Custom headers | headers: {"User-Agent": "Custom Bot"} |
| Webhooks | webhook: "https://your-domain.com/webhook" |
| Screenshots | formats: ["screenshot"] |

See: references/endpoints.md for complete API reference.


When to Load References

| Reference | Load When... |
| --- | --- |
| endpoints.md | Need complete API endpoint documentation |
| common-patterns.md | Cloudflare Workers, caching, batch processing, error handling |

Package Versions

| Package | Version |
| --- | --- |
| firecrawl-py | 4.5.0+ |
| @mendable/firecrawl-js | 4.4.1+ |
| API | v2 |

Note: The Node.js SDK requires Node.js >=22.0.0 and cannot run in Workers.


Official Docs: https://docs.firecrawl.dev | GitHub: https://github.com/mendableai/firecrawl

Token Savings: ~60% | Production Ready: ✅