scrapling

🕷️ An adaptive Web Scraping framework that handles everything from a single request to a full-scale crawl!

65k stars · 166k downloads/wk · updated 3d ago
Score: 95 (excellent)
▣ Overview

What it does

A web scraping framework that scales from single-page extraction to multi-domain crawls. Scrapling provides multiple fetcher strategies (standard HTTP, stealth mode for anti-bot evasion, and dynamic rendering for JavaScript-heavy sites) and automatically bypasses protections like Cloudflare Turnstile. The parser supports CSS and XPath selectors, plus an adaptive mode that relocates previously extracted elements when a website's layout changes. For large-scale work, the spider framework orchestrates concurrent, multi-session crawls with pause/resume and automatic proxy rotation. It also includes streaming support and real-time metrics.

Who it's for

Data engineers and researchers building scrapers that scale. Teams extracting structured data from sites with aggressive anti-bot protections. Engineers whose extraction scripts break when target websites redesign. Useful in data validation and monitoring workflows where scraped data feeds downstream systems.

Common use cases

  • Fetch content from sites protected by Cloudflare Turnstile or similar anti-scraping barriers.
  • Extract e-commerce product listings, pricing, or availability with adaptive parsing, so extraction keeps working after website redesigns.
  • Build crawlers that scale from single-page requests to thousands of concurrent sessions with automatic pause/resume.
  • Rotate through proxy networks for high-volume data collection across multiple domains.
  • Monitor content changes in real time with streaming result delivery.

Setup pitfalls

  • Requires filesystem read/write permissions to store crawl state, logs, and response caches.
  • JavaScript-rendered content requires headless browser setup; browser timeouts and network idle detection require tuning.
  • Proxy rotation depends on external infrastructure; no built-in proxy service included.
  • Anti-bot systems evolve; evasion techniques may become obsolete as websites update defenses.
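
The first pitfall can be checked before a crawl starts. The following is a minimal sketch using only the Python standard library, not part of Scrapling itself; the function name `can_read_write` and the example directory are hypothetical, and where Scrapling actually stores crawl state, logs, and caches depends on your configuration.

```python
import os
import tempfile


def can_read_write(directory):
    """Return True if a file can be created, written, and read back in `directory`."""
    try:
        # Create the directory if it is missing (hypothetical storage location).
        os.makedirs(directory, exist_ok=True)
        # The probe file is deleted automatically when the block exits.
        with tempfile.NamedTemporaryFile(mode="w+", dir=directory) as probe:
            probe.write("ok")
            probe.seek(0)
            return probe.read() == "ok"
    except OSError:
        # Covers permission errors, read-only filesystems, and invalid paths.
        return False
```

For example, calling `can_read_write("./crawl_state")` (an assumed path) before launching a long crawl surfaces a permissions problem immediately rather than partway through the run.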
▣ Score Breakdown
MCPScore = Σ(raw × weight)

Dimension   Weight   Raw   Weighted
Security    35%      100   35.0
Freshness   25%      100   25.0
Adoption    20%      100   20.0
Quality     10%      100   10.0
Trust       10%      50    5.0
Total                      95.0
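
The total can be reproduced from the stated formula. This is a small sketch that uses only the weights and raw scores listed in the breakdown; the function name `mcp_score` is illustrative, and how the site derives each raw score is not covered here.

```python
# Weights and raw scores copied from the score breakdown.
WEIGHTS = {
    "Security": 0.35,
    "Freshness": 0.25,
    "Adoption": 0.20,
    "Quality": 0.10,
    "Trust": 0.10,
}
RAW = {
    "Security": 100,
    "Freshness": 100,
    "Adoption": 100,
    "Quality": 100,
    "Trust": 50,
}


def mcp_score(raw, weights):
    """Weighted sum Σ(raw × weight), rounded to one decimal place."""
    return round(sum(raw[dim] * weights[dim] for dim in weights), 1)


total = mcp_score(RAW, WEIGHTS)
print(total)  # 95.0
```

Trust is the only dimension below 100, so it accounts for the entire 5-point gap from a perfect score.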
⚿ Capabilities & Risk Explainer
fs read · fs write · network · secrets
◆ Risk level: medium
fs read + fs write + network + secrets — requires access to credentials or environment secrets.
⚙ Install config
Claude Desktop · Cursor · Windsurf · VS Code (Copilot) · Claude Code
add to your MCP client config:
{
  "mcpServers": {
    "scrapling-1": {
      "command": "uvx",
      "args": [
        "scrapling"
      ]
    }
  }
}
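
If the client config already lists other servers, the entry has to be merged in rather than pasted over the file. Below is a minimal sketch of one way to do that with the Python standard library; the helper `add_server` and the config path you pass to it are hypothetical, and each client keeps its config file in its own location.

```python
import json
from pathlib import Path

# Server entry copied from the install config.
SCRAPLING_ENTRY = {"command": "uvx", "args": ["scrapling"]}


def add_server(config_path, name="scrapling-1", entry=SCRAPLING_ENTRY):
    """Merge one server entry into an MCP client config, keeping existing servers."""
    path = Path(config_path)
    text = path.read_text() if path.exists() else ""
    # Treat a missing or empty file as an empty config.
    config = json.loads(text) if text.strip() else {}
    # Copy the entry so later edits to the config never mutate the default.
    config.setdefault("mcpServers", {})[name] = dict(entry)
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Calling `add_server("path/to/client_config.json")` adds or overwrites only the `scrapling-1` key under `mcpServers`; other servers and top-level settings are left as they were.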
📈 Score history
5/10/2026 to 6/21/2026 · 40 snapshots
⚙ Maintenance health
59 / 100 · is this project alive?
contributors (1y): 21
top contributor share: 95%
releases (1y): 25
last release: 13d ago
ci: ✓ passing
⛁ Raw data
weekly downloads: 166k
github stars: 65k
forks: 6k
open issues: 9
license: ✓ present
readme length: 31233 chars
last publish: 0d ago
last commit: 3d ago
last updated: 7d ago
install verified: ✓ pass · 42d ago