Overview
This is a Cloudflare protection bypass service that provides web scraping and bot protection circumvention capabilities. The application acts as an API service that uses headless browser automation (via Puppeteer) to solve Cloudflare challenges, including Turnstile CAPTCHAs and WAF (Web Application Firewall) sessions. It exposes multiple endpoints for different bypass strategies and can operate with or without proxy support.
User Preferences
Preferred communication style: Simple, everyday language.
System Architecture
Backend Architecture
Framework: Express.js REST API server
- Problem: Need to provide multiple browser automation endpoints as HTTP services
- Solution: Express server with JSON body parsing, CORS support, and configurable timeout handling
- Rationale: Express provides a lightweight, flexible framework for creating API endpoints with middleware support for authentication and request validation
Request Handling Pattern: Unified handler with mode-based routing
- Problem: Multiple similar endpoints with shared authentication and rate limiting logic
- Solution: Single handleSolverRequest function that routes based on the mode parameter
- Alternatives: Separate route handlers per endpoint
- Pros: Centralized validation, authentication, and rate limiting; reduced code duplication
- Cons: Single handler must accommodate different parameter requirements per mode
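A minimal sketch of the mode-based routing idea, assuming each mode maps to one async handler. The handler bodies, the url field, and the 400 response for an unknown mode are illustrative assumptions, not the project's actual code; only the handleSolverRequest name and the mode values come from this document.

```javascript
// Hypothetical per-mode handlers. Real ones would drive the shared browser;
// these stubs only echo which strategy was selected.
const stub = (strategy) => async (params) => ({ strategy, url: params.url });

const modeHandlers = new Map([
  ["source", stub("source")],
  ["turnstile-min", stub("turnstile-min")],
  ["turnstile-max", stub("turnstile-max")],
  ["waf-session", stub("waf-session")],
  ["proxy-request", stub("proxy-request")],
]);

// Returns the handler registered for a mode, or null when the mode is unknown.
function resolveHandler(mode) {
  return modeHandlers.get(mode) ?? null;
}

// Shared entry point: look up the handler once, then dispatch on body.mode.
// Authentication and rate limiting would run here, before the dispatch.
async function handleSolverRequest(body) {
  const handler = resolveHandler(body && body.mode);
  if (handler === null) {
    return { status: 400, body: { message: "Unknown mode" } };
  }
  return { status: 200, body: await handler(body) };
}
```

Keeping the lookup in one table means a new mode only needs a new entry, while the shared checks stay in a single place.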
Global State Management: Module-level globals for browser instance and rate limiting
- Problem: Need to share browser instance and track concurrent requests across all handlers
- Solution: Global variables (global.browser, global.browserLength, global.browserLimit)
- Rationale: Browser instance is expensive to create; rate limiting requires shared state
- Cons: Not horizontally scalable; state is lost on restart
Browser Automation
Browser Management: Puppeteer with automatic reconnection
- Problem: Headless browsers can crash or disconnect unpredictably
- Solution: Auto-reconnection logic in createBrowser.js with exponential backoff
- Technology: puppeteer-real-browser package for enhanced stealth capabilities
- Features: Turnstile-ready configuration, XVFB support for headless environments, real browser fingerprinting
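The reconnection behaviour can be sketched as a retry loop whose wait time doubles after each failed attempt. This is an illustration under assumptions: the launch callback, the 1 second base delay, the 30 second cap, and the attempt limit are hypothetical and are not taken from createBrowser.js.

```javascript
// Delay before retrying after failed attempt n (0-based): base * 2^n, capped at maxMs.
function backoffDelay(attempt, baseMs = 1000, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls launch() until it succeeds or maxAttempts is reached.
// launch is a hypothetical async factory that resolves to a browser object.
async function connectWithBackoff(launch, maxAttempts = 5, baseMs = 1000) {
  let lastError;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await launch();
    } catch (err) {
      lastError = err;
      // Wait only if another attempt will follow.
      if (attempt < maxAttempts - 1) {
        await sleep(backoffDelay(attempt, baseMs));
      }
    }
  }
  throw lastError;
}
```

One plausible wiring is to call connectWithBackoff again from a handler on the browser's disconnected event, so a crashed browser is replaced without restarting the server.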
Context Isolation: Browser contexts per request
- Problem: Need to isolate each request's cookies, sessions, and proxy settings
- Solution: Create new browser context for each request with independent proxy configuration
- Pros: Complete isolation between concurrent requests; clean state per request
- Cons: Higher resource usage than tab-based isolation
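A sketch of per-request isolation, assuming the browser object exposes Puppeteer's createBrowserContext() and accepts a proxyServer option (older Puppeteer releases name this method differently). The withIsolatedContext helper and its task callback are hypothetical names used only for illustration.

```javascript
// Runs task(page) inside a fresh browser context and always closes the context.
// proxyServer is an optional string such as "http://127.0.0.1:8080".
async function withIsolatedContext(browser, proxyServer, task) {
  const options = proxyServer ? { proxyServer } : {};
  const context = await browser.createBrowserContext(options);
  try {
    const page = await context.newPage();
    return await task(page);
  } finally {
    // Closing the context discards its cookies, storage, and pages.
    await context.close();
  }
}
```

Because the context is closed in a finally block, a failed or aborted task does not leave cookies or pages behind for the next request.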
Timeout Management: Per-request timeout with cleanup
- Problem: Browser automation can hang indefinitely on problematic sites
- Solution: Configurable timeout (global.timeOut) with context cleanup in all endpoints
- Rationale: Prevents resource leaks and ensures predictable response times
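One way to get a per-request timeout with guaranteed cleanup is to race the work against a timer and release resources in a finally block. The runWithTimeout helper below is a hypothetical sketch, not the project's implementation; in practice ms would come from global.timeOut and cleanup would close the browser context.

```javascript
// Rejects with a timeout error if work() has not settled within ms,
// and runs cleanup() exactly once whether the work succeeded, failed, or timed out.
async function runWithTimeout(work, ms, cleanup) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  try {
    return await Promise.race([work(), timeout]);
  } finally {
    clearTimeout(timer); // stop the timer when the work finished first
    await cleanup();
  }
}
```

Note that losing the race does not cancel the underlying work; closing the context in cleanup is what actually stops a hung page.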
API Endpoints & Modes
Mode-Based Architecture: Five distinct bypass strategies
- source: Fetch page HTML after Cloudflare challenge resolution
- turnstile-min: Solve Turnstile CAPTCHA using injected fake page with real site key
- turnstile-max: Solve Turnstile on actual target page
- waf-session: Create authenticated session with Cloudflare cookies and headers
- proxy-request: Make authenticated requests using cookies/headers from waf-session (session reuse)
Session Reuse Workflow (proxy-request):
- First call waf-session to get cookies + headers
- Pass those to proxy-request via cookies and sessionHeaders parameters
- Requests are made through the same browser (Chrome fingerprint), maintaining CF clearance
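The two steps above can be sketched as a pair of request bodies. The mode values and the cookies and sessionHeaders parameter names come from this document; the url and method fields and the shape of the waf-session response ({ cookies, headers }) are assumptions made for illustration.

```javascript
// Request body for the first call: ask waf-session for cookies and headers.
function buildWafSessionRequest(url) {
  return { mode: "waf-session", url };
}

// Request body for follow-up calls: reuse what waf-session returned.
// session is assumed to look like { cookies: [...], headers: {...} }.
function buildProxyRequest(url, session, method = "GET") {
  return {
    mode: "proxy-request",
    url,
    method,
    cookies: session.cookies,
    sessionHeaders: session.headers,
  };
}

// Example with a made-up session object standing in for a real response.
const exampleSession = {
  cookies: [{ name: "cf_clearance", value: "example-value" }],
  headers: { "accept-language": "en-US,en;q=0.9" },
};
const followUp = buildProxyRequest("https://example.com/api", exampleSession);
```

Each body would be sent to the service as JSON; the same session can back several proxy-request calls until the clearance cookie expires.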
Request Validation: JSON Schema validation with AJV
- Problem: Need to validate complex nested request structures (proxy configs, cookies, headers)
- Solution: AJV with format validation for URIs and enums for modes/HTTP methods
- Rationale: Schema-based validation provides clear error messages and type safety
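To make the validation concrete, here is a hypothetical JSON Schema for a request body. The field names other than mode, cookies, sessionHeaders, and authToken, the proxy object shape, and the list of HTTP methods are assumptions; the real schema in the project may differ. A schema like this would typically be compiled with AJV after registering ajv-formats, which supplies the uri format.

```javascript
// Hypothetical schema for a solver request; not copied from the project.
const requestSchema = {
  type: "object",
  properties: {
    mode: {
      type: "string",
      enum: ["source", "turnstile-min", "turnstile-max", "waf-session", "proxy-request"],
    },
    url: { type: "string", format: "uri" },
    method: { type: "string", enum: ["GET", "POST", "PUT", "PATCH", "DELETE"] },
    authToken: { type: "string" },
    proxy: {
      type: "object",
      properties: {
        host: { type: "string" },
        port: { type: "integer", minimum: 1, maximum: 65535 },
        username: { type: "string" },
        password: { type: "string" },
      },
      required: ["host", "port"],
      additionalProperties: false,
    },
    cookies: {
      type: "array",
      items: {
        type: "object",
        properties: { name: { type: "string" }, value: { type: "string" } },
        required: ["name", "value"],
      },
    },
    sessionHeaders: { type: "object", additionalProperties: { type: "string" } },
  },
  required: ["mode", "url"],
  additionalProperties: false,
};
```

Using enum for mode and method rejects unsupported values before any browser work starts.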
Security & Rate Limiting
Authentication: Optional token-based authentication
- Implementation: authToken environment variable compared against request parameter
- Design: Simple bearer-style token authentication
- Rationale: Lightweight protection for self-hosted deployments
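A minimal sketch of the optional token check, assuming the token arrives as an authToken field in the request body and that a missing environment value disables authentication. The authorizeRequest name and the 401 status are illustrative assumptions.

```javascript
// Returns true when the request may proceed.
// When expectedToken is unset or empty, authentication is treated as disabled.
function isAuthorized(requestToken, expectedToken) {
  if (!expectedToken) return true;
  return requestToken === expectedToken;
}

// Wraps the check in a response-like result for use inside a shared handler.
function authorizeRequest(body, env = process.env) {
  if (isAuthorized(body && body.authToken, env.authToken)) {
    return { ok: true };
  }
  return { ok: false, status: 401, message: "Unauthorized" };
}
```

Since the check is a plain string comparison, the token should be long and random, and the service is best kept behind HTTPS or a private network.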
Rate Limiting: Browser instance concurrency limits
- Problem: Too many concurrent browser contexts can exhaust system resources
- Solution: browserLimit (default 20) enforced via browserLength counter
- Mechanism: Request rejected with 429 if limit exceeded
- Cons: In-memory counter doesn't work across multiple instances
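The counter mechanism can be sketched as a small in-memory gate. This mirrors the browserLength and browserLimit idea but wraps it in a hypothetical createLimiter helper instead of globals; the method names are assumptions.

```javascript
// In-memory concurrency gate: at most `limit` requests hold a slot at once.
function createLimiter(limit = 20) {
  let active = 0;
  return {
    // Reserves a slot; returns false when the caller should answer with 429.
    tryAcquire() {
      if (active >= limit) return false;
      active += 1;
      return true;
    },
    // Frees a slot; call from a finally block so failed requests release it too.
    release() {
      if (active > 0) active -= 1;
    },
    get active() {
      return active;
    },
  };
}

// Usage: a limit of 2 admits two requests and rejects the third.
const limiter = createLimiter(2);
const admitted = [limiter.tryAcquire(), limiter.tryAcquire(), limiter.tryAcquire()];
limiter.release();
```

The counter lives in one process, so running several instances behind a load balancer would need a shared store to enforce a global limit.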
Proxy Authentication: Built-in proxy credential handling
- Problem: Many proxies require username/password authentication
- Solution: Puppeteer's page.authenticate() for proxy credentials
- Support: Configurable per-request proxy with optional authentication
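A sketch of how a per-request proxy setting might be split into a server address and optional credentials. The { host, port, username, password } shape, the http scheme, and both helper names are assumptions; page.authenticate() is the Puppeteer call named above and takes an object with username and password.

```javascript
// Splits a proxy config into the server address used when creating the browser
// context and the credentials (if any) to hand to page.authenticate().
function splitProxyConfig(proxy) {
  if (!proxy) return { proxyServer: undefined, credentials: null };
  const proxyServer = `http://${proxy.host}:${proxy.port}`;
  const credentials =
    proxy.username && proxy.password
      ? { username: proxy.username, password: proxy.password }
      : null;
  return { proxyServer, credentials };
}

// Applies the credentials to a page when present; a no-op for open proxies.
async function applyProxyAuth(page, credentials) {
  if (credentials) {
    await page.authenticate(credentials);
  }
}
```

Authentication has to be applied before the first navigation on the page, otherwise the proxy answers the initial request with an authentication error.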
Environment Configuration
Configuration Strategy: Environment variables with sensible defaults
- PORT: Server port (default: 3939)
- authToken: Optional API authentication
- browserLimit: Max concurrent browser contexts (default: 20)
- timeOut: Request timeout in milliseconds (default: 60000)
- SKIP_LAUNCH: Skip browser initialization for testing
- NODE_ENV: Environment mode (development vs production)
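A sketch of reading these variables with their documented defaults. The loadConfig helper is hypothetical; treating SKIP_LAUNCH as the string "true" and defaulting NODE_ENV to development are assumptions, since the document does not state how those two are interpreted.

```javascript
// Reads settings from an environment object, falling back to the documented defaults.
function loadConfig(env = process.env) {
  // Parses a base-10 integer, using the fallback for missing or invalid values.
  const toInt = (value, fallback) => {
    const parsed = Number.parseInt(value, 10);
    return Number.isNaN(parsed) ? fallback : parsed;
  };
  return {
    port: toInt(env.PORT, 3939),
    authToken: env.authToken || null, // null means authentication is off
    browserLimit: toInt(env.browserLimit, 20),
    timeOut: toInt(env.timeOut, 60000), // milliseconds
    skipLaunch: env.SKIP_LAUNCH === "true", // assumed encoding
    nodeEnv: env.NODE_ENV || "development", // assumed default
  };
}
```

Passing the environment in as an argument keeps the function easy to test without touching process.env.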
External Dependencies
Core Dependencies
puppeteer-real-browser (v1.4.0)
- Purpose: Enhanced Puppeteer wrapper with anti-detection features
- Key Features: Turnstile challenge solving, real browser fingerprinting, XVFB support
- Integration: Global browser instance managed in module/createBrowser.js
Express (v4.21.0)
- Purpose: HTTP server framework
- Integration: Main application server with middleware stack
body-parser (v1.20.3)
- Purpose: Parse JSON and URL-encoded request bodies
- Integration: Middleware for POST request handling
cors (v2.8.5)
- Purpose: Enable Cross-Origin Resource Sharing
- Integration: Middleware to allow API access from web clients
Validation & Schema
ajv (v8.17.1) + ajv-formats (v3.0.1)
- Purpose: JSON Schema validation with format extensions
- Integration: Request parameter validation in module/reqValidate.js
- Schemas: Validates URL formats, proxy configs, HTTP methods, cookie structures
Testing
jest (v29.7.0)
- Purpose: Test framework
- Configuration: Tests located in tests/ directory with verbose output
supertest (v7.0.0)
- Purpose: HTTP assertion library for API testing
- Integration: Testing Express endpoints
External Services
Cloudflare Turnstile API
- Integration: Loaded via CDN script in fake page template
- URL: https://challenges.cloudflare.com/turnstile/v0/api.js
- Purpose: CAPTCHA challenge rendering and token generation
httpbin.org
- Purpose: Header detection service used in wafSession.js to extract Accept-Language header
- Endpoint: https://httpbin.org/get
- Usage: Validate browser fingerprinting fidelity