# Simple Production Anti-Bot Strategy

This document replaces the overly complex idea of forcing perfect Chrome TLS impersonation inside the core engine.

## Principle

**Do not make the core plugin engine a fragile browser clone.**

Keep the BEX engine:

- portable
- buildable everywhere
- easy to embed in C++ apps
- deterministic where possible
- independent from experimental TLS impersonation crates

Then add simple challenge handling around it.

## Recommended Flow

```text
Plugin request
   ↓
BEX normal HTTP backend
   ↓
Success? ────────────────→ return data
   ↓ no
Challenge detected?
   ↓ yes
Return CHALLENGE_REQUIRED with URL/domain/reason
   ↓
C++ app decides fallback:
   - use cached cookies
   - ask user to import cookies
   - open system browser/WebView only when needed
   - use app-specific HTTP fetcher
   - use optional proxy service
```

## Why This Is Better

Perfect Chrome impersonation is not simple:

- TLS JA3/JA4 changes with Chrome versions.
- HTTP/2 fingerprints change.
- Libraries using BoringSSL are harder to cross-compile.
- Mobile/iOS/Android builds need separate proof.
- One wrong cipher order or H2 setting can still get blocked.
- CAPTCHA/Turnstile still cannot be solved silently.

For an engine that must be used inside **many C++ apps**, the stable approach is:

- use portable Rust HTTP by default
- detect challenge pages reliably
- delegate rare hard anti-bot cases to the host app

## Challenge Detection

A response that did not succeed should be treated as an anti-bot challenge if any of these are true:

### Status codes

- `403`
- `429`
- `503`

### Headers

- `server: cloudflare`
- `cf-ray`
- `cf-chl-*`
- `x-datadome`
- `x-perimeterx`
- `akamai-*`

### Body markers

- `Just a moment...`
- `Checking your browser`
- `cf-browser-verification`
- `cf-chl-`
- `turnstile`
- `captcha`
- `datadome`
- `px-captcha`

## Engine-Level Behavior

The BEX engine should not try to solve every challenge itself. Instead:

1. Detect a likely challenge.
2. Return a structured error:

   ```json
   {
     "code": "CHALLENGE_REQUIRED",
     "url": "https://example.com/path",
     "final_url": "https://example.com/cdn-cgi/challenge-platform/...",
     "status": 403,
     "provider": "cloudflare",
     "domain": "example.com",
     "hint": "Host app should provide cookies or browser-backed fetch."
   }
   ```

3. The host app can then retry with cookies or a browser-backed fetcher.

## Simple Fallback Options

### Option A: User-provided cookies

The app allows the user to paste or export cookies for a domain. Plugins can then send:

```http
Cookie: cf_clearance=...; session=...
```

This is simple, cross-platform, and avoids hidden browser automation.

### Option B: App-level browser session

The app opens a system browser/WebView **only when needed**.

After the challenge is solved, the app stores the cookies in the BEX secret/KV store. Future requests use those cookies and avoid the WebView.

### Option C: External fetcher callback

Expose an optional C ABI hook:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* BexFetchResult is assumed to be declared elsewhere in the BEX C ABI headers. */

/* Returns true if the host app handled the request and filled in `out`. */
typedef bool (*BexExternalFetch)(
    void* user_data,
    const char* method,
    const char* url,
    const uint8_t* body,
    size_t body_len,
    BexFetchResult* out
);
```

Then the host app can provide:

- libcurl-impersonate
- platform-native HTTP stack
- browser-backed fetch
- company proxy
- Android/iOS native networking

The core engine stays simple.

### Option D: Optional proxy service

For apps that control their backend, route difficult sites through a server-side fetcher with proper browser fingerprinting.

The engine stays portable and does not embed fragile anti-bot logic.

## Plugin Guidance

Plugins should:

- set `Referer` correctly
- preserve cookies when provided
- avoid excessive retries
- return `PluginError::Forbidden` or `PluginError::RateLimited` for challenge pages
- prefer local JS ciphers over third-party helper APIs when possible

Plugins should not:

- hardcode fake TLS assumptions
- rely on one external decoder service forever
- endlessly retry CF challenge pages

## Recommended Near-Term Fixes

1. Add challenge detection in `HttpHostService`.
2. Map challenges to a structured error payload for the C ABI.
3. Add cookie helper APIs:
   - set domain cookies
   - clear domain cookies
   - list stored challenge domains
4. Add an optional external fetch callback in the C ABI.
5. Keep advanced TLS impersonation as an optional backend only.

## Final Recommendation

For production:

- Default: `reqwest + rustls` portable backend.
- Add: challenge detection and external fallback hook.
- Optional later: verified impersonation backend behind a feature flag.

This gives the best balance of:

- reliability
- cross-platform support
- maintainability
- app integration flexibility
- real-world anti-bot handling
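
## Example: Challenge Detection Sketch

The status codes, headers, and body markers listed under Challenge Detection could be combined roughly as follows. This is a minimal sketch under stated assumptions, not the engine's actual code: `ResponseView` and `detect_challenge` are hypothetical names, the provider labels are illustrative, and the sketch is deliberately stricter than "any signal matches" in that it requires one of the listed status codes plus a header or body marker, to avoid flagging ordinary responses from sites that merely sit behind a CDN.

```rust
/// Hypothetical, minimal view of an HTTP response (not a BEX type).
pub struct ResponseView<'a> {
    pub status: u16,
    /// Header (name, value) pairs as received.
    pub headers: &'a [(&'a str, &'a str)],
    pub body: &'a str,
}

/// Body markers from the list above, paired with an illustrative provider label.
/// More specific markers come first so "px-captcha" wins over plain "captcha".
const BODY_MARKERS: &[(&str, &str)] = &[
    ("just a moment...", "cloudflare"),
    ("checking your browser", "cloudflare"),
    ("cf-browser-verification", "cloudflare"),
    ("cf-chl-", "cloudflare"),
    ("turnstile", "cloudflare"),
    ("datadome", "datadome"),
    ("px-captcha", "perimeterx"),
    ("captcha", "unknown"),
];

/// Returns a provider label if the response looks like an anti-bot challenge.
pub fn detect_challenge(resp: &ResponseView) -> Option<&'static str> {
    // Assumption: only the listed status codes are candidates at all.
    if !matches!(resp.status, 403 | 429 | 503) {
        return None;
    }

    // Header names are case-insensitive, so compare in lowercase.
    for (name, value) in resp.headers {
        let name = name.to_ascii_lowercase();
        let value = value.to_ascii_lowercase();
        if name == "cf-ray"
            || name.starts_with("cf-chl-")
            || (name == "server" && value.contains("cloudflare"))
        {
            return Some("cloudflare");
        }
        if name == "x-datadome" {
            return Some("datadome");
        }
        if name == "x-perimeterx" {
            return Some("perimeterx");
        }
        if name.starts_with("akamai-") {
            return Some("akamai");
        }
    }

    // Fall back to a case-insensitive scan of the body.
    let body = resp.body.to_ascii_lowercase();
    BODY_MARKERS
        .iter()
        .find(|(marker, _)| body.contains(*marker))
        .map(|(_, provider)| *provider)
}
```

A caller would run this only after a request fails, then copy the returned label into the `provider` field of the `CHALLENGE_REQUIRED` payload. A bare `403` with no marker returns `None` here, which leaves the plugin free to report it as an ordinary `PluginError::Forbidden`.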