# overview

## purpose

This document is the top-level guide for the ScrapeRL documentation set. It explains what the platform does, how the main runtime surfaces connect, and where to find detailed references.

## platform-summary

| dimension | summary |
| --- | --- |
| core-goal | AI-first scraping workflows with RL-style episodes and dynamic agent planning |
| backend | FastAPI control plane with episode, scrape, agent, plugin, memory, and provider APIs |
| frontend | React dashboard for task submission, stream monitoring, and result inspection |
| runtime-pattern | session-based execution with real-time `step`/`tool_call` stream events |
| output-targets | `json`, `csv`, `markdown`, and `text` |
| integrations | OpenAI, Anthropic, Google, Groq, NVIDIA, plugin tools, memory layers |

## primary-runtime-flows

```mermaid
flowchart TD
    A[user-request] --> B[api-scrape-stream]
    B --> C[agent-decision]
    C --> D[tool-plan-and-execution]
    D --> E[llm-extraction-and-formatting]
    E --> F[complete-event]
    B --> G[session-status-and-artifacts]
```

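The flow above can be illustrated with a small client-side reducer over stream events. This is a minimal sketch, not the platform's client: it assumes each event has already been decoded into a dict with a `type` field (`step`, `tool_call`, or `complete`, the names used in this document), and the `tool` and `result` payload keys are hypothetical. See `api-reference.md` for the actual stream/event contract.

```python
from dataclasses import dataclass, field
from typing import Any, Iterable


@dataclass
class StreamSummary:
    """Running totals collected while reading one scrape stream."""
    steps: int = 0
    tool_calls: list[str] = field(default_factory=list)
    result: Any = None
    completed: bool = False


def summarize_stream(events: Iterable[dict[str, Any]]) -> StreamSummary:
    """Fold decoded stream events into a summary, stopping at `complete`."""
    summary = StreamSummary()
    for event in events:
        kind = event.get("type")
        if kind == "step":
            summary.steps += 1
        elif kind == "tool_call":
            # "tool" is an assumed payload key, not a documented field.
            summary.tool_calls.append(event.get("tool", "unknown"))
        elif kind == "complete":
            # "result" is an assumed payload key, not a documented field.
            summary.result = event.get("result")
            summary.completed = True
            break
        # Unrecognized event types are skipped rather than treated as errors.
    return summary
```

Stopping at the first `complete` event mirrors the diagram, where `complete-event` is the terminal node of the extraction path.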
## documentation-navigation

| doc | focus-area |
| --- | --- |
| `readme.md` | documentation index |
| `api-reference.md` | complete endpoint catalog and stream/event contract |
| `architecture.md` | system topology, subsystem planes, reliability model |
| `openenv.md` | environment/action/observation/reward contract |
| `features.md` | advanced runtime features and toggles |
| `memory.md` | memory layers, storage, and operations |
| `plugins.md` | plugin registry and runtime tool-selection model |
| `tool-calls.md` | tool call payload schema and lifecycle |
| `api.md` | multi-model routing and provider behavior |
| `settings.md` | runtime setting controls and policy knobs |
| `observability.md` | telemetry/tracing/cost visibility |
| `rewards.md` | reward design and scoring structure |
| `search-engine.md` | search provider and retrieval routing details |
| `mcp.md` | mcp integration architecture |
| `agents.md` | agent roles and coordination model |

## key-api-surfaces

| surface | endpoints |
| --- | --- |
| system-health | `/api/health`, `/api/ready`, `/api/ping` |
| episode-runtime | `/api/episode/reset`, `/api/episode/step`, `/api/episode/state/{episode_id}` |
| scrape-runtime | `/api/scrape/stream`, `/api/scrape/{session_id}/status`, `/api/scrape/{session_id}/result` |
| agent-tool-memory | `/api/agents/*`, `/api/tools/*`, `/api/plugins/*`, `/api/memory/*` |
| realtime-channel | `/ws/episode/{episode_id}` |

Use `api-reference.md` for full method/path listings.

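Several of the paths above carry a `{session_id}` or `{episode_id}` placeholder. As a hedged illustration of how a client might fill them in, the sketch below expands a path template against a base URL. The `build_url` helper and the base URL in the usage example are assumptions made for illustration; only the path templates come from the table.

```python
from urllib.parse import quote

# Path templates copied from the table above.
SCRAPE_STATUS = "/api/scrape/{session_id}/status"
SCRAPE_RESULT = "/api/scrape/{session_id}/result"
EPISODE_STATE = "/api/episode/state/{episode_id}"


def build_url(base_url: str, template: str, **params: str) -> str:
    """Fill a path template and join it to the base URL (hypothetical helper)."""
    # Percent-encode every value so an ID cannot change the path structure.
    safe = {name: quote(str(value), safe="") for name, value in params.items()}
    return base_url.rstrip("/") + template.format(**safe)


if __name__ == "__main__":
    # The host and port are placeholders, not documented defaults.
    print(build_url("http://localhost:8000", SCRAPE_STATUS, session_id="abc123"))
```

Encoding the ID with `safe=""` means a value containing `/` is sent as `%2F` instead of silently addressing a different route.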
## configuration-surfaces

| file | intent |
| --- | --- |
| `.env.example` | complete variable template for app + inference runtime |
| `.env` | local runtime values |
| `docker-compose.yml` | backend/frontend orchestration and env wiring |
| `inference.py` | OpenEnv-compliant inference entrypoint and stdout contract |

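Because `.env.example` is described as the complete variable template and `.env` as the local values, a quick check for variables missing from the local file can catch setup mistakes. The sketch below is illustrative only: it handles plain `KEY=value` lines and `#` comments, not the full dotenv syntax, and the variable names in the usage example are hypothetical and not taken from the project's template.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, separator, value = line.partition("=")
        if not separator:
            continue  # Not a KEY=value line; ignore it.
        # Drop surrounding whitespace and one layer of matching-style quotes.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(template_text: str, local_text: str) -> list[str]:
    """Return template variables that the local file does not define."""
    return sorted(set(parse_env(template_text)) - set(parse_env(local_text)))


if __name__ == "__main__":
    # Hypothetical variable names, used only to demonstrate the check.
    template = "# provider settings\nEXAMPLE_API_KEY=\nEXAMPLE_PORT=8000\n"
    local = "EXAMPLE_PORT=9000\n"
    print(missing_keys(template, local))
```

In practice the two strings would be read from `.env.example` and `.env`; the comparison is by variable name only, so an empty value in `.env` still counts as defined.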
## recommended-reading-order

1. `overview.md`
2. `api-reference.md`
3. `architecture.md`
4. `openenv.md`
5. `tool-calls.md`
6. `plugins.md`
7. domain docs (`memory.md`, `api.md`, `features.md`, `settings.md`)

## document-metadata

| key | value |
| --- | --- |
| document | `overview.md` |
| status | active |
| owner | platform-docs |