| # arf-api |
|
|
| Control plane API for the Agentic Reliability Framework (ARF), built with FastAPI. |
|
|
| ## Live Demo |
|
|
| The API is deployed and accessible at: |
| - **Base URL**: [https://a-r-f-agentic-reliability-framework-api.hf.space](https://a-r-f-agentic-reliability-framework-api.hf.space) |
| - **Interactive Documentation**: [https://a-r-f-agentic-reliability-framework-api.hf.space/docs](https://a-r-f-agentic-reliability-framework-api.hf.space/docs) |
|
|
| ## Quick Start (Local Development) |
|
|
| 1. **Install dependencies**: |
| ```bash |
| pip install -r requirements.txt |
| ``` |
|
|
| Note: `requirements.txt` installs `agentic-reliability-framework` directly from the project's Git repository. |
|
|
| 2. **Set environment variables** (optional, in `.env`): |
|
|
| ```text |
| ARF_HMC_MODEL – path to HMC model JSON (default: models/hmc_model.json) |
| |
| ARF_USE_HYPERPRIORS – true/false |
| |
| API_KEY – optional (currently not enforced) |
| ``` |
|
|
| 3. **Run the app locally**: |
|
|
| ```bash |
| uvicorn app.main:app --reload --port 8000 |
| ``` |
|
|
| 4. **Health check**: |
|
|
| ```bash |
| curl http://localhost:8000/health |
| ``` |
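
The environment variables from step 2 could be read along these lines. This is a minimal sketch, not the application's actual configuration code: the `Settings` class and `load_settings` function are hypothetical names, the `false` default for `ARF_USE_HYPERPRIORS` is an assumption (the README only documents the default for `ARF_HMC_MODEL`), and the sketch assumes the `.env` file has already been loaded into the process environment.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the three documented variables."""
    hmc_model_path: str
    use_hyperpriors: bool
    api_key: Optional[str]


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    # Assumption: an unset flag means "false"; the README does not say.
    raw_flag = env.get("ARF_USE_HYPERPRIORS", "false")
    return Settings(
        # Documented default: models/hmc_model.json
        hmc_model_path=env.get("ARF_HMC_MODEL", "models/hmc_model.json"),
        use_hyperpriors=raw_flag.strip().lower() == "true",
        # API_KEY is optional and currently not enforced by the API.
        api_key=env.get("API_KEY") or None,
    )
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to exercise in tests, for example `load_settings({"ARF_USE_HYPERPRIORS": "true"})`.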
|
|
| ## Causal Explainer Endpoint |
|
|
| The ARF API includes a heuristic causal explainer that evaluates the impact of proposed healing actions using deterministic rules. This module provides counterfactual reasoning without requiring a fitted causal model or external ML dependencies. |
|
|
| The explainer estimates how system metrics such as latency would change if a different action were taken. |
|
|
| ### Mathematical Model |
|
|
| The counterfactual outcome is computed as: |
|
|
| ```text |
| counterfactual_outcome = factual_outcome * (1 + effect_frac) |
| ``` |
|
|
| Where: |
|
|
| - `effect_frac` is a predefined impact factor based on the action type |
| - effects are multiplicative: the factual outcome is scaled by `1 + effect_frac`, so a negative `effect_frac` lowers the metric |
| - a fixed ±10% uncertainty interval is applied to the estimated outcome |
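
The formula can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the module's actual code: the `EFFECT_FRAC` table and the `counterfactual` function are hypothetical, the `-0.15` value for `restart_container` is inferred from the example response in this README (600 to 510), and the interval is assumed to be symmetric around the estimate.

```python
import math

# Hypothetical effect table. Only restart_container's value is implied by
# the README's example (600 -> 510); no_action = 0.0 is an assumed baseline.
EFFECT_FRAC = {
    "no_action": 0.0,
    "restart_container": -0.15,
}

UNCERTAINTY = 0.10  # fixed +/-10% band described above


def counterfactual(factual_outcome: float, action: str) -> dict:
    """Apply the multiplicative heuristic to a single metric value."""
    effect_frac = EFFECT_FRAC[action]
    estimate = factual_outcome * (1 + effect_frac)
    return {
        "factual_outcome": factual_outcome,
        "counterfactual_outcome": estimate,
        "effect": estimate - factual_outcome,
        # Assumption: the band is taken symmetrically around the estimate.
        "interval": (estimate * (1 - UNCERTAINTY), estimate * (1 + UNCERTAINTY)),
    }


result = counterfactual(600.0, "restart_container")
```

Because the effect is a fraction of the factual value, the same action removes more absolute latency from a slow service than from a fast one.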
|
|
| ### Example Request |
|
|
| ```bash |
| curl -X POST "http://localhost:8000/api/v1/v1/incidents/evaluate" -H "Content-Type: application/json" -d '{ |
| "component": "checkout-service", |
| "latency_p99": 600, |
| "error_rate": 0.2, |
| "service_mesh": "default" |
| }' |
| ``` |
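
The same request can be issued from Python using only the standard library. The sketch below mirrors the `curl` call above, including its URL; `build_evaluate_request` and `evaluate_incident` are hypothetical helper names, and the 10-second timeout is an arbitrary choice.

```python
import json
import urllib.request

# URL copied from the curl example above.
EVALUATE_URL = "http://localhost:8000/api/v1/v1/incidents/evaluate"

PAYLOAD = {
    "component": "checkout-service",
    "latency_p99": 600,
    "error_rate": 0.2,
    "service_mesh": "default",
}


def build_evaluate_request(payload: dict, url: str = EVALUATE_URL) -> urllib.request.Request:
    """Build the POST request without sending it."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def evaluate_incident(payload: dict, url: str = EVALUATE_URL, timeout: float = 10.0) -> dict:
    """Send the request to a running server and decode the JSON body."""
    request = build_evaluate_request(payload, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the app running locally, `evaluate_incident(PAYLOAD)` returns the decoded response as a dictionary. Building the request separately from sending it makes the payload and headers inspectable without a live server.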
|
|
| ### Example Response |
|
|
| ```json |
| { |
| "healing_intent": { |
| "action": "restart_container", |
| "component": "checkout-service", |
| "parameters": {}, |
| "justification": "Causal: If we apply restart_container instead of no_action, latency would change from 600.00 to 510.00 (Δ = -90.00). Based on heuristic causal model.", |
| "confidence": 0.85, |
| "risk_score": 0.54, |
| "status": "oss_advisory_only" |
| }, |
| "causal_explanation": { |
| "factual_outcome": 600, |
| "counterfactual_outcome": 510, |
| "effect": -90, |
| "explanation_text": "If we apply restart_container instead of no_action, latency would change from 600.00 to 510.00 (Δ = -90.00). Based on heuristic causal model.", |
| "is_model_based": false, |
| "warnings": [ |
| "Using heuristic causal model (no fitted SCM)." |
| ] |
| }, |
| "utility_decision": { |
| "best_action": "restart_container", |
| "expected_utility": 0.5, |
| "explanation": "Heuristic decision based on latency/error thresholds" |
| } |
| } |
| ``` |
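
A client may want to sanity-check a response before acting on it. The following sketch assumes the response shape shown above; `check_causal_explanation` is a hypothetical helper, not part of the API, and `SAMPLE` is a trimmed copy of the example response.

```python
import math


def check_causal_explanation(response: dict, tolerance: float = 1e-6) -> dict:
    """Verify internal consistency of an evaluate response."""
    causal = response["causal_explanation"]
    factual = causal["factual_outcome"]
    expected_effect = causal["counterfactual_outcome"] - factual
    return {
        # effect should equal counterfactual minus factual
        "effect_consistent": math.isclose(
            causal["effect"], expected_effect, abs_tol=tolerance
        ),
        "advisory_only": response["healing_intent"]["status"] == "oss_advisory_only",
        "actions_agree": (
            response["healing_intent"]["action"]
            == response["utility_decision"]["best_action"]
        ),
        # Implied effect_frac; undefined when the factual outcome is zero.
        "effect_frac": expected_effect / factual if factual else None,
    }


# Trimmed copy of the example response above.
SAMPLE = {
    "healing_intent": {"action": "restart_container", "status": "oss_advisory_only"},
    "causal_explanation": {
        "factual_outcome": 600,
        "counterfactual_outcome": 510,
        "effect": -90,
    },
    "utility_decision": {"best_action": "restart_container"},
}

report = check_causal_explanation(SAMPLE)
```

For the example response, the implied `effect_frac` works out to -90 / 600, which matches a 15% latency reduction.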
|
|
| ### Important Notes |
|
|
| - This endpoint is advisory only (`healing_intent.status` is `oss_advisory_only`) |
| - No Structural Causal Model (SCM) is fitted |
| - No machine learning models are used |
| - All effects are based on predefined heuristics |
|
|
| ## Tests |
|
|
| Run `pytest`. Tests use a temporary SQLite DB (`sqlite:///./test.db`) created by the test fixtures. |
|
|
| ## Notes |
|
|
| - The governance endpoints use an in-process `RiskEngine` initialized at startup. |
| - The outcome recording endpoint is not implemented in this repository and returns HTTP 501. |
|
|
|
|