# arf-api

ARF API Control Plane (FastAPI)

## Live Demo

The API is deployed and accessible at:
- **Base URL**: [https://a-r-f-agentic-reliability-framework-api.hf.space](https://a-r-f-agentic-reliability-framework-api.hf.space)
- **Interactive Documentation**: [https://a-r-f-agentic-reliability-framework-api.hf.space/docs](https://a-r-f-agentic-reliability-framework-api.hf.space/docs)

## Quick Start (Local Development)

1. **Install dependencies**:
```bash
pip install -r requirements.txt
```

Note: `requirements.txt` installs `agentic-reliability-framework` directly from the project's Git repository.

2. **Set environment variables** (optional, in `.env`):

```text
ARF_HMC_MODEL – path to HMC model JSON (default: models/hmc_model.json)
ARF_USE_HYPERPRIORS – true/false
API_KEY – optional (currently not enforced)
```

3. **Run the app locally**:

```bash
uvicorn app.main:app --reload --port 8000
```

4. **Health check**:

```bash
curl http://localhost:8000/health
```
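The environment variables from step 2 could be read along these lines. This is only a sketch: `load_settings` and the returned keys are hypothetical names, not taken from the repository, and the fallback of `false` for `ARF_USE_HYPERPRIORS` is an assumption, since no default is documented for it.

```python
import os


def load_settings(env=None):
    """Read the documented variables from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    return {
        # Documented default path for the HMC model JSON.
        "hmc_model_path": env.get("ARF_HMC_MODEL", "models/hmc_model.json"),
        # Assumption: an unset variable means disabled; no default is documented.
        "use_hyperpriors": env.get("ARF_USE_HYPERPRIORS", "false").strip().lower() == "true",
        # Optional and currently not enforced by the API.
        "api_key": env.get("API_KEY"),
    }
```

Passing a plain dictionary instead of the real environment keeps the function easy to exercise in tests.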

## Causal Explainer Endpoint

The ARF API includes a heuristic causal explainer that evaluates the impact of proposed healing actions using deterministic rules. This module provides counterfactual reasoning without requiring a fitted causal model or external ML dependencies.

The explainer estimates how system metrics such as latency would change if a different action were taken.

### Mathematical Model

The counterfactual outcome is computed as:

```text
counterfactual_outcome = factual_outcome * (1 + effect_frac)
```

Where:

- `effect_frac` is a predefined impact factor based on the action type
- effects are multiplicative
- a fixed ±10% uncertainty interval is applied to the estimated outcome
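
As a rough illustration of the formula, the sketch below computes the counterfactual outcome and its interval. The function name `counterfactual` is hypothetical, the interval is interpreted here as ±10% of the estimated outcome, and the value `-0.15` for `restart_container` is inferred from the example response in this README (600 to 510) rather than taken from the source code.

```python
def counterfactual(factual_outcome, effect_frac, uncertainty=0.10):
    """Return (estimate, lower, upper) for a multiplicative effect.

    effect_frac is the predefined impact factor for the action;
    uncertainty is the fixed fractional half-width of the interval.
    """
    estimate = factual_outcome * (1 + effect_frac)
    lower = estimate * (1 - uncertainty)
    upper = estimate * (1 + uncertainty)
    return estimate, lower, upper


# Assumed effect_frac of -0.15, inferred from the README example (600 -> 510).
estimate, lower, upper = counterfactual(600, -0.15)
```

With these inputs the estimate is about 510 and the interval runs from roughly 459 to 561.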

### Example Request

```bash
curl -X POST "http://localhost:8000/api/v1/v1/incidents/evaluate" \
  -H "Content-Type: application/json" \
  -d '{
    "component": "checkout-service",
    "latency_p99": 600,
    "error_rate": 0.2,
    "service_mesh": "default"
  }'
```
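
The same request can be made from Python using only the standard library. This is a minimal sketch: the helper names `build_evaluate_request` and `evaluate_incident` are hypothetical, and the path is copied verbatim from the curl example above.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"
# Path copied verbatim from the curl example.
EVALUATE_PATH = "/api/v1/v1/incidents/evaluate"


def build_evaluate_request(payload, base_url=BASE_URL):
    """Build the POST request without sending it."""
    return urllib.request.Request(
        base_url + EVALUATE_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def evaluate_incident(payload, base_url=BASE_URL, timeout=10):
    """Send the request and decode the JSON response (needs a running server)."""
    request = build_evaluate_request(payload, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `evaluate_incident({"component": "checkout-service", "latency_p99": 600, "error_rate": 0.2, "service_mesh": "default"})` against a locally running app should return a dictionary shaped like the example response.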

### Example Response

```json
{
  "healing_intent": {
    "action": "restart_container",
    "component": "checkout-service",
    "parameters": {},
    "justification": "Causal: If we apply restart_container instead of no_action, latency would change from 600.00 to 510.00 (Δ = -90.00). Based on heuristic causal model.",
    "confidence": 0.85,
    "risk_score": 0.54,
    "status": "oss_advisory_only"
  },
  "causal_explanation": {
    "factual_outcome": 600,
    "counterfactual_outcome": 510,
    "effect": -90,
    "explanation_text": "If we apply restart_container instead of no_action, latency would change from 600.00 to 510.00 (Δ = -90.00). Based on heuristic causal model.",
    "is_model_based": false,
    "warnings": [
      "Using heuristic causal model (no fitted SCM)."
    ]
  },
  "utility_decision": {
    "best_action": "restart_container",
    "expected_utility": 0.5,
    "explanation": "Heuristic decision based on latency/error thresholds"
  }
}
```
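
A client may want to sanity-check a response before acting on it. The sketch below assumes the response shape shown above; `check_causal_explanation` is a hypothetical helper, not part of the API.

```python
import json


def check_causal_explanation(response):
    """Check internal consistency of a decoded evaluate response."""
    causal = response["causal_explanation"]
    expected_effect = causal["counterfactual_outcome"] - causal["factual_outcome"]
    return {
        # effect should equal counterfactual minus factual
        "effect_consistent": abs(causal["effect"] - expected_effect) < 1e-9,
        # the OSS build only advises; it does not execute actions
        "advisory_only": response["healing_intent"]["status"] == "oss_advisory_only",
        # no fitted SCM is used, so the result is heuristic
        "heuristic": not causal["is_model_based"],
    }


# Trimmed copy of the example response, keeping only the fields checked here.
sample = json.loads("""
{
  "healing_intent": {"action": "restart_container", "status": "oss_advisory_only"},
  "causal_explanation": {
    "factual_outcome": 600,
    "counterfactual_outcome": 510,
    "effect": -90,
    "is_model_based": false
  }
}
""")
checks = check_causal_explanation(sample)
```

For the example response all three checks come out true.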

### Important Notes

- This endpoint is advisory only (`status = oss_advisory_only`)
- No Structural Causal Model (SCM) is fitted
- No machine learning models are used
- All effects are based on predefined heuristics

## Tests

Run `pytest`. Tests use a temporary SQLite DB (`sqlite:///./test.db`) created by the test fixtures.

## Notes

- The governance endpoints use an in-process `RiskEngine` initialized at startup.
- The outcome recording endpoint is not implemented in this repository and returns HTTP 501.