# advanced-features

## overview

This document describes advanced platform capabilities that go beyond baseline extraction.

## 1-self-improving-agent

Post-episode learning loop:

- classify failures by root cause
- update selector/tool strategy priors
- persist successful patterns with confidence
- penalize repeated failure paths
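
The loop above could be sketched as a small confidence store that is updated after each episode. This is a minimal illustration, not the platform's implementation: `StrategyPriors`, the learning rate, the starting confidence of 0.5, and the repeat penalty are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StrategyPriors:
    """Hypothetical per-strategy confidence store, updated post-episode."""
    learning_rate: float = 0.2   # assumed value
    repeat_penalty: float = 0.1  # assumed value
    confidence: dict = field(default_factory=lambda: defaultdict(lambda: 0.5))
    failure_counts: dict = field(default_factory=lambda: defaultdict(int))

    def record(self, strategy: str, success: bool,
               root_cause: Optional[str] = None) -> float:
        target = 1.0 if success else 0.0
        current = self.confidence[strategy]
        # Move confidence toward the observed outcome (moving average).
        updated = current + self.learning_rate * (target - current)
        if not success:
            # Failures are classified by (strategy, root cause).
            key = (strategy, root_cause or "unknown")
            self.failure_counts[key] += 1
            # Each repeat of the same failure path costs extra confidence.
            repeats = self.failure_counts[key] - 1
            updated -= self.repeat_penalty * repeats
        self.confidence[strategy] = min(1.0, max(0.0, updated))
        return self.confidence[strategy]


priors = StrategyPriors()
priors.record("css-selector", success=True)
priors.record("xpath", success=False, root_cause="selector_miss")
```

Keying failures by root cause means a strategy is only penalized extra when it fails the same way again, not when it hits an unrelated problem.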

## 2-strategy-library

Built-in strategies:

- Search-first
- Direct extraction
- Multi-hop reasoning
- Verification-first
- Table-first

Each strategy tracks:

- win rate
- cost per success
- average latency
- domain affinity
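
The four tracked metrics can be derived from a handful of counters. The sketch below assumes a hypothetical `StrategyStats` record. Defining domain affinity as the share of a strategy's wins that came from a given domain is also an assumption, since the document does not define it.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StrategyStats:
    """Hypothetical running counters for one strategy."""
    name: str
    attempts: int = 0
    wins: int = 0
    total_cost: float = 0.0
    total_latency_s: float = 0.0
    domain_wins: dict = field(default_factory=dict)

    def record(self, domain: str, success: bool,
               cost: float, latency_s: float) -> None:
        self.attempts += 1
        self.total_cost += cost
        self.total_latency_s += latency_s
        if success:
            self.wins += 1
            self.domain_wins[domain] = self.domain_wins.get(domain, 0) + 1

    @property
    def win_rate(self) -> float:
        return self.wins / self.attempts if self.attempts else 0.0

    @property
    def cost_per_success(self) -> Optional[float]:
        # All spend (including failed attempts) divided by successes.
        return self.total_cost / self.wins if self.wins else None

    @property
    def average_latency_s(self) -> float:
        return self.total_latency_s / self.attempts if self.attempts else 0.0

    def domain_affinity(self, domain: str) -> float:
        # Assumed definition: fraction of this strategy's wins on `domain`.
        return self.domain_wins.get(domain, 0) / self.wins if self.wins else 0.0
```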

## 3-explainable-ai-mode

For every decision, provide:

- selected action and confidence
- top alternatives considered
- evidence from memory/tools/search
- expected reward impact
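
One possible shape for such an explanation is a small record built when the action is chosen. `DecisionTrace` and `build_trace` are hypothetical names. The sketch assumes each candidate action already carries a confidence score and that the expected reward impact is computed elsewhere.

```python
from dataclasses import dataclass


@dataclass
class DecisionTrace:
    """Hypothetical explanation record for a single decision."""
    action: str
    confidence: float
    alternatives: list            # [(action, confidence), ...]
    evidence: list                # free-text notes from memory/tools/search
    expected_reward_impact: float


def build_trace(candidates: dict, evidence: list,
                expected_reward_impact: float, top_k: int = 2) -> DecisionTrace:
    # Rank candidate actions by confidence, highest first.
    ranked = sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)
    action, confidence = ranked[0]
    return DecisionTrace(
        action=action,
        confidence=confidence,
        alternatives=ranked[1:1 + top_k],
        evidence=list(evidence),
        expected_reward_impact=expected_reward_impact,
    )


trace = build_trace(
    {"click_next": 0.7, "extract_table": 0.2, "search": 0.1},
    evidence=["memory: pagination seen on this domain"],
    expected_reward_impact=0.15,
)
```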

## 4-human-in-the-loop

Intervention controls:

- approve/reject action
- force tool/model switch
- enforce verification before submit
- set hard constraints during runtime
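
These controls can be modelled as a gate that every proposed action passes through before it runs. The sketch below is illustrative only: `InterventionPolicy`, `gate`, and the `"submit"` action name are assumptions, and the forced tool/model switch is left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable


def always_approve(action: str) -> bool:
    return True


@dataclass
class InterventionPolicy:
    """Hypothetical runtime policy set by a human reviewer."""
    approve: Callable[[str], bool] = always_approve
    blocked_actions: set = field(default_factory=set)  # hard constraints
    require_verification: bool = False


def gate(action: str, verified: bool, policy: InterventionPolicy) -> str:
    # Hard constraints are checked first and cannot be overridden.
    if action in policy.blocked_actions:
        return "rejected: hard constraint"
    # Optionally enforce verification before a submit.
    if action == "submit" and policy.require_verification and not verified:
        return "rejected: verification required"
    # Finally, ask the reviewer callback.
    if not policy.approve(action):
        return "rejected: reviewer"
    return "approved"


policy = InterventionPolicy(blocked_actions={"delete_record"},
                            require_verification=True)
gate("submit", verified=False, policy=policy)
```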

## 5-scenario-simulator

Stress testing scenarios:

- noisy HTML
- broken DOM
- pagination traps
- conflicting facts
- anti-scraping patterns

Outputs:

- robustness score
- recovery score
- strategy suitability map
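
As a rough illustration of how a robustness score might be computed, the sketch below perturbs one input under several scenarios and reports the fraction in which a toy extractor still returns the expected value. The extractor, the perturbations, and the scoring formula are all assumptions. The recovery score and suitability map are not modelled here.

```python
import re


def extract_price(html: str):
    """Toy extractor: returns the first dollar amount found, or None."""
    match = re.search(r"\$(\d+(?:\.\d{2})?)", html)
    return match.group(1) if match else None


# Hypothetical perturbations, one per stress scenario.
SCENARIOS = {
    "clean": lambda html: html,
    "noisy_html": lambda html: "<!-- ad -->" + html + "<div>&nbsp;</div>",
    "broken_dom": lambda html: html.replace("</span>", ""),
    "conflicting_facts": lambda html: "<span>$1.00</span>" + html,
}


def robustness_score(extractor, html: str, expected: str):
    # Run the extractor on every perturbed input and compare to the truth.
    outcomes = {
        name: extractor(perturb(html)) == expected
        for name, perturb in SCENARIOS.items()
    }
    return sum(outcomes.values()) / len(outcomes), outcomes


score, outcomes = robustness_score(
    extract_price, '<span class="price">$19.99</span>', "19.99"
)
```

Here the toy extractor fails only the `conflicting_facts` scenario, because it trusts the first price it sees.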

## 6-context-compression

- rolling summaries
- salience-based pruning
- token-aware context packing
- differential memory refresh
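
Salience-based pruning and token-aware packing can be combined into one greedy pass: take items in order of salience until the token budget is spent, then restore the original order. This is a sketch under stated assumptions. `pack_context` is a hypothetical helper, salience scores are supplied by the caller, and whitespace splitting stands in for a real tokenizer.

```python
def count_tokens(text: str) -> int:
    # Stand-in tokenizer: a real system would use the model's tokenizer.
    return len(text.split())


def pack_context(items, budget_tokens: int):
    """items: list of (salience, text). Returns the kept texts in original order."""
    kept = []
    used = 0
    # Visit items from most to least salient.
    by_salience = sorted(enumerate(items), key=lambda p: p[1][0], reverse=True)
    for index, (_salience, text) in by_salience:
        cost = count_tokens(text)
        if used + cost <= budget_tokens:
            kept.append(index)
            used += cost
    # Restore document order so the packed context still reads naturally.
    return [items[i][1] for i in sorted(kept)]


context = pack_context(
    [
        (0.9, "price is 42 USD"),
        (0.1, "cookie banner text repeated many times over"),
        (0.7, "ships in 3 days"),
    ],
    budget_tokens=8,
)
# context == ["price is 42 USD", "ships in 3 days"]
```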

## 7-batch-parallel-runtime

- task queue with priorities
- parallel extraction workers
- bounded concurrency
- idempotent retry handling
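
A minimal sketch of these four properties using only the standard library's `asyncio`: a priority queue feeds a fixed number of workers (the concurrency bound), a `seen` set skips duplicate task IDs, and each task is retried a limited number of times. `run_batch` and its parameters are hypothetical. Payloads are assumed to be mutually comparable (for example strings), because `PriorityQueue` compares whole tuples.

```python
import asyncio


async def run_batch(tasks, worker, max_concurrency=2, max_retries=2):
    """tasks: iterable of (priority, task_id, payload); lower priority runs first."""
    queue = asyncio.PriorityQueue()
    for item in tasks:
        queue.put_nowait(item)
    results = {}
    seen = set()

    async def consume():
        while True:
            try:
                _priority, task_id, payload = queue.get_nowait()
            except asyncio.QueueEmpty:
                return
            # Idempotency: a task ID is processed at most once per batch.
            if task_id in seen:
                continue
            seen.add(task_id)
            for attempt in range(max_retries + 1):
                try:
                    results[task_id] = await worker(payload)
                    break
                except Exception:
                    if attempt == max_retries:
                        results[task_id] = None  # retries exhausted

    # Bounded concurrency: exactly max_concurrency consumers run in parallel.
    await asyncio.gather(*(consume() for _ in range(max_concurrency)))
    return results


async def shout(payload):
    return payload.upper()


results = asyncio.run(run_batch([(1, "t1", "a"), (0, "t2", "b")], shout))
```

Using a fixed pool of consumers rather than one task per item keeps memory and connection usage flat regardless of batch size.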

## 8-prompt-versioning-and-evaluation

- versioned prompt templates
- A/B testing by task type
- reward/cost comparison dashboards
- rollout and rollback controls
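
A/B assignment and rollout control can share one mechanism: hash the task into a stable bucket in `[0, 1)` and compare it against a rollout fraction. The sketch below is illustrative. `assign_variant`, the variant labels, and the use of SHA-256 for bucketing are assumptions rather than the platform's actual scheme.

```python
import hashlib


def assign_variant(task_id: str, task_type: str,
                   control: str = "v1", candidate: str = "v2",
                   rollout: float = 0.5) -> str:
    """Deterministically pick a prompt version for a task.

    rollout=0.0 sends everything to `control` (a full rollback);
    rollout=1.0 sends everything to `candidate` (a full rollout).
    """
    digest = hashlib.sha256(f"{task_type}:{task_id}".encode("utf-8")).digest()
    # Map the first 4 bytes to a bucket in [0, 1).
    bucket = int.from_bytes(digest[:4], "big") / 2**32
    return candidate if bucket < rollout else control


variant = assign_variant("task-123", "table-extraction", rollout=0.1)
```

Because the bucket depends only on the task type and ID, a given task always sees the same prompt version for a fixed rollout value, which keeps reward and cost comparisons clean.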

## 9-mcp-toolchain-composition

Composable flow examples:

- Browser MCP -> Parser MCP -> Validator MCP -> DB MCP
- Search MCP -> Fetch MCP -> Extract MCP -> Verify MCP
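
Structurally, each flow is a left-to-right pipeline in which one stage's output becomes the next stage's input. The sketch below shows only that composition pattern. The four stage functions are local stand-ins with invented behavior, not real MCP clients or server calls.

```python
import re

STORE = []  # stand-in for a database


def browser(url: str) -> str:
    # Stand-in for a Browser MCP call: returns canned HTML.
    return f"<h1>Title for {url}</h1>"


def parser(html: str) -> dict:
    match = re.search(r"<h1>(.*?)</h1>", html)
    return {"title": match.group(1) if match else ""}


def validator(record: dict) -> dict:
    if not record.get("title"):
        raise ValueError("missing title")
    return record


def db(record: dict) -> dict:
    STORE.append(record)
    return record


def compose(*stages):
    """Chain stages so each receives the previous stage's output."""
    def run(value):
        for stage in stages:
            value = stage(value)
        return value
    return run


pipeline = compose(browser, parser, validator, db)
record = pipeline("https://example.com")
```

A failing stage raises and stops the chain, so nothing invalid reaches the final storage step.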

## 10-governance-and-safety

- tool allowlist/denylist
- PII redaction in logs
- budget and rate guardrails
- provenance tracking for extracted facts
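
Two of these guardrails are easy to illustrate: a tool check in which the denylist always wins, and a log filter that masks e-mail addresses. The names below are hypothetical, and the single regular expression covers only simple e-mail patterns. Real PII redaction needs broader coverage.

```python
import re

# Simplified pattern: matches common e-mail shapes only.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def is_tool_allowed(tool: str, allowlist: set, denylist: set) -> bool:
    # Denylist takes precedence; otherwise the tool must be explicitly allowed.
    if tool in denylist:
        return False
    return tool in allowlist


def redact(message: str) -> str:
    """Mask e-mail addresses before the message is written to logs."""
    return EMAIL.sub("[REDACTED_EMAIL]", message)


redact("contact jane.doe@example.com now")
# -> "contact [REDACTED_EMAIL] now"
```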

## feature-flags

Every advanced feature should be toggleable from Settings. Features with a high cost or latency impact should be disabled by default.
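
A minimal sketch of default-off flags with explicit overrides follows. The flag names and which features count as high cost are assumptions made for illustration. The document does not specify either.

```python
# Hypothetical defaults: features assumed to be costly start disabled.
DEFAULT_FLAGS = {
    "self_improving_agent": False,
    "explainable_ai_mode": False,
    "scenario_simulator": False,
    "context_compression": True,
}


def resolve_flags(overrides=None) -> dict:
    """Merge user overrides (e.g. from Settings) onto the defaults."""
    flags = dict(DEFAULT_FLAGS)
    for name, enabled in (overrides or {}).items():
        if name not in flags:
            # Reject typos instead of silently creating a new flag.
            raise KeyError(f"unknown feature flag: {name}")
        flags[name] = bool(enabled)
    return flags


flags = resolve_flags({"explainable_ai_mode": True})
```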

## api-driven-feature-map

| feature-domain | endpoint-surface |
| --- | --- |
| agent planning and execution | `/api/agents/run`, `/api/agents/plan`, `/api/agents/message` |
| dynamic scraping | `/api/scrape/stream`, `/api/scrape/`, `/api/scrape/sessions` |
| memory operations | `/api/memory/store`, `/api/memory/query`, `/api/memory/consolidate` |
| tool and plugin usage | `/api/tools/registry`, `/api/plugins/tools`, `/api/plugins/install` |
| model and provider controls | `/api/settings/model`, `/api/providers/models/all`, `/api/providers/costs/summary` |

See `api-reference.md` for full endpoint signatures.

## document-metadata

| key | value |
| --- | --- |
| document | `features.md` |
| status | active |

## document-flow

```mermaid
flowchart TD
    A[document] --> B[key-sections]
    B --> C[implementation]
    B --> D[operations]
    B --> E[validation]
```