Commit 33be1ce · 1 parent: a08910d

Deploy 2026-02-03 08:46:33

Files changed:

- README.md (+19 -226)
- src/flow/experiments/models.py (+9 -2)
- src/flow/harness/langgraph/harness.py (+33 -9)
- src/flow/harness/miniagent/harness.py (+108 -9)
- src/flow/ui/schemas/config.py (+10 -3)
README.md
CHANGED

````diff
@@ -1,237 +1,30 @@
-> Flow is an experimental prototype and is changing rapidly.
+---
+title: Flow
+emoji: 🌊
+colorFrom: blue
+colorTo: purple
+sdk: docker
+app_port: 7860
+pinned: false
+---
+
+# Flow
 
 Flow helps you find the best configuration for your AI coding agent. Define your agent spec, provide evaluation tasks, and Flow automatically generates variants, scores them, and shows you the quality vs. cost tradeoffs.
 
 - **Simplified experimentation** — Automates the search for optimal agent configurations
 - **Transparency** — See exactly what was tested, scores, and tradeoffs on a Pareto chart
 - **User control** — Choose your tasks, evaluation criteria, and approve variants
-- **Framework agnostic** — Standardized agent spec with pluggable runtime adapters
-
-## How It Works
-
-```mermaid
-flowchart LR
-    A[Agent Spec] --> D[Optimizer]
-    B[Tasks] --> D
-    C[Evaluator] --> D
-    D --> E[Agent Variants/Candidates]
-    E --> F[Pareto Graph]
-```
-
-## Core Concepts
-
-| Component      | What It Is                                                                          |
-| -------------- | ----------------------------------------------------------------------------------- |
-| **Agent Spec** | Agent configuration (model, tools, compaction, instructions) with pluggable runtime |
-| **Task**       | A coding challenge with evaluation criteria                                         |
-| **Evaluator**  | Scores agent output (LLM-as-Judge, heuristics, or trace-based)                      |
-| **Optimizer**  | Generates variants and runs experiments (GridSearch, extensible)                    |
-
-## Quick Start
-
-### 1. Install
-
-```bash
-git clone https://github.com/victordibia/flow
-cd flow
-uv sync
-```
-
-### 2. Configure
-
-Create a `.env` file in the project root:
-
-```bash
-AZURE_OPENAI_API_KEY=your-api-key-here
-AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
-AZURE_OPENAI_CHAT_DEPLOYMENT_NAME=gpt-4o-mini
-```
-
-**Important:** Make sure your Azure OpenAI deployment has adequate rate limits:
-
-- **Minimum:** 10,000 tokens per minute (TPM)
-- **Recommended:** 30,000+ TPM for optimization runs
-
-See [Azure Portal](https://portal.azure.com) → Your OpenAI resource → Deployments to adjust rate limits.
-
-### 3. Test Your Setup
-
-Before running optimization, verify your Azure OpenAI connection:
-
-```bash
-# Test Azure OpenAI connection
-uv run python scripts/test_azure_connection.py
-
-# Test basic agent execution
-uv run python scripts/test_basic_agent.py
-
-# Test LLM evaluator
-uv run python scripts/test_evaluator.py
-```
-
-All tests should pass with non-zero scores and token counts.
-
-### 4. Run
-
-```bash
-# Launch the web UI
-uv run flow serve
-
-# Or run optimization from the CLI (base agent + variations + tasks)
-uv run flow optimize --agent base.yaml --vary compaction,memory --tasks tasks.jsonl
-```
-
-## Agent Spec
-
-Define your agent configuration:
-
-```python
-from flow.experiments.models import Agent, CompactionConfig
-
-agent = Agent(
-    name="my-agent",
-    framework="maf",  # default; extensible to other runtimes
-    instructions="You are a coding assistant",
-    tools="standard",  # or "minimal", "full", "readonly"
-    compaction=CompactionConfig.head_tail(10, 40),  # keep first 10 + last 40 messages
-)
-```
-
-Flow tests variations like:
-
-- **Compaction strategies** — `none`, `head_tail(N, M)`, `last_n(N)`
-- **Tool configurations** — different tool sets
-- **Instructions** — prompt variations
-
-## Task Format
-
-Tasks are JSONL with evaluation criteria:
-
-```json
-{
-  "name": "fizzbuzz",
-  "prompt": "Create fizzbuzz.py and run it",
-  "criteria": [
-    { "name": "correct", "instruction": "Output shows FizzBuzz pattern" }
-  ]
-}
-```
-
-## Web UI
-
-Launch with `uv run flow serve`. Create agents, import task suites, run optimization jobs, and view results with Pareto analysis. Test agents interactively with live trace streaming.
-
-## CLI Commands
-
-```bash
-# Web UI
-flow serve                                           # Start the web UI
-
-# Optimization
-flow optimize --agent base.yaml --tasks tasks.jsonl  # Optimize base agent
-flow optimize --vary compaction,memory               # Vary specific parameters
-flow optimize --suite coding                         # Use built-in task suite
-
-# Single Task Execution
-flow run "Create hello.py"                           # Run a single task
-flow run --config best.yaml "task"                   # Run with optimized config
-
-# Testing & Diagnostics
-python scripts/test_azure_connection.py              # Test Azure OpenAI connection
-python scripts/test_basic_agent.py                   # Test basic agent execution
-python scripts/test_evaluator.py                     # Test LLM evaluator
-```
-
-## Optimizer
-
-Flow includes multiple optimization strategies for finding the best agent configuration.
-
-### Grid Search (Default)
-
-Test predefined variations of your agent:
-
-```bash
-# Vary compaction and memory settings
-flow optimize --agent examples/base_agent.yaml --vary compaction,memory --tasks examples/coding_tasks.jsonl
-
-# Or define variations in a config file
-flow optimize --config variations.yaml --agent base_agent.yaml --tasks tasks.jsonl
-```
-
-### GEPA (Active Learning)
-
-Use GEPA (Generative Evolutionary Prompt Adjustment) for automatic prompt optimization:
-
-```bash
-# Run GEPA optimization
-flow optimize \
-  --config examples/gepa_strategy.yaml \
-  --agent examples/base_agent.yaml \
-  --tasks examples/coding_tasks.jsonl \
-  --budget 10 \
-  --parallel 2
-```
-
-**GEPA Configuration:**
-
-1. **Strategy Config** (`examples/gepa_strategy.yaml`):
-
-   ```yaml
-   strategy_type: gepa
-   config:
-     reflection_lm: gpt-4o-mini # Model for GEPA's reflection
-   ```
-
-2. **Base Agent** (`examples/base_agent.yaml`):
-
-   ```yaml
-   name: coding-assistant
-   model: gpt-4o-mini # Model for agent execution
-   tools: standard
-   instructions: |
-     Your initial prompt that GEPA will optimize...
-   ```
-
-3. **Run Optimization:**
-   - `--budget`: Number of optimization iterations (default: 10)
-   - `--parallel`: Concurrent evaluations (default: 4)
-   - Tasks must include evaluation criteria for LLM scoring
-
-**Example Output:**
-
-```
-[1/10] coding-assistant_gepa_eval/fibonacci: ✓ score=0.85 tokens=1,245
-[2/10] coding-assistant_gepa_eval/palindrome: ✓ score=0.78 tokens=982
-...
-Best agent exported to: ~/.flow/optimizations/<timestamp>/agents/best_score.yaml
-```
-
-### Requirements for Optimization
-
-- **Azure OpenAI Deployment:** Create a deployment with your chosen model (e.g., `gpt-4o-mini`)
-- **Rate Limits:** Minimum 10K TPM; 30K+ recommended for smooth runs
-- **Task Criteria:** Tasks need evaluation criteria for LLM-based scoring:
-
-  ```json
-  {
-    "name": "task_name",
-    "prompt": "Task description",
-    "criteria": [
-      { "name": "correctness", "instruction": "Solution is correct", "weight": 1.0 },
-      { "name": "quality", "instruction": "Code is clean and documented", "weight": 0.7 }
-    ]
-  }
-  ```
-
-##
-
-uv run ruff check src/ # Linting
-```
-
-##
+- **Framework agnostic** — Standardized agent spec with pluggable runtime adapters
+
+## Usage
+
+1. Create or import an agent configuration
+2. Define evaluation tasks with criteria
+3. Run optimization to generate and test variants
+4. View results on the Pareto chart (quality vs. cost)
+
+## Links
+
+- **GitHub**: [victordibia/flow](https://github.com/victordibia/flow)
+- **Documentation**: See GitHub README for full documentation
````
src/flow/experiments/models.py
CHANGED

````diff
@@ -293,7 +293,9 @@ class Agent:
         description: Human-readable description
         instructions: System prompt / instructions (optional, uses framework default if None)
         instructions_preset: Preset name for instructions ("coding", "benchmark", etc.)
+        llm_config: LLM configuration with provider and model info:
+            {"provider": "azure|openai|anthropic", "model": "gpt-4o"}
+            If None, auto-detects from environment variables.
         compaction: Compaction strategy configuration
         tools: Tool configuration - can be:
             - str: Preset name ("standard", "minimal", "full", "readonly")
@@ -306,7 +308,7 @@ class Agent:
     description: str = ""
     instructions: str | None = None
     instructions_preset: str | None = None  # e.g., "coding", "benchmark", "research"
-    model: str | None = None
+    llm_config: dict[str, Any] | None = None  # {"provider": "azure", "model": "gpt-4o"}
     compaction: CompactionConfig = field(default_factory=CompactionConfig)
     tools: str | list[str] | dict[str, dict[str, Any]] = "standard"
 
@@ -487,6 +489,11 @@ class GridSearchStrategy:
                     name_parts.append(f"tools=[{len(v)}]")
                 else:
                     name_parts.append(f"tools=[{len(v)}]")
+            elif k == "llm_config" and isinstance(v, dict):
+                # Format llm_config as provider/model
+                provider = v.get("provider", "unknown")
+                model = v.get("model", "")
+                name_parts.append(f"{provider}/{model}" if model else provider)
             elif isinstance(v, bool):
                 name_parts.append(f"{k}={'on' if v else 'off'}")
             else:
````
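For reference, here is a minimal sketch of the new field in use, together with a standalone mirror of the variant-naming branch added above. The `Agent` import path follows the repository README; the `format_llm_config` helper is illustrative and not part of this commit.

```python
# Minimal sketch (not from this commit): an Agent carrying the new
# llm_config dict, plus a standalone mirror of the added naming branch.
from flow.experiments.models import Agent  # import path as shown in the README

agent = Agent(
    name="my-agent",
    instructions="You are a coding assistant",
    llm_config={"provider": "azure", "model": "gpt-4o"},  # replaces the old `model` field
)

def format_llm_config(v: dict) -> str:
    # Same logic as the new GridSearchStrategy branch: "provider/model".
    provider = v.get("provider", "unknown")
    model = v.get("model", "")
    return f"{provider}/{model}" if model else provider

print(format_llm_config(agent.llm_config))  # -> azure/gpt-4o
```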
src/flow/harness/langgraph/harness.py
CHANGED

````diff
@@ -77,8 +77,8 @@ class LangGraphHarness(BaseHarness):
         memory_path.mkdir(parents=True, exist_ok=True)
         tools = build_langgraph_tools(tools_spec, workspace, memory_path)
 
-        # Create model
-        model = cls._create_model(agent.model)
+        # Create model from llm_config
+        model = cls._create_model(agent.llm_config)
 
         # Create compaction hook if enabled
         pre_model_hook = None
@@ -100,22 +100,46 @@ class LangGraphHarness(BaseHarness):
         return cls(graph=graph, agent_name=agent.name, workspace=workspace)
 
     @staticmethod
-    def _create_model(
-        """Create a LangChain chat model from
+    def _create_model(llm_config: dict[str, Any] | None):
+        """Create a LangChain chat model from llm_config.
 
         Args:
+            llm_config: LLM config dict with provider and model keys
 
         Returns:
            A LangChain chat model instance
        """
        import os
 
+        if llm_config:
+            provider = llm_config.get("provider", "").lower()
+            model = llm_config.get("model", "gpt-4o")
+
+            if provider == "openai":
+                from langchain_openai import ChatOpenAI
+
+                return ChatOpenAI(
+                    model=model,
+                    api_key=os.environ.get("OPENAI_API_KEY"),
+                )
+
+            elif provider in ("azure", "azure_openai"):
+                from langchain_openai import AzureChatOpenAI
+
+                return AzureChatOpenAI(
+                    deployment_name=model,
+                    api_key=os.environ.get("AZURE_OPENAI_API_KEY"),
+                    azure_endpoint=os.environ.get("AZURE_OPENAI_ENDPOINT"),
+                    api_version=os.environ.get("AZURE_OPENAI_API_VERSION", "2024-02-15-preview"),
+                )
+
+            elif provider == "anthropic":
+                from langchain_anthropic import ChatAnthropic
+
+                return ChatAnthropic(
+                    model=model,
+                    api_key=os.environ.get("ANTHROPIC_API_KEY"),
+                )
 
         # Default: Azure OpenAI from environment
         from langchain_openai import AzureChatOpenAI
````
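A quick sketch of the provider dispatch this adds. It assumes the module is importable as `flow.harness.langgraph.harness`, that the corresponding `langchain_*` packages are installed, and that credentials for each provider are present in the environment; it is illustrative, not part of the commit.

```python
# Illustrative sketch: exercising _create_model's provider dispatch.
# Each call requires the matching env vars and packages to be present.
from flow.harness.langgraph.harness import LangGraphHarness  # assumed module path

for cfg in (
    {"provider": "openai", "model": "gpt-4o"},                         # -> ChatOpenAI
    {"provider": "azure", "model": "my-gpt4o-deployment"},             # -> AzureChatOpenAI ("model" is the deployment name)
    {"provider": "anthropic", "model": "claude-3-5-sonnet-20241022"},  # -> ChatAnthropic
    None,                                                              # -> env-based AzureChatOpenAI default
):
    chat_model = LangGraphHarness._create_model(cfg)
    print(type(chat_model).__name__)
```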
src/flow/harness/miniagent/harness.py
CHANGED

````diff
@@ -87,23 +87,24 @@ class MiniAgentHarness(BaseHarness):
         tools_spec = resolve_tools(agent.tools)
         tools = cls._build_tools(tools_spec, workspace)
 
-        # 3. Create OTEL hooks for trace collection
-        from .otel import create_otel_hooks
-        otel_hooks = create_otel_hooks(model=agent.model or "gpt-4o")
-
-        # 4. Create ChatClient from LLM config or env
+        # 3. Create ChatClient from LLM config or env
         from .client import ClientConfig
         if llm_config is not None:
-            # Use provided LLM config
+            # Use provided LLM config (from Flow's config system)
             config = cls._create_client_config_from_llm_config(llm_config)
+        elif agent.llm_config:
+            # Use agent's llm_config dict
+            config = cls._create_client_config_from_dict(agent.llm_config)
         else:
-            # Fall back to env vars
+            # Fall back to env vars auto-detection
             config = ClientConfig.from_env()
-            if agent.model:
-                config.model = agent.model
 
         chat_client = ChatClient(config)
 
+        # 4. Create OTEL hooks for trace collection
+        from .otel import create_otel_hooks
+        otel_hooks = create_otel_hooks(model=config.model)
+
         # Resolve instructions: explicit > preset > default "coding"
         if agent.instructions:
             instructions = agent.instructions
@@ -173,6 +174,104 @@
                 f"Supported: openai, azure_openai, custom"
             )
 
+    @classmethod
+    def _create_client_config_from_dict(
+        cls, llm_config: dict[str, Any]
+    ) -> "ClientConfig":
+        """Create ClientConfig from agent's llm_config dict.
+
+        Supports a simple format for YAML configuration:
+            llm_config:
+                provider: azure   # or openai, anthropic
+                model: gpt-4o     # model/deployment name
+
+        Reads credentials from environment variables based on provider.
+
+        Args:
+            llm_config: Dict with 'provider' and 'model' keys
+
+        Returns:
+            ClientConfig for the specified provider
+
+        Raises:
+            ValueError: If required fields or env vars are missing
+        """
+        import os
+        from .client import ClientConfig
+
+        provider = llm_config.get("provider", "").lower()
+        model = llm_config.get("model")
+
+        if not provider:
+            raise ValueError("llm_config requires 'provider' field")
+
+        if provider in ("azure", "azure_openai"):
+            # Azure OpenAI - requires endpoint and deployment
+            endpoint = llm_config.get("endpoint") or os.environ.get("AZURE_OPENAI_ENDPOINT")
+            api_key = llm_config.get("api_key") or os.environ.get("AZURE_OPENAI_API_KEY")
+            deployment = model or os.environ.get("AZURE_OPENAI_DEPLOYMENT", "gpt-4o")
+            api_version = llm_config.get("api_version") or os.environ.get(
+                "AZURE_OPENAI_API_VERSION", "2024-02-15-preview"
+            )
+
+            if not endpoint:
+                raise ValueError(
+                    "AZURE_OPENAI_ENDPOINT env var required for azure provider"
+                )
+            if not api_key:
+                raise ValueError(
+                    "AZURE_OPENAI_API_KEY env var required for azure provider"
+                )
+
+            return ClientConfig(
+                api_key=api_key,
+                model=deployment,
+                endpoint=endpoint,
+                api_version=api_version,
+            )
+
+        elif provider == "openai":
+            # Standard OpenAI
+            api_key = llm_config.get("api_key") or os.environ.get("OPENAI_API_KEY")
+            model_name = model or os.environ.get("OPENAI_MODEL", "gpt-4o")
+            base_url = llm_config.get("base_url")
+
+            if not api_key:
+                raise ValueError(
+                    "OPENAI_API_KEY env var required for openai provider"
+                )
+
+            return ClientConfig(
+                api_key=api_key,
+                model=model_name,
+                endpoint=base_url,
+            )
+
+        elif provider == "anthropic":
+            # Anthropic Claude - use OpenAI-compatible endpoint
+            api_key = llm_config.get("api_key") or os.environ.get("ANTHROPIC_API_KEY")
+            model_name = model or "claude-3-5-sonnet-20241022"
+            base_url = llm_config.get("base_url") or os.environ.get(
+                "ANTHROPIC_BASE_URL", "https://api.anthropic.com/v1"
+            )
+
+            if not api_key:
+                raise ValueError(
+                    "ANTHROPIC_API_KEY env var required for anthropic provider"
+                )
+
+            return ClientConfig(
+                api_key=api_key,
+                model=model_name,
+                endpoint=base_url,
+            )
+
+        else:
+            raise ValueError(
+                f"Unknown provider: {provider}. "
+                f"Supported: azure, openai, anthropic"
+            )
+
     @classmethod
     def _create_context_strategy(cls, agent: "Agent") -> ContextStrategy:
         """Map Flow's CompactionConfig to MiniAgent's ContextStrategy."""
````
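The net effect of `_create_client_config_from_dict` is that YAML configs name only a provider and model while credentials stay in environment variables. A sketch of that contract, with placeholder values and an assumed module path:

```python
# Sketch only: the env-var contract of _create_client_config_from_dict.
# Placeholder values; the module path is assumed from the file layout.
import os
from flow.harness.miniagent.harness import MiniAgentHarness

os.environ.setdefault("AZURE_OPENAI_ENDPOINT", "https://your-resource.openai.azure.com/")
os.environ.setdefault("AZURE_OPENAI_API_KEY", "your-api-key-here")

config = MiniAgentHarness._create_client_config_from_dict(
    {"provider": "azure", "model": "gpt-4o"}  # "model" becomes the Azure deployment name
)
print(config.model, config.endpoint)

# Missing env vars raise early instead of failing at request time:
os.environ.pop("ANTHROPIC_API_KEY", None)
try:
    MiniAgentHarness._create_client_config_from_dict({"provider": "anthropic"})
except ValueError as e:
    print(e)  # ANTHROPIC_API_KEY env var required for anthropic provider
```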
src/flow/ui/schemas/config.py
CHANGED

````diff
@@ -15,6 +15,13 @@ class CompactionConfigSchema(BaseModel):
     params: dict[str, Any] = {"head_size": 10, "tail_size": 40}
 
 
+class LLMConfigSchema(BaseModel):
+    """LLM configuration with provider and model."""
+
+    provider: str = "azure"  # azure, openai, anthropic
+    model: str = "gpt-4o"
+
+
 class AgentCreate(BaseModel):
     """Request schema for creating an agent.
 
@@ -33,7 +40,7 @@ class AgentCreate(BaseModel):
     description: str = ""
     framework: str = "maf"
     instructions: str | None = None
-    model: str | None = None
+    llm_config: LLMConfigSchema | None = None
     compaction: CompactionConfigSchema = CompactionConfigSchema()
     tools: str | list[str] | dict[str, dict[str, Any]] = "standard"
 
@@ -42,7 +49,7 @@ class AgentCreate(BaseModel):
         return {
             "framework": self.framework,
             "instructions": self.instructions,
-            "model": self.model,
+            "llm_config": self.llm_config.model_dump() if self.llm_config else None,
             "compaction": self.compaction.model_dump(),
             "tools": self.tools,
         }
@@ -55,7 +62,7 @@ class AgentUpdate(BaseModel):
     description: str | None = None
     framework: str | None = None
     instructions: str | None = None
-    model: str | None = None
+    llm_config: LLMConfigSchema | None = None
     compaction: CompactionConfigSchema | None = None
     tools: str | list[str] | dict[str, dict[str, Any]] | None = None
     is_public: bool | None = None
````