Spaces:
Sleeping
Sleeping
File size: 3,639 Bytes
62c5bbf 2930dae 62c5bbf 2930dae 62c5bbf 2930dae 62c5bbf 2930dae 62c5bbf 2930dae 62c5bbf 2930dae 62c5bbf 2930dae 62c5bbf 2930dae 62c5bbf 2930dae 62c5bbf 2930dae 62c5bbf 2930dae 62c5bbf 2930dae 62c5bbf 2930dae 62c5bbf 2930dae 62c5bbf 2930dae 62c5bbf 2930dae 62c5bbf 2930dae 62c5bbf 2930dae 62c5bbf 2930dae 62c5bbf 2930dae 62c5bbf 2930dae 62c5bbf 2930dae 62c5bbf 2930dae 62c5bbf 2930dae 62c5bbf 2930dae 62c5bbf 2930dae 62c5bbf 2930dae 62c5bbf 2930dae 62c5bbf 2930dae 62c5bbf | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 | # setup.md - SupportEnv Validator-Focused Runbook
## 1. What judges/validator execute
Most checks align to this flow:
1. `POST /reset` on the deployed Space
2. `docker build` from repo root
3. `openenv validate`
4. endpoint contract checks for `/health`, `/reset`, `/step`, `/state`, `/grader`
5. `python inference.py` and stdout format check for `[START]`, `[STEP]`, `[END]`
## 2. File-by-file usage (root)
- `app.py`: FastAPI API surface (`/reset`, `/step`, `/state`, `/tasks`, `/grader`, `/health`)
- `environment.py`: episode lifecycle and reward accumulation (`reset`, `step`, `get_state`, `grade`)
- `graders.py`: deterministic terminal scoring per task with score clamped to `[0.0, 1.0]`
- `data.py`: task metadata and ticket datasets with ground truth labels/entities/steps
- `models.py`: typed Pydantic models used by API and internal state
- `inference.py`: baseline runner; calls the API, logs strict `[START]/[STEP]/[END]`
- `openenv.yaml`: OpenEnv metadata and interface declaration used by validator
- `Dockerfile`: image build/runtime contract for HF Docker Spaces (serves on `7860`)
- `requirements.txt`: runtime dependencies
- `pyproject.toml`: packaging metadata + script entrypoint expected by validator tooling
- `uv.lock`: lockfile required by OpenEnv multi-mode validation path
- `server/app.py`: validator-friendly script entrypoint (`server = server.app:main`)
## 3. Local setup
### Windows PowerShell
```powershell
python -m venv .venv
.venv\Scripts\Activate.ps1
pip install -r requirements.txt
```
### macOS/Linux
```bash
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```
## 4. Validation checklist (exact order)
1. OpenEnv validator
```bash
.venv/Scripts/openenv.exe validate
```
2. Docker build
```bash
docker build -t supportenv .
```
3. Run server locally
```bash
uvicorn app:app --host 0.0.0.0 --port 7860
```
4. API checks
```bash
curl http://127.0.0.1:7860/health
curl -X POST http://127.0.0.1:7860/reset -H "Content-Type: application/json" -d '{"task_id":"task1","ticket_index":0}'
curl -X POST http://127.0.0.1:7860/step -H "Content-Type: application/json" -d '{"episode_id":"<id>","action":{"action_type":"classify","category":"billing","priority":"high"}}'
curl -X POST http://127.0.0.1:7860/state?episode_id=<id>
curl -X POST http://127.0.0.1:7860/grader -H "Content-Type: application/json" -d '{"episode_id":"<id>"}'
```
5. Baseline inference
```bash
python inference.py
```
## 5. Docker and Spaces runtime model
- Build stage installs from `requirements.txt`.
- Runtime command runs Uvicorn: `app:app` on `0.0.0.0:7860`.
- HF Space should set `sdk: docker` and `app_port: 7860` in `README.md` frontmatter.
- Healthcheck points at `/health` to indicate container liveness.
- If Docker daemon is not running locally, `docker build`/`docker run` will fail even if repo is correct.
## 6. Inference variables
- Required for LLM call path:
- `API_BASE_URL`
- `MODEL_NAME`
- `HF_TOKEN`
- Environment endpoint:
- `OPENENV_BASE_URL` (preferred)
- `API_BASE_URL_ENV` (legacy alias)
## 7. Example scorer sanity checks
- Task 1: submit `classify` then `submit`, verify non-binary reward and final score in `[0, 1]`
- Task 2: include deterministic entity/action coverage keys from ticket text
- Task 3: include professional response plus ordered resolution steps
## 8. Common failure causes
- Missing `pyproject.toml` or `uv.lock`
- Missing script entrypoint (`server = server.app:main`)
- App not serving on `0.0.0.0:7860`
- Duplicate HF variable/secret names in Space settings
- Invalid or missing `HF_TOKEN` for real LLM inference |