Spaces:
Running on Zero
A newer version of the Gradio SDK is available: 6.19.0
Template How-To: Build A New Domain App
This repository is a local-first Gradio AI app template. The base workbench provides shared patterns for model configuration, field notes, tracking, export planning, tests, docs, and deployment. A domain app is a focused product built around those patterns.
Use plant/ as the first reference domain app.
Core Principle
Do not start by training a model. Start by shipping a useful zero-shot or demo-mode workflow:
domain idea
-> user story
-> schema
-> model choice
-> focused UI
-> correction loop
-> export data
-> optional fine-tune
-> deploy and document
Training is a later optimization after you have corrected examples and a reason to tune.
Recommended Branch Flow
Keep
mainas the reusable template.Create a branch for each app:
git checkout -b plant-discovery-appBuild the app under a domain folder such as
plant/,invoice/,recipe/, orfield_notes/.Keep domain-specific heavy requirements in
<domain>/requirements.txt.Merge reusable improvements back into
mainonly after they are generic.
Domain App File Contract
Each generated app should have these files:
<domain>/
__init__.py
app.py # standalone Gradio entrypoint
models.yaml # domain config, model IDs, data sources, training defaults
<domain>_service.py # optional real model adapter plus demo/no-model fallback
<domain>_loader.py # data loading, schema normalization, export rows
<domain>_tab.py # focused Gradio UI
<domain>_tools.py # optional MCP/local tools with no hard optional imports
requirements.txt # optional heavy dependencies for this app only
Add tests under:
tests/unit/test_<domain>_reference_app.py
Add docs under:
docs/<DOMAIN>_APP_PLAN.md
Step-By-Step Build Process
1. Define The Product
- Pick one user.
- Pick one job they need done.
- Write one sentence: "This app helps X do Y without Z."
- Choose one golden path that works in under two minutes.
- Decide whether the app is a standalone product or a tab inside the workbench.
- Decide whether it must run on a public Hugging Face Space.
Example:
Plant Discovery helps gardeners identify a plant from a photo, correct mistakes, and export local training examples without sending private field notes to a cloud API.
2. Define The Domain Schema
- Create a dataclass for the structured output.
- Include confidence and model metadata.
- Include a
to_dict()method for Gradio JSON. - Add a robust parser for model responses.
- Add tests for valid JSON, fenced JSON, trailing commas, and unparseable text.
Plant example: PlantID in plant/plant_service.py.
3. Pick The Model
- Pick a small model at or below 32B parameters.
- Document the exact model ID.
- Add model metadata to
<domain>/models.yaml. - Avoid loading weights on startup.
- Add a deterministic demo/no-model service for screenshots and tests.
- Add an unavailable-path response when optional packages are missing.
- Add explicit runtime modes such as
demo,base-model, andfinetuned. - Do not claim a fine-tuned model until a real adapter/checkpoint is configured and verified.
For vision apps, start with a VLM such as MiniCPM-V. For text apps, start with a small instruct model through LM Studio, Ollama, llama.cpp, or Transformers.
4. Build The Focused UI
- Make the first screen the golden path, not a generic dashboard.
- Add only the controls needed for the user story.
- Keep advanced setup behind a secondary tab or accordion.
- Add visible status messages.
- Add structured JSON output for debugging and reproducibility.
- Add correction capture if model output can be wrong.
- Add screenshots through Playwright after the UI is stable.
5. Add The Correction Loop
- Save user corrections locally.
- Reuse
datasets.field_notes.FieldNoteStorewhere possible. - Mark training-ready rows explicitly.
- Export JSONL without starting training.
- Add tests for save, filter, and export.
6. Add Data Loaders
- Support a small local demo dataset.
- Support domain data from local folders or CSV/JSONL.
- Keep Hugging Face dataset loading optional and explicit.
- Do not download large datasets on startup.
- Normalize every source into one training row schema.
- Add loader tests with temporary local files.
7. Add Optional Tools
- Keep MCP/tool imports optional.
- Tool functions should work locally without starting a server.
- Add
build_mcp_server()only ifmcpis installed. - Avoid direct shell execution from tools.
- Return command plans rather than running commands.
- Add tests for pure tool functions.
8. Add Training Plans
- Start with a non-executing training plan.
- Include required dependencies, hardware notes, and command preview.
- Require enough corrected examples before recommending training.
- Keep real training as a separate local command or approved action.
- Add evaluation before/after tuning.
- Add a small script that prints the training plan as JSON.
9. Add Security Guardrails
- Escape model text rendered as HTML.
- Restrict file paths in public Space mode.
- Disable arbitrary backend URL checks in public Space mode.
- Do not execute subprocesses from Gradio callbacks.
- Keep tokens, private data, model weights, and exports out of git.
- Add tests for path traversal and malformed inputs when public deployment is planned.
10. Verify The App
Minimum local verification:
.venv\Scripts\python.exe -m pytest tests/unit/test_<domain>_reference_app.py -q
.venv\Scripts\ruff.exe check <domain> tests/unit/test_<domain>_reference_app.py --no-cache
.venv\Scripts\python.exe -m mypy <domain> tests/unit/test_<domain>_reference_app.py --cache-dir "$env:TEMP\openbmb-workbench-mypy-cache"
.venv\Scripts\python.exe -c "from <domain>.app import build_app; app=build_app(no_model=True); print(type(app).__name__)"
Before claiming it works:
- Run the standalone app.
- Generate screenshots.
- Add screenshot links to README/docs.
- Run full quality checks.
- Commit and push.
When To Integrate Into The Main Workbench
Keep the domain app standalone if:
- it has its own brand/story,
- it needs a focused judging experience,
- it has domain-specific dependencies,
- it should become a Hugging Face Space.
Add it to the main workbench only if:
- it is a generic reusable tab,
- it does not add heavy dependencies,
- it strengthens the template for all future apps.
For the hackathon, standalone plant/ is the better route because judges need one clear product.
What "Done" Means For A Domain App
- Standalone no-model app builds.
- Optional real model adapter is documented and lazy-loaded.
- Golden path has tests.
- Corrections export to training data.
- Training is planned, not accidentally executed.
- Screenshots are generated.
- README explains setup, model choice, demo flow, and limitations.
- Space deployment is verified or blocker is documented.