Template How-To: Build A New Domain App

This repository is a local-first Gradio AI app template. The base workbench provides shared patterns for model configuration, field notes, tracking, export planning, tests, docs, and deployment. A domain app is a focused product built around those patterns.

Use plant/ as the first reference domain app.

Core Principle

Do not start by training a model. Start by shipping a useful zero-shot or demo-mode workflow:

domain idea
  -> user story
  -> schema
  -> model choice
  -> focused UI
  -> correction loop
  -> export data
  -> optional fine-tune
  -> deploy and document

Training is a later optimization after you have corrected examples and a reason to tune.

Recommended Branch Flow

  1. Keep main as the reusable template.

  2. Create a branch for each app:

    git checkout -b plant-discovery-app
    
  3. Build the app under a domain folder such as plant/, invoice/, recipe/, or field_notes/.

  4. Keep domain-specific heavy requirements in <domain>/requirements.txt.

  5. Merge reusable improvements back into main only after they are generic.

Domain App File Contract

Each generated app should have these files:

<domain>/
  __init__.py
  app.py              # standalone Gradio entrypoint
  models.yaml         # domain config, model IDs, data sources, training defaults
  <domain>_service.py # optional real model adapter plus demo/no-model fallback
  <domain>_loader.py  # data loading, schema normalization, export rows
  <domain>_tab.py     # focused Gradio UI
  <domain>_tools.py   # optional MCP/local tools with no hard optional imports
  requirements.txt    # optional heavy dependencies for this app only

Add tests under:

tests/unit/test_<domain>_reference_app.py

Add docs under:

docs/<DOMAIN>_APP_PLAN.md

Step-By-Step Build Process

1. Define The Product

  • Pick one user.
  • Pick one job they need done.
  • Write one sentence: "This app helps X do Y without Z."
  • Choose one golden path that works in under two minutes.
  • Decide whether the app is a standalone product or a tab inside the workbench.
  • Decide whether it must run on a public Hugging Face Space.

Example:

Plant Discovery helps gardeners identify a plant from a photo, correct mistakes, and export local training examples without sending private field notes to a cloud API.

2. Define The Domain Schema

  • Create a dataclass for the structured output.
  • Include confidence and model metadata.
  • Include a to_dict() method for Gradio JSON.
  • Add a robust parser for model responses.
  • Add tests for valid JSON, fenced JSON, trailing commas, and unparseable text.

Plant example: PlantID in plant/plant_service.py.
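As a rough illustration of the schema checklist above, the sketch below pairs a dataclass with a tolerant parser. The names `PlantGuess` and `parse_guess` and the field set are assumptions made for this example; the real `PlantID` in plant/plant_service.py may look different.

```python
import json
import re
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class PlantGuess:
    """Hypothetical schema; the real PlantID fields may differ."""

    common_name: str
    confidence: float
    model_id: str = "demo"

    def to_dict(self) -> dict:
        # Plain dict that a Gradio JSON component can display.
        return asdict(self)


# Matches a fenced block: three backticks, an optional "json" tag, then the body.
_FENCE = re.compile(r"`{3}(?:json)?\s*(.*?)`{3}", re.DOTALL)
# Naive trailing-comma repair; it can also touch commas inside string values.
_TRAILING_COMMA = re.compile(r",\s*([}\]])")


def parse_guess(text: str, model_id: str = "demo") -> Optional[PlantGuess]:
    """Return a PlantGuess, or None when no usable JSON object is found."""
    fenced = _FENCE.search(text)
    candidate = fenced.group(1) if fenced else text
    start, end = candidate.find("{"), candidate.rfind("}")
    if start == -1 or end <= start:
        return None
    cleaned = _TRAILING_COMMA.sub(r"\1", candidate[start : end + 1])
    try:
        data = json.loads(cleaned)
        return PlantGuess(
            common_name=str(data["common_name"]),
            # Clamp so a sloppy model response cannot report confidence > 1.
            confidence=max(0.0, min(1.0, float(data["confidence"]))),
            model_id=model_id,
        )
    except (ValueError, KeyError, TypeError):
        return None
```

Returning `None` for unparseable text keeps the failure explicit, so the UI can show a status message instead of a half-filled result.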

3. Pick The Model

  • Pick a small model at or below 32B parameters.
  • Document the exact model ID.
  • Add model metadata to <domain>/models.yaml.
  • Avoid loading weights on startup.
  • Add a deterministic demo/no-model service for screenshots and tests.
  • Return a clear "unavailable" response when optional packages are missing, instead of raising at import time.
  • Add explicit runtime modes such as demo, base-model, and finetuned.
  • Do not claim a fine-tuned model until a real adapter/checkpoint is configured and verified.

For vision apps, start with a VLM such as MiniCPM-V. For text apps, start with a small instruct model through LM Studio, Ollama, llama.cpp, or Transformers.
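One possible shape for the lazy-loading and runtime-mode items in this step is sketched below. `LazyService`, its `loader` argument, and the response keys are hypothetical names chosen for illustration, not the template's actual service API.

```python
from typing import Any, Callable, Dict, Optional

RUNTIME_MODES = ("demo", "base-model", "finetuned")


class LazyService:
    """Hypothetical service that loads nothing until the first real prediction."""

    def __init__(self, mode: str = "demo", loader: Optional[Callable[[], Any]] = None):
        if mode not in RUNTIME_MODES:
            raise ValueError(f"unknown mode: {mode!r}")
        self.mode = mode
        self._loader = loader  # assumed to import heavy packages and return a callable
        self._model: Any = None

    def _unavailable(self, detail: str) -> Dict[str, Any]:
        return {"status": "unavailable", "mode": self.mode, "answer": None, "detail": detail}

    def predict(self, prompt: str) -> Dict[str, Any]:
        if self.mode == "demo":
            # Deterministic output for screenshots and tests.
            return {"status": "ok", "mode": "demo", "answer": f"demo answer for: {prompt}"}
        if self._model is None:
            if self._loader is None:
                return self._unavailable("no loader configured")
            try:
                # Heavy imports and weight loading happen here, not at startup.
                self._model = self._loader()
            except ImportError as exc:
                return self._unavailable(f"optional package missing: {exc}")
        return {"status": "ok", "mode": self.mode, "answer": self._model(prompt)}
```

Because the loader is injected, tests can pass a cheap stand-in and never touch real weights.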

4. Build The Focused UI

  • Make the first screen the golden path, not a generic dashboard.
  • Add only the controls needed for the user story.
  • Keep advanced setup behind a secondary tab or accordion.
  • Add visible status messages.
  • Add structured JSON output for debugging and reproducibility.
  • Add correction capture if model output can be wrong.
  • Add screenshots through Playwright after the UI is stable.

5. Add The Correction Loop

  • Save user corrections locally.
  • Reuse datasets.field_notes.FieldNoteStore where possible.
  • Mark training-ready rows explicitly.
  • Export JSONL without starting training.
  • Add tests for save, filter, and export.
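The correction loop can be prototyped with an append-only JSONL file, as in the sketch below. This is a standalone illustration and does not use `datasets.field_notes.FieldNoteStore`, whose interface is not shown here; the `Correction` fields and function names are assumptions.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import List


@dataclass
class Correction:
    """Hypothetical correction record."""

    input_ref: str
    model_answer: str
    corrected_answer: str
    training_ready: bool = False


def save_correction(path: Path, item: Correction) -> None:
    # Append one JSON object per line; the file stays on the local disk.
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(asdict(item), ensure_ascii=False) + "\n")


def load_corrections(path: Path) -> List[Correction]:
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as fh:
        return [Correction(**json.loads(line)) for line in fh if line.strip()]


def export_training_rows(src: Path, dst: Path) -> int:
    """Write only rows marked training-ready; returns the number exported."""
    rows = [c for c in load_corrections(src) if c.training_ready]
    dst.parent.mkdir(parents=True, exist_ok=True)
    with dst.open("w", encoding="utf-8") as fh:
        for c in rows:
            row = {"input": c.input_ref, "target": c.corrected_answer}
            fh.write(json.dumps(row, ensure_ascii=False) + "\n")
    return len(rows)
```

Export only writes a file. Nothing in this sketch starts a training run.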

6. Add Data Loaders

  • Support a small local demo dataset.
  • Support domain data from local folders or CSV/JSONL.
  • Keep Hugging Face dataset loading optional and explicit.
  • Do not download large datasets on startup.
  • Normalize every source into one training row schema.
  • Add loader tests with temporary local files.
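To make "normalize every source into one training row schema" concrete, here is a minimal sketch for local CSV and JSONL files. The `input`/`label`/`source` row shape and the alias table are assumptions for illustration; a real domain would define its own columns.

```python
import csv
import json
from pathlib import Path
from typing import Dict, Iterator

# Hypothetical column aliases mapped onto the shared row schema.
ALIASES = {"image": "input", "photo": "input", "text": "input", "species": "label", "name": "label"}


def normalize(raw: Dict[str, object], source: str) -> Dict[str, str]:
    """Map one raw record onto the shared {input, label, source} schema."""
    row = {"input": "", "label": "", "source": source}
    for key, value in raw.items():
        if key is None or value is None:
            continue  # csv.DictReader uses a None key for surplus cells
        name = str(key).strip().lower()
        target = ALIASES.get(name, name)
        if target in ("input", "label"):
            row[target] = str(value).strip()
    return row


def load_rows(path: Path) -> Iterator[Dict[str, str]]:
    """Yield normalized rows from a local .csv or .jsonl file."""
    suffix = path.suffix.lower()
    if suffix not in (".csv", ".jsonl"):
        raise ValueError(f"unsupported file type: {suffix}")
    with path.open(encoding="utf-8", newline="") as fh:
        if suffix == ".csv":
            for raw in csv.DictReader(fh):
                yield normalize(raw, path.name)
        else:
            for line in fh:
                if line.strip():
                    yield normalize(json.loads(line), path.name)
```

Since `load_rows` is a generator, nothing is read until a caller iterates, which fits the rule about not loading data on startup.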

7. Add Optional Tools

  • Keep MCP/tool imports optional.
  • Tool functions should work locally without starting a server.
  • Add build_mcp_server() only if mcp is installed.
  • Avoid direct shell execution from tools.
  • Return command plans rather than running commands.
  • Add tests for pure tool functions.
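A sketch of how these tool rules might fit together follows. It assumes the official `mcp` Python SDK and its `FastMCP` helper for the optional server; `plan_export_command` and the `<domain>.export` module it previews are hypothetical.

```python
import shlex
from typing import Dict, List


def plan_export_command(domain: str, out_path: str) -> Dict[str, object]:
    """Pure tool function: returns a command plan and never executes it."""
    if not domain.isidentifier():
        raise ValueError(f"invalid domain name: {domain!r}")
    # "<domain>.export" is a hypothetical module used only for the preview.
    argv: List[str] = ["python", "-m", f"{domain}.export", "--out", out_path]
    return {"argv": argv, "preview": shlex.join(argv), "executed": False}


def build_mcp_server():
    """Return an MCP server when the optional package is installed, else None."""
    try:
        # Imported lazily so the module still loads without the mcp package.
        from mcp.server.fastmcp import FastMCP
    except ImportError:
        return None
    server = FastMCP("domain-tools")
    server.tool()(plan_export_command)
    return server
```

The tool function is testable on its own, and callers should treat a `None` return from `build_mcp_server()` as "MCP not installed".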

8. Add Training Plans

  • Start with a non-executing training plan.
  • Include required dependencies, hardware notes, and command preview.
  • Require enough corrected examples before recommending training.
  • Keep real training as a separate local command or approved action.
  • Add evaluation before/after tuning.
  • Add a small script that prints the training plan as JSON.
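The "small script that prints the training plan as JSON" could look roughly like this. The threshold of 200 examples, the dependency list, and the `<domain>.train` module are placeholders invented for the example, not recommendations from this template.

```python
import json
import shlex
from typing import Dict, List


def build_training_plan(
    domain: str,
    base_model: str,
    corrected_examples: int,
    min_examples: int = 200,  # placeholder threshold, tune per domain
) -> Dict[str, object]:
    """Describe a training run without starting it."""
    # "<domain>.train" is a hypothetical module; the command is preview-only.
    argv: List[str] = ["python", "-m", f"{domain}.train", "--base-model", base_model]
    return {
        "domain": domain,
        "base_model": base_model,
        "corrected_examples": corrected_examples,
        "min_examples": min_examples,
        "recommend_training": corrected_examples >= min_examples,
        "dependencies": ["torch", "transformers", "peft"],  # illustrative list
        "hardware_notes": "fill in after measuring memory use locally",
        "command_preview": shlex.join(argv),
        "executed": False,
    }


if __name__ == "__main__":
    plan = build_training_plan("plant", "<model-id>", corrected_examples=12)
    print(json.dumps(plan, indent=2))
```

Keeping `executed` in the output makes it obvious to a reader, or to a test, that the plan was only generated.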

9. Add Security Guardrails

  • Escape model text rendered as HTML.
  • Restrict file paths in public Space mode.
  • Disable arbitrary backend URL checks in public Space mode.
  • Do not execute subprocesses from Gradio callbacks.
  • Keep tokens, private data, model weights, and exports out of git.
  • Add tests for path traversal and malformed inputs when public deployment is planned.
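Two of the guardrails above, HTML escaping and path restriction, can be sketched with the standard library alone. The helper names `safe_html` and `resolve_inside` are hypothetical, and a public deployment would likely need more checks than these.

```python
import html
from pathlib import Path


def safe_html(model_text: str) -> str:
    """Escape model output before placing it inside an HTML fragment."""
    return f"<p>{html.escape(model_text)}</p>"


def resolve_inside(base_dir: Path, user_path: str) -> Path:
    """Resolve user_path under base_dir, rejecting anything that escapes it."""
    base = base_dir.resolve()
    # Joining an absolute user_path discards base, so the check below catches it.
    candidate = (base / user_path).resolve()
    if candidate != base and base not in candidate.parents:
        raise ValueError("path escapes the allowed directory")
    return candidate
```

Resolving before comparing means `..` segments and symlinks are expanded first, so the containment check runs on the real target path.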

10. Verify The App

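Minimum local verification (the commands use Windows PowerShell paths; on macOS or Linux, use .venv/bin/python and .venv/bin/ruff, and replace $env:TEMP with a temp directory):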

.venv\Scripts\python.exe -m pytest tests/unit/test_<domain>_reference_app.py -q
.venv\Scripts\ruff.exe check <domain> tests/unit/test_<domain>_reference_app.py --no-cache
.venv\Scripts\python.exe -m mypy <domain> tests/unit/test_<domain>_reference_app.py --cache-dir "$env:TEMP\openbmb-workbench-mypy-cache"
.venv\Scripts\python.exe -c "from <domain>.app import build_app; app=build_app(no_model=True); print(type(app).__name__)"

Before claiming it works:

  • Run the standalone app.
  • Generate screenshots.
  • Add screenshot links to README/docs.
  • Run full quality checks.
  • Commit and push.

When To Integrate Into The Main Workbench

Keep the domain app standalone if:

  • it has its own brand/story,
  • it needs a focused judging experience,
  • it has domain-specific dependencies,
  • it should become a Hugging Face Space.

Add it to the main workbench only if:

  • it is a generic reusable tab,
  • it does not add heavy dependencies,
  • it strengthens the template for all future apps.

For the hackathon, standalone plant/ is the better route because judges need one clear product.

What "Done" Means For A Domain App

  • Standalone no-model app builds.
  • Optional real model adapter is documented and lazy-loaded.
  • Golden path has tests.
  • Corrections export to training data.
  • Training is planned, not accidentally executed.
  • Screenshots are generated.
  • README explains setup, model choice, demo flow, and limitations.
  • Space deployment is verified, or the blocker is documented.