Spaces:

build-small-hackathon
/

workbench

Running on Zero

App Files Files Community

workbench / docs /ARCHITECTURE.md

GitHub Actions

Initial ZeroGPU deployment with spaces shim

7f9dfed 15 days ago

preview code

Raw

History Blame Contribute Delete

21.7 kB

A newer version of the Gradio SDK is available: 6.19.0

Upgrade

Architecture

The project is intentionally small at first. The PRD describes a large workbench; this repo starts with the smallest version that can grow into it.

High-Level Flow

app.py
  loads config/models.yaml
  configures lightweight logging
  builds Gradio tabs
  passes model catalog to UI modules

ui/*
  defines each Gradio tab
  calls service classes
  emits local app events for inference, datasets, and field notes
  uses shared progress settings for callback loading indicators

agent/*
  holds deterministic local agent planning and trace export helpers

models/*
  holds model catalog, local backend config, and inference services

datasets/*
  stores dataset, synthetic data, and correction-loop helpers

mcp_tools/*
  holds local tool functions, VINDEX call planning, and Gradio-native MCP bridge metadata

config/*
  holds model and training settings

training/*
  holds non-executing training, LoRA request, evaluation, and export planning helpers

tracking/*
  holds local JSONL tracing and optional Trackio integration

deployment/*
  holds Hugging Face Space deployment planning and validation helpers

plant/*
  holds the first reference domain app built from the template
  can run standalone with python -m plant.app --no-model
  keeps heavy model dependencies optional

core/*
  shared app state, event, logging, and registry helpers

Files And Classes

`app.py`

Builds and launches the Gradio app.

build_app() creates the Gradio Blocks app.
Loads the model catalog from config/models.yaml.
Registers the current UI tabs.
APP_CSS defines compact responsive layout rules for app width, mobile padding, scrollable tabs, and button touch targets.

`plant/app.py`

Standalone Plant Discovery reference app built from the template.

build_app(no_model=True) creates a Gradio app without loading model weights.
Loads plant/models.yaml.
Builds a local species index.
Reuses datasets.field_notes.FieldNoteStore for corrections.
Uses DemoPlantVisionService for screenshots/tests or PlantVisionService for OpenBMB MiniCPM-V zero-shot and fine-tuned adapter inference.

`plant/plant_service.py`

Domain service and schema for Plant Discovery.

PlantID is the structured output schema.
DemoPlantVisionService provides deterministic no-model results.
PlantVisionService lazy-loads optional MiniCPM-V dependencies only during identification.
PlantVisionService.from_config(..., "plant_vlm_finetuned") can load a PEFT adapter after a real adapter repo is configured.
extract_json_object() and parse_plant_response() make model JSON output testable.

`plant/training.py`

Non-executing training planner for Plant Discovery.

build_plant_training_plan() returns SWIFT and LLaMA-Factory command previews.
plant_training_dependency_report() reports optional training dependency availability.
write_llamafactory_dataset_info() writes a dataset-info preview for LLaMA-Factory workflows.
Training is never started by the Gradio UI or script.

`plant/plant_loader.py`

Domain data and export helpers for Plant Discovery.

PlantRecord normalizes plant examples into training rows.
LocalFolderLoader maps species folders to image metadata.
SpeciesIndexBuilder builds a no-network species index with demo fallback.
FieldNotesPlantExporter exports corrected field notes to plant training JSONL.

`plant/plant_tab.py`

Focused Gradio UI for Plant Discovery.

Identify tab uploads images and renders a safe escaped result card.
Field Guide tab searches the species index.
Corrections tab saves and exports training-ready corrections.
Stats tab summarizes species and correction counts.
Training is represented as a non-executing plan, not a subprocess.

`plant/plant_tools.py`

Optional local/MCP tools for Plant Discovery.

Pure functions can be tested without an MCP server.
build_mcp_server() imports mcp only when explicitly requested.
Tools expose identify, species search, correction save/export, stats, and training plan.

`models/model_catalog.py`

Reads model configuration and turns it into typed Python objects.

ModelInfo describes one configured model.
load_model_catalog(path) reads YAML and returns all configured models.
model_choices(catalog, model_type) filters models for a UI dropdown.
model_summary(model) returns display metadata for the Gradio JSON panel.
backend_capabilities maps each model to supported local backend capabilities.

`models/placeholder_service.py`

Deterministic placeholder model service used before real inference is wired.

PlaceholderModelService.chat() returns a deterministic text response.
PlaceholderModelService.vision_chat() returns a deterministic image/prompt response.

This file should be replaced or complemented by real services such as:

ollama_service.py
llama_cpp_service.py
openai_compatible_service.py
sglang_runner.py
minicpm_vision.py
transformers_text.py
sglang_service.py

`models/base.py`

Defines service contracts and backend status records.

BackendStatus describes whether a backend is available.
TextModelService is the text chat protocol.
VisionModelService is the vision chat protocol.

`models/ollama_service.py`

Ollama-backed local inference client.

Checks whether ollama is installed and reachable.
Sends text and vision chat requests to http://127.0.0.1:11434/api/chat.
Lists locally available Ollama models through /api/tags.
Builds explicit ollama pull <model> commands for the Status tab.
Does not pull or download models automatically.

`models/llama_cpp_service.py`

llama.cpp HTTP client for local GGUF inference.

Checks whether llama-server is installed and reachable.
Builds explicit llama-server -m <model.gguf> commands.
Supports --mmproj <mmproj.gguf> command metadata for multimodal models.
Sends text chat requests to /v1/chat/completions.
Does not download GGUF files or start background servers automatically.

`models/local_backend_config.py`

User-local backend settings stored under ignored data/local_backends.yaml.

LocalBackendConfig stores llama.cpp server URL, OpenAI-compatible base URL, optional served model name, GGUF path, mmproj path, context length, and GPU layers.
save_local_backend_config() writes local-only settings without touching tracked model config.
build_llama_server_command() returns the explicit command the user can run.
local_backend_summary() reports file status and confirms no startup downloads or automatic model loads.

`models/openai_compatible_service.py`

Local OpenAI-compatible chat client for LM Studio, vLLM-style servers, or similar local endpoints.

Checks /v1/models for reachability.
Sends text chat requests to /v1/chat/completions.
Supports an optional served-model-name override for tools such as LM Studio.
Returns visible unavailable/request-failed messages instead of crashing the Gradio callback.
Does not call cloud APIs or download model weights.

`models/llama_cpp_python_service.py`

Optional direct Python binding backend for GGUF inference.

Checks whether llama_cpp is importable.
Requires an explicit local GGUF path.
Does not download model files.
Provides text chat through Llama.create_chat_completion().
Vision support remains routed through llama-server until mmproj/image serialization is wired.

`models/minicpm_vision.py`

Optional MiniCPM vision backend.

Checks whether the transformers package is available.
Lazy-loads AutoProcessor and AutoModelForImageTextToText only when selected.
Formats image/text messages for image-text-to-text generation.
Maps thinking mode into the prompt template.
Provides a video support plan for future local frame sampling.

`models/sglang_runner.py`

SGLang local server planner and OpenAI-compatible chat client.

Builds an explicit python -m sglang.launch_server command.
Includes MiniCPM tool parser configuration.
Checks /health, sends chat requests to /v1/chat/completions, and can request /shutdown.
Does not install SGLang, start a process, download model weights, or load a model on app startup.

`models/vllm_runner.py`

vLLM local server planner and OpenAI-compatible chat client.

Builds explicit vllm serve <model> command plans.
Checks /health, parses Prometheus-style /metrics, and sends chat requests to /v1/chat/completions.
Logs parsed benchmark metrics through TrackingClient.
Does not install vLLM, start a process, download model weights, or load a model on app startup.

`models/transformers_text.py`

Optional Transformers text backend.

Checks whether the transformers package is installed.
Lazy-loads AutoTokenizer and AutoModelForCausalLM only when the backend is selected.
Reads trust_remote_code, device map, dtype, max token, and temperature settings from explicit config.
Provides a simple token-list streaming helper for future Gradio streaming wiring.
Does not download model weights on startup.

`models/service_factory.py`

Creates the selected backend service for the UI.

TEXT_SERVICE_REGISTRY registers available text backend factories.
VISION_SERVICE_REGISTRY registers available vision backend factories.
create_text_service() chooses placeholder, llama.cpp, llama-cpp-python, Ollama, OpenAI-compatible, SGLang, or Transformers text service.
create_vision_service() chooses placeholder, llama.cpp, llama-cpp-python, Ollama, or Transformers MiniCPM vision service.
backend_statuses() reports current backend availability.
llama.cpp, llama-cpp-python, and OpenAI-compatible services read ignored local backend settings when selected.

`ui/chat_tab.py`

Builds the text chat tab.

Shows text models from the catalog.
Displays selected model metadata.
Calls the selected backend service.
Emits inference request and response events.

`ui/vision_tab.py`

Builds the vision tab.

Shows vision models from the catalog.
Accepts an image and prompt.
Calls the selected backend service.
Emits inference request and response events.

`ui/dataset_tab.py`

Local dataset preview surface.

Previews local CSV, JSONL, and NDJSON files.
Previews Hugging Face datasets when the optional external datasets package is installed.
Shows source, row count, columns, and sample rows.
Calculates basic local dataset statistics.
Emits dataset loaded events.

Future behavior:

Serve dataset tools through the selected MCP path.

`ui/train_tab.py`

Training planning and local evaluation surface.

Builds a LoRA dry-run training plan without launching training.
Builds a non-executing LoRA trainer request with dependency status.
Shows SWIFT/LLaMA-Factory vision fine-tuning plan.
Shows checkpoint output path, validation status, and hardware notes.
Runs local base-vs-tuned evaluation from newline-separated response text.
Shows exact-match summary and a qualitative eval table.
Logs tuned evaluation reports to data/eval_results.jsonl.

Future behavior:

Start LoRA training.
Show loss and metrics.
Write Trackio traces.

`ui/vllm_tab.py`

vLLM local serving planner.

Builds explicit vllm serve command plans.
Checks local vLLM /health.
Fetches and parses /metrics.
Logs vLLM benchmark metrics through local JSONL/Trackio fallback tracking.
Does not install vLLM, start a process, download models, or load weights on startup.

`ui/export_tab.py`

GGUF export planning surface.

Selects a configured model and quantization.
Shows official GGUF download command plans when the model has GGUF metadata.
Shows local HF-to-GGUF conversion and llama.cpp quantization command plans.
Lists files already present under the selected export directory.
Exposes existing exported files through a Gradio download output.
Does not execute downloads, conversion, or quantization.

Future behavior:

Execute downloads and conversions after explicit user action.

`ui/notes_tab.py`

Field notes implementation.

Saves prompt, model response, correction, and tags to data/field_notes.csv.
Captures optional image path, video path, and a use-for-training flag.
Exports corrected notes to JSONL.
Exports local Hugging Face Dataset-style files under data/hf_field_notes.
Imports uncertain OCR predictions for human correction.
Exports corrected OCR rows to JSONL.
Emits field note saved events.

Future behavior:

Push corrected notes to a remote Hugging Face Dataset after login.
Feed notes into fine-tuning.

`ui/traces_tab.py`

Local trace and tracking preview.

Shows manual trace event previews.
Shows recent local app events.
Shows JSONL trace rows and tracking status.
Exports local traces to exports/traces.jsonl.
Calls Trackio only when the optional package is installed and enabled.

`ui/agent_tab.py`

Local non-autonomous agent mode.

Drafts a research-plan-implement-verify trace.
Saves agent traces to data/agent_traces.jsonl.
Exports trace JSONL and local HF Dataset-style trace files.
Does not execute shell commands, commit, push, deploy, download models, or call external services.

`ui/status_tab.py`

Shows configured models and backend metadata.

Helps verify model-size compliance and backend status.
Provides local llama.cpp settings, GGUF/mmproj file pickers, and command generation.
Provides LM Studio/OpenAI-compatible base URL, optional model-name storage, and reachability check.
Provides SGLang command planning, health check, and shutdown request controls.

`datasets/field_notes.py`

Field note data model and CSV store.

FieldNote captures prompt, response, correction, tags, and timestamp.
FieldNote also captures optional image/video paths and a training inclusion flag.
FieldNoteStore.save() persists notes to data/field_notes.csv.
FieldNoteStore.list_notes() filters by correction, tag, and training inclusion.
FieldNoteStore.export_jsonl() writes training-ready JSONL.
FieldNoteStore.export_hf_dataset() writes local HF Dataset-style files.
SQLiteFieldNoteStore stores and lists notes in SQLite for larger correction loops.

`datasets/loader.py`

Dataset preview and statistics helpers.

preview_local_dataset() previews CSV, JSONL, and NDJSON files.
dataset_statistics() reports row count, column count, names, and non-empty counts.
preview_huggingface_dataset() optionally uses the external Hugging Face datasets package.

`datasets/synthetic.py`

Deterministic local synthetic data helpers.

generate_synthetic_examples() creates local prompt/response/correction examples.
validate_synthetic_example() checks schema requirements.
quality_filter_examples() removes incomplete or low-value examples.
augment_examples() creates deterministic variants for workflow testing.
export_synthetic_jsonl() writes JSONL without external services.

`datasets/ocr.py`

Local OCR correction helpers.

OCRPrediction stores source path, predicted text, confidence, and optional page.
load_ocr_predictions() loads local .csv, .jsonl, and .ndjson prediction files.
uncertain_predictions() filters rows at or below a confidence threshold or with empty text.
import_uncertain_predictions() creates Field Notes correction tasks for uncertain rows.
export_corrected_ocr_notes() writes corrected OCR examples to JSONL for evaluation or training.
ocr_import_summary() previews uncertain rows for the Field Notes tab.

`mcp_tools/tools.py`

Local MCP-style tools.

dataset_stats_tool() returns local dataset statistics.
hf_dataset_preview_tool() previews Hugging Face datasets when optional dependencies exist.
safe_calculator_tool() evaluates numeric arithmetic only.
model_inference_tool() routes text prompts through the selected model service.
tool_registry() returns the local tool map for a future MCP endpoint.

`mcp_tools/vindex_tool.py`

Non-executing VINDEX integration boundary.

Defines the eight VINDEX PRD methods and their local FastAPI paths.
build_vindex_call_plan() validates method names and builds endpoint/payload plans.
Caps star_spread.n_neighbors at 5 and calibrated_edit.causal_window at 3 based on the PRD safety notes.
vindex_dependency_report() checks whether the optional vindex package or local health endpoint is available.
vindex_verification_report() combines dependency status with a safe call plan and keeps execution disabled until the local VINDEX install is verified.

`mcp_tools/bridge.py`

Gradio-native MCP bridge metadata and local invocation helper.

MCP_PATH documents /gradio_api/mcp/sse.
mcp_manifest() returns the selected mode, path, and tool definitions.
invoke_mcp_tool() verifies local tool invocation by name.

`agent/runner.py`

Deterministic local agent trace runner.

AGENT_SYSTEM_PROMPT defines the agent behavior contract.
run_agent_loop() produces research, plan, implement, and verify trace steps.
run_paper_to_code_loop() produces paper-to-code research, plan, implement, and verify trace steps.
default_safety_gates() lists the non-autonomous safety requirements.
save_agent_trace() appends traces to JSONL.
export_agent_traces() exports trace JSONL.
export_agent_traces_hf_dataset() writes local HF Dataset-style trace files.
The runner can call safe local tools, but it is not autonomous.

`core/file_exports.py`

Shared export helper.

copy_text_file_or_empty() copies a text artifact to an export path or creates an empty one.

`training/export.py`

Non-executing GGUF export planning.

detect_llama_cpp_tools() checks llama-server, llama-cli, and llama-quantize.
build_export_plan() creates explicit download, conversion, and quantization command plans.
list_exported_files() lists generated/local export files.
ExportPlan.as_dict() marks that commands are not executed and no startup downloads occur.

`training/evaluation.py`

Local deterministic evaluation helpers.

default_prompt_cases() returns a small built-in prompt test set.
load_prompt_cases() loads prompt/expected pairs from JSONL.
evaluate_responses() computes exact-match rows and a qualitative table.
perplexity_from_losses() computes perplexity from explicit negative log likelihood values.
compare_base_vs_tuned() reports exact-match delta.
log_eval_report() appends JSONL evaluation results.

`training/lora_trainer.py`

Non-executing LoRA trainer request builder.

lora_dependency_report() reports PEFT, TRL, Transformers, and Torch availability.
build_lora_training_request() combines the training plan with dependency status and a command preview.
vision_finetuning_plan() documents SWIFT/LLaMA-Factory as the future MiniCPM-V fine-tuning path.
Keeps execute_training false until dependencies, hardware, and dataset schema are approved.

`training/reward_eval.py`

Deterministic local reward-style evaluation helpers.

RewardEvaluator.evaluate() scores supplied responses with transparent lexical heuristics.
best_of_n() selects the highest-scoring candidate without model calls.
create_dpo_pairs() creates chosen/rejected pairs for DPO-style datasets.
eval_lora_vs_base() compares base and LoRA response rewards.

`training/planner.py`

Non-executing LoRA training planner.

load_training_config() reads LoRA and training settings from config/training.yaml.
build_training_plan() creates a dry-run plan with checkpoint output path.
validate_training_plan() checks dataset existence and numeric training settings.
training_hardware_notes() documents practical local hardware expectations.

`tracking/trackio_client.py`

Tracking client with JSONL fallback.

load_tracking_config() reads Trackio settings from config/training.yaml.
TrackingClient.init() starts Trackio only when enabled and installed.
TrackingClient.log() always writes local JSONL and optionally forwards to Trackio.
TrackingClient.finish() closes optional Trackio state.
export_traces() copies local traces to exports/traces.jsonl.
read_trace_rows() returns recent local trace rows for the UI.

`core/events.py`

Small event bus reserved for future cross-module events.

EventType names app events.
UI_ERROR records visible tab-level failures.
Event carries event data.
EventBus registers handlers and emits events.

`core/app_state.py`

Shared local app state.

AppState.emit() records events, logs them, and dispatches them through EventBus.
AppState.emit() also writes trace events through TrackingClient.
AppState.recent_events() returns local trace previews for the Traces tab.
emit_inference_response() records shared response metadata.

`core/tab_feedback.py`

Formats tab status text and emits ui_error events for visible tab-level failures.

`ui/progress.py`

Defines the shared Gradio progress mode used by tab button callbacks.

`core/app_logging.py`

Lightweight logging setup.

configure_app_logging() configures compact process logging once.

`core/registry.py`

Generic registry helper.

Registry.register(name, item) stores a service.
Registry.get(name) retrieves a service.
Registry.list() lists registered services.

Current Design Rule

The app must not download model weights on startup. Model loading should happen only after the user chooses a backend/model and clicks an explicit action.