workbench / docs /ARCHITECTURE.md
GitHub Actions
Initial ZeroGPU deployment with spaces shim
7f9dfed
|
Raw
History Blame Contribute Delete
21.7 kB

A newer version of the Gradio SDK is available: 6.19.0

Upgrade

Architecture

The project is intentionally small at first. The PRD describes a large workbench; this repo starts with the smallest version that can grow into it.

High-Level Flow

app.py
  loads config/models.yaml
  configures lightweight logging
  builds Gradio tabs
  passes model catalog to UI modules

ui/*
  defines each Gradio tab
  calls service classes
  emits local app events for inference, datasets, and field notes
  uses shared progress settings for callback loading indicators

agent/*
  holds deterministic local agent planning and trace export helpers

models/*
  holds model catalog, local backend config, and inference services

datasets/*
  stores dataset, synthetic data, and correction-loop helpers

mcp_tools/*
  holds local tool functions, VINDEX call planning, and Gradio-native MCP bridge metadata

config/*
  holds model and training settings

training/*
  holds non-executing training, LoRA request, evaluation, and export planning helpers

tracking/*
  holds local JSONL tracing and optional Trackio integration

deployment/*
  holds Hugging Face Space deployment planning and validation helpers

plant/*
  holds the first reference domain app built from the template
  can run standalone with python -m plant.app --no-model
  keeps heavy model dependencies optional

core/*
  shared app state, event, logging, and registry helpers

Files And Classes

app.py

Builds and launches the Gradio app.

  • build_app() creates the Gradio Blocks app.
  • Loads the model catalog from config/models.yaml.
  • Registers the current UI tabs.
  • APP_CSS defines compact responsive layout rules for app width, mobile padding, scrollable tabs, and button touch targets.

plant/app.py

Standalone Plant Discovery reference app built from the template.

  • build_app(no_model=True) creates a Gradio app without loading model weights.
  • Loads plant/models.yaml.
  • Builds a local species index.
  • Reuses datasets.field_notes.FieldNoteStore for corrections.
  • Uses DemoPlantVisionService for screenshots/tests or PlantVisionService for OpenBMB MiniCPM-V zero-shot and fine-tuned adapter inference.

plant/plant_service.py

Domain service and schema for Plant Discovery.

  • PlantID is the structured output schema.
  • DemoPlantVisionService provides deterministic no-model results.
  • PlantVisionService lazy-loads optional MiniCPM-V dependencies only during identification.
  • PlantVisionService.from_config(..., "plant_vlm_finetuned") can load a PEFT adapter after a real adapter repo is configured.
  • extract_json_object() and parse_plant_response() make model JSON output testable.

plant/training.py

Non-executing training planner for Plant Discovery.

  • build_plant_training_plan() returns SWIFT and LLaMA-Factory command previews.
  • plant_training_dependency_report() reports optional training dependency availability.
  • write_llamafactory_dataset_info() writes a dataset-info preview for LLaMA-Factory workflows.
  • Training is never started by the Gradio UI or script.

plant/plant_loader.py

Domain data and export helpers for Plant Discovery.

  • PlantRecord normalizes plant examples into training rows.
  • LocalFolderLoader maps species folders to image metadata.
  • SpeciesIndexBuilder builds a no-network species index with demo fallback.
  • FieldNotesPlantExporter exports corrected field notes to plant training JSONL.

plant/plant_tab.py

Focused Gradio UI for Plant Discovery.

  • Identify tab uploads images and renders a safe escaped result card.
  • Field Guide tab searches the species index.
  • Corrections tab saves and exports training-ready corrections.
  • Stats tab summarizes species and correction counts.
  • Training is represented as a non-executing plan, not a subprocess.

plant/plant_tools.py

Optional local/MCP tools for Plant Discovery.

  • Pure functions can be tested without an MCP server.
  • build_mcp_server() imports mcp only when explicitly requested.
  • Tools expose identify, species search, correction save/export, stats, and training plan.

models/model_catalog.py

Reads model configuration and turns it into typed Python objects.

  • ModelInfo describes one configured model.
  • load_model_catalog(path) reads YAML and returns all configured models.
  • model_choices(catalog, model_type) filters models for a UI dropdown.
  • model_summary(model) returns display metadata for the Gradio JSON panel.
  • backend_capabilities maps each model to supported local backend capabilities.

models/placeholder_service.py

Deterministic placeholder model service used before real inference is wired.

  • PlaceholderModelService.chat() returns a deterministic text response.
  • PlaceholderModelService.vision_chat() returns a deterministic image/prompt response.

This file should be replaced or complemented by real services such as:

  • ollama_service.py
  • llama_cpp_service.py
  • openai_compatible_service.py
  • sglang_runner.py
  • minicpm_vision.py
  • transformers_text.py
  • sglang_service.py

models/base.py

Defines service contracts and backend status records.

  • BackendStatus describes whether a backend is available.
  • TextModelService is the text chat protocol.
  • VisionModelService is the vision chat protocol.

models/ollama_service.py

Ollama-backed local inference client.

  • Checks whether ollama is installed and reachable.
  • Sends text and vision chat requests to http://127.0.0.1:11434/api/chat.
  • Lists locally available Ollama models through /api/tags.
  • Builds explicit ollama pull <model> commands for the Status tab.
  • Does not pull or download models automatically.

models/llama_cpp_service.py

llama.cpp HTTP client for local GGUF inference.

  • Checks whether llama-server is installed and reachable.
  • Builds explicit llama-server -m <model.gguf> commands.
  • Supports --mmproj <mmproj.gguf> command metadata for multimodal models.
  • Sends text chat requests to /v1/chat/completions.
  • Does not download GGUF files or start background servers automatically.

models/local_backend_config.py

User-local backend settings stored under ignored data/local_backends.yaml.

  • LocalBackendConfig stores llama.cpp server URL, OpenAI-compatible base URL, optional served model name, GGUF path, mmproj path, context length, and GPU layers.
  • save_local_backend_config() writes local-only settings without touching tracked model config.
  • build_llama_server_command() returns the explicit command the user can run.
  • local_backend_summary() reports file status and confirms no startup downloads or automatic model loads.

models/openai_compatible_service.py

Local OpenAI-compatible chat client for LM Studio, vLLM-style servers, or similar local endpoints.

  • Checks /v1/models for reachability.
  • Sends text chat requests to /v1/chat/completions.
  • Supports an optional served-model-name override for tools such as LM Studio.
  • Returns visible unavailable/request-failed messages instead of crashing the Gradio callback.
  • Does not call cloud APIs or download model weights.

models/llama_cpp_python_service.py

Optional direct Python binding backend for GGUF inference.

  • Checks whether llama_cpp is importable.
  • Requires an explicit local GGUF path.
  • Does not download model files.
  • Provides text chat through Llama.create_chat_completion().
  • Vision support remains routed through llama-server until mmproj/image serialization is wired.

models/minicpm_vision.py

Optional MiniCPM vision backend.

  • Checks whether the transformers package is available.
  • Lazy-loads AutoProcessor and AutoModelForImageTextToText only when selected.
  • Formats image/text messages for image-text-to-text generation.
  • Maps thinking mode into the prompt template.
  • Provides a video support plan for future local frame sampling.

models/sglang_runner.py

SGLang local server planner and OpenAI-compatible chat client.

  • Builds an explicit python -m sglang.launch_server command.
  • Includes MiniCPM tool parser configuration.
  • Checks /health, sends chat requests to /v1/chat/completions, and can request /shutdown.
  • Does not install SGLang, start a process, download model weights, or load a model on app startup.

models/vllm_runner.py

vLLM local server planner and OpenAI-compatible chat client.

  • Builds explicit vllm serve <model> command plans.
  • Checks /health, parses Prometheus-style /metrics, and sends chat requests to /v1/chat/completions.
  • Logs parsed benchmark metrics through TrackingClient.
  • Does not install vLLM, start a process, download model weights, or load a model on app startup.

models/transformers_text.py

Optional Transformers text backend.

  • Checks whether the transformers package is installed.
  • Lazy-loads AutoTokenizer and AutoModelForCausalLM only when the backend is selected.
  • Reads trust_remote_code, device map, dtype, max token, and temperature settings from explicit config.
  • Provides a simple token-list streaming helper for future Gradio streaming wiring.
  • Does not download model weights on startup.

models/service_factory.py

Creates the selected backend service for the UI.

  • TEXT_SERVICE_REGISTRY registers available text backend factories.
  • VISION_SERVICE_REGISTRY registers available vision backend factories.
  • create_text_service() chooses placeholder, llama.cpp, llama-cpp-python, Ollama, OpenAI-compatible, SGLang, or Transformers text service.
  • create_vision_service() chooses placeholder, llama.cpp, llama-cpp-python, Ollama, or Transformers MiniCPM vision service.
  • backend_statuses() reports current backend availability.
  • llama.cpp, llama-cpp-python, and OpenAI-compatible services read ignored local backend settings when selected.

ui/chat_tab.py

Builds the text chat tab.

  • Shows text models from the catalog.
  • Displays selected model metadata.
  • Calls the selected backend service.
  • Emits inference request and response events.

ui/vision_tab.py

Builds the vision tab.

  • Shows vision models from the catalog.
  • Accepts an image and prompt.
  • Calls the selected backend service.
  • Emits inference request and response events.

ui/dataset_tab.py

Local dataset preview surface.

  • Previews local CSV, JSONL, and NDJSON files.
  • Previews Hugging Face datasets when the optional external datasets package is installed.
  • Shows source, row count, columns, and sample rows.
  • Calculates basic local dataset statistics.
  • Emits dataset loaded events.

Future behavior:

  • Serve dataset tools through the selected MCP path.

ui/train_tab.py

Training planning and local evaluation surface.

  • Builds a LoRA dry-run training plan without launching training.
  • Builds a non-executing LoRA trainer request with dependency status.
  • Shows SWIFT/LLaMA-Factory vision fine-tuning plan.
  • Shows checkpoint output path, validation status, and hardware notes.
  • Runs local base-vs-tuned evaluation from newline-separated response text.
  • Shows exact-match summary and a qualitative eval table.
  • Logs tuned evaluation reports to data/eval_results.jsonl.

Future behavior:

  • Start LoRA training.
  • Show loss and metrics.
  • Write Trackio traces.

ui/vllm_tab.py

vLLM local serving planner.

  • Builds explicit vllm serve command plans.
  • Checks local vLLM /health.
  • Fetches and parses /metrics.
  • Logs vLLM benchmark metrics through local JSONL/Trackio fallback tracking.
  • Does not install vLLM, start a process, download models, or load weights on startup.

ui/export_tab.py

GGUF export planning surface.

  • Selects a configured model and quantization.
  • Shows official GGUF download command plans when the model has GGUF metadata.
  • Shows local HF-to-GGUF conversion and llama.cpp quantization command plans.
  • Lists files already present under the selected export directory.
  • Exposes existing exported files through a Gradio download output.
  • Does not execute downloads, conversion, or quantization.

Future behavior:

  • Execute downloads and conversions after explicit user action.

ui/notes_tab.py

Field notes implementation.

  • Saves prompt, model response, correction, and tags to data/field_notes.csv.
  • Captures optional image path, video path, and a use-for-training flag.
  • Exports corrected notes to JSONL.
  • Exports local Hugging Face Dataset-style files under data/hf_field_notes.
  • Imports uncertain OCR predictions for human correction.
  • Exports corrected OCR rows to JSONL.
  • Emits field note saved events.

Future behavior:

  • Push corrected notes to a remote Hugging Face Dataset after login.
  • Feed notes into fine-tuning.

ui/traces_tab.py

Local trace and tracking preview.

  • Shows manual trace event previews.
  • Shows recent local app events.
  • Shows JSONL trace rows and tracking status.
  • Exports local traces to exports/traces.jsonl.
  • Calls Trackio only when the optional package is installed and enabled.

ui/agent_tab.py

Local non-autonomous agent mode.

  • Drafts a research-plan-implement-verify trace.
  • Saves agent traces to data/agent_traces.jsonl.
  • Exports trace JSONL and local HF Dataset-style trace files.
  • Does not execute shell commands, commit, push, deploy, download models, or call external services.

ui/status_tab.py

Shows configured models and backend metadata.

  • Helps verify model-size compliance and backend status.
  • Provides local llama.cpp settings, GGUF/mmproj file pickers, and command generation.
  • Provides LM Studio/OpenAI-compatible base URL, optional model-name storage, and reachability check.
  • Provides SGLang command planning, health check, and shutdown request controls.

datasets/field_notes.py

Field note data model and CSV store.

  • FieldNote captures prompt, response, correction, tags, and timestamp.
  • FieldNote also captures optional image/video paths and a training inclusion flag.
  • FieldNoteStore.save() persists notes to data/field_notes.csv.
  • FieldNoteStore.list_notes() filters by correction, tag, and training inclusion.
  • FieldNoteStore.export_jsonl() writes training-ready JSONL.
  • FieldNoteStore.export_hf_dataset() writes local HF Dataset-style files.
  • SQLiteFieldNoteStore stores and lists notes in SQLite for larger correction loops.

datasets/loader.py

Dataset preview and statistics helpers.

  • preview_local_dataset() previews CSV, JSONL, and NDJSON files.
  • dataset_statistics() reports row count, column count, names, and non-empty counts.
  • preview_huggingface_dataset() optionally uses the external Hugging Face datasets package.

datasets/synthetic.py

Deterministic local synthetic data helpers.

  • generate_synthetic_examples() creates local prompt/response/correction examples.
  • validate_synthetic_example() checks schema requirements.
  • quality_filter_examples() removes incomplete or low-value examples.
  • augment_examples() creates deterministic variants for workflow testing.
  • export_synthetic_jsonl() writes JSONL without external services.

datasets/ocr.py

Local OCR correction helpers.

  • OCRPrediction stores source path, predicted text, confidence, and optional page.
  • load_ocr_predictions() loads local .csv, .jsonl, and .ndjson prediction files.
  • uncertain_predictions() filters rows at or below a confidence threshold or with empty text.
  • import_uncertain_predictions() creates Field Notes correction tasks for uncertain rows.
  • export_corrected_ocr_notes() writes corrected OCR examples to JSONL for evaluation or training.
  • ocr_import_summary() previews uncertain rows for the Field Notes tab.

mcp_tools/tools.py

Local MCP-style tools.

  • dataset_stats_tool() returns local dataset statistics.
  • hf_dataset_preview_tool() previews Hugging Face datasets when optional dependencies exist.
  • safe_calculator_tool() evaluates numeric arithmetic only.
  • model_inference_tool() routes text prompts through the selected model service.
  • tool_registry() returns the local tool map for a future MCP endpoint.

mcp_tools/vindex_tool.py

Non-executing VINDEX integration boundary.

  • Defines the eight VINDEX PRD methods and their local FastAPI paths.
  • build_vindex_call_plan() validates method names and builds endpoint/payload plans.
  • Caps star_spread.n_neighbors at 5 and calibrated_edit.causal_window at 3 based on the PRD safety notes.
  • vindex_dependency_report() checks whether the optional vindex package or local health endpoint is available.
  • vindex_verification_report() combines dependency status with a safe call plan and keeps execution disabled until the local VINDEX install is verified.

mcp_tools/bridge.py

Gradio-native MCP bridge metadata and local invocation helper.

  • MCP_PATH documents /gradio_api/mcp/sse.
  • mcp_manifest() returns the selected mode, path, and tool definitions.
  • invoke_mcp_tool() verifies local tool invocation by name.

agent/runner.py

Deterministic local agent trace runner.

  • AGENT_SYSTEM_PROMPT defines the agent behavior contract.
  • run_agent_loop() produces research, plan, implement, and verify trace steps.
  • run_paper_to_code_loop() produces paper-to-code research, plan, implement, and verify trace steps.
  • default_safety_gates() lists the non-autonomous safety requirements.
  • save_agent_trace() appends traces to JSONL.
  • export_agent_traces() exports trace JSONL.
  • export_agent_traces_hf_dataset() writes local HF Dataset-style trace files.
  • The runner can call safe local tools, but it is not autonomous.

core/file_exports.py

Shared export helper.

  • copy_text_file_or_empty() copies a text artifact to an export path or creates an empty one.

training/export.py

Non-executing GGUF export planning.

  • detect_llama_cpp_tools() checks llama-server, llama-cli, and llama-quantize.
  • build_export_plan() creates explicit download, conversion, and quantization command plans.
  • list_exported_files() lists generated/local export files.
  • ExportPlan.as_dict() marks that commands are not executed and no startup downloads occur.

training/evaluation.py

Local deterministic evaluation helpers.

  • default_prompt_cases() returns a small built-in prompt test set.
  • load_prompt_cases() loads prompt/expected pairs from JSONL.
  • evaluate_responses() computes exact-match rows and a qualitative table.
  • perplexity_from_losses() computes perplexity from explicit negative log likelihood values.
  • compare_base_vs_tuned() reports exact-match delta.
  • log_eval_report() appends JSONL evaluation results.

training/lora_trainer.py

Non-executing LoRA trainer request builder.

  • lora_dependency_report() reports PEFT, TRL, Transformers, and Torch availability.
  • build_lora_training_request() combines the training plan with dependency status and a command preview.
  • vision_finetuning_plan() documents SWIFT/LLaMA-Factory as the future MiniCPM-V fine-tuning path.
  • Keeps execute_training false until dependencies, hardware, and dataset schema are approved.

training/reward_eval.py

Deterministic local reward-style evaluation helpers.

  • RewardEvaluator.evaluate() scores supplied responses with transparent lexical heuristics.
  • best_of_n() selects the highest-scoring candidate without model calls.
  • create_dpo_pairs() creates chosen/rejected pairs for DPO-style datasets.
  • eval_lora_vs_base() compares base and LoRA response rewards.

training/planner.py

Non-executing LoRA training planner.

  • load_training_config() reads LoRA and training settings from config/training.yaml.
  • build_training_plan() creates a dry-run plan with checkpoint output path.
  • validate_training_plan() checks dataset existence and numeric training settings.
  • training_hardware_notes() documents practical local hardware expectations.

tracking/trackio_client.py

Tracking client with JSONL fallback.

  • load_tracking_config() reads Trackio settings from config/training.yaml.
  • TrackingClient.init() starts Trackio only when enabled and installed.
  • TrackingClient.log() always writes local JSONL and optionally forwards to Trackio.
  • TrackingClient.finish() closes optional Trackio state.
  • export_traces() copies local traces to exports/traces.jsonl.
  • read_trace_rows() returns recent local trace rows for the UI.

core/events.py

Small event bus reserved for future cross-module events.

  • EventType names app events.
  • UI_ERROR records visible tab-level failures.
  • Event carries event data.
  • EventBus registers handlers and emits events.

core/app_state.py

Shared local app state.

  • AppState.emit() records events, logs them, and dispatches them through EventBus.
  • AppState.emit() also writes trace events through TrackingClient.
  • AppState.recent_events() returns local trace previews for the Traces tab.
  • emit_inference_response() records shared response metadata.

core/tab_feedback.py

Formats tab status text and emits ui_error events for visible tab-level failures.

ui/progress.py

Defines the shared Gradio progress mode used by tab button callbacks.

core/app_logging.py

Lightweight logging setup.

  • configure_app_logging() configures compact process logging once.

core/registry.py

Generic registry helper.

  • Registry.register(name, item) stores a service.
  • Registry.get(name) retrieves a service.
  • Registry.list() lists registered services.

Current Design Rule

The app must not download model weights on startup. Model loading should happen only after the user chooses a backend/model and clicks an explicit action.