workbench / docs /ARCHITECTURE.md
GitHub Actions
Initial ZeroGPU deployment with spaces shim
7f9dfed
|
Raw
History Blame Contribute Delete
21.7 kB
# Architecture
The project is intentionally small at first. The PRD describes a large workbench; this repo starts
with the smallest version that can grow into it.
## High-Level Flow
```text
app.py
loads config/models.yaml
configures lightweight logging
builds Gradio tabs
passes model catalog to UI modules
ui/*
defines each Gradio tab
calls service classes
emits local app events for inference, datasets, and field notes
uses shared progress settings for callback loading indicators
agent/*
holds deterministic local agent planning and trace export helpers
models/*
holds model catalog, local backend config, and inference services
datasets/*
stores dataset, synthetic data, and correction-loop helpers
mcp_tools/*
holds local tool functions, VINDEX call planning, and Gradio-native MCP bridge metadata
config/*
holds model and training settings
training/*
holds non-executing training, LoRA request, evaluation, and export planning helpers
tracking/*
holds local JSONL tracing and optional Trackio integration
deployment/*
holds Hugging Face Space deployment planning and validation helpers
plant/*
holds the first reference domain app built from the template
can run standalone with python -m plant.app --no-model
keeps heavy model dependencies optional
core/*
shared app state, event, logging, and registry helpers
```
## Files And Classes
### `app.py`
Builds and launches the Gradio app.
- `build_app()` creates the Gradio `Blocks` app.
- Loads the model catalog from `config/models.yaml`.
- Registers the current UI tabs.
- `APP_CSS` defines compact responsive layout rules for app width, mobile padding, scrollable tabs,
and button touch targets.
### `plant/app.py`
Standalone Plant Discovery reference app built from the template.
- `build_app(no_model=True)` creates a Gradio app without loading model weights.
- Loads `plant/models.yaml`.
- Builds a local species index.
- Reuses `datasets.field_notes.FieldNoteStore` for corrections.
- Uses `DemoPlantVisionService` for screenshots/tests or `PlantVisionService` for OpenBMB
MiniCPM-V zero-shot and fine-tuned adapter inference.
### `plant/plant_service.py`
Domain service and schema for Plant Discovery.
- `PlantID` is the structured output schema.
- `DemoPlantVisionService` provides deterministic no-model results.
- `PlantVisionService` lazy-loads optional MiniCPM-V dependencies only during identification.
- `PlantVisionService.from_config(..., "plant_vlm_finetuned")` can load a PEFT adapter after a real
adapter repo is configured.
- `extract_json_object()` and `parse_plant_response()` make model JSON output testable.
### `plant/training.py`
Non-executing training planner for Plant Discovery.
- `build_plant_training_plan()` returns SWIFT and LLaMA-Factory command previews.
- `plant_training_dependency_report()` reports optional training dependency availability.
- `write_llamafactory_dataset_info()` writes a dataset-info preview for LLaMA-Factory workflows.
- Training is never started by the Gradio UI or script.
### `plant/plant_loader.py`
Domain data and export helpers for Plant Discovery.
- `PlantRecord` normalizes plant examples into training rows.
- `LocalFolderLoader` maps species folders to image metadata.
- `SpeciesIndexBuilder` builds a no-network species index with demo fallback.
- `FieldNotesPlantExporter` exports corrected field notes to plant training JSONL.
### `plant/plant_tab.py`
Focused Gradio UI for Plant Discovery.
- Identify tab uploads images and renders a safe escaped result card.
- Field Guide tab searches the species index.
- Corrections tab saves and exports training-ready corrections.
- Stats tab summarizes species and correction counts.
- Training is represented as a non-executing plan, not a subprocess.
### `plant/plant_tools.py`
Optional local/MCP tools for Plant Discovery.
- Pure functions can be tested without an MCP server.
- `build_mcp_server()` imports `mcp` only when explicitly requested.
- Tools expose identify, species search, correction save/export, stats, and training plan.
### `models/model_catalog.py`
Reads model configuration and turns it into typed Python objects.
- `ModelInfo` describes one configured model.
- `load_model_catalog(path)` reads YAML and returns all configured models.
- `model_choices(catalog, model_type)` filters models for a UI dropdown.
- `model_summary(model)` returns display metadata for the Gradio JSON panel.
- `backend_capabilities` maps each model to supported local backend capabilities.
### `models/placeholder_service.py`
Deterministic placeholder model service used before real inference is wired.
- `PlaceholderModelService.chat()` returns a deterministic text response.
- `PlaceholderModelService.vision_chat()` returns a deterministic image/prompt response.
This file should be replaced or complemented by real services such as:
- `ollama_service.py`
- `llama_cpp_service.py`
- `openai_compatible_service.py`
- `sglang_runner.py`
- `minicpm_vision.py`
- `transformers_text.py`
- `sglang_service.py`
### `models/base.py`
Defines service contracts and backend status records.
- `BackendStatus` describes whether a backend is available.
- `TextModelService` is the text chat protocol.
- `VisionModelService` is the vision chat protocol.
### `models/ollama_service.py`
Ollama-backed local inference client.
- Checks whether `ollama` is installed and reachable.
- Sends text and vision chat requests to `http://127.0.0.1:11434/api/chat`.
- Lists locally available Ollama models through `/api/tags`.
- Builds explicit `ollama pull <model>` commands for the Status tab.
- Does not pull or download models automatically.
### `models/llama_cpp_service.py`
llama.cpp HTTP client for local GGUF inference.
- Checks whether `llama-server` is installed and reachable.
- Builds explicit `llama-server -m <model.gguf>` commands.
- Supports `--mmproj <mmproj.gguf>` command metadata for multimodal models.
- Sends text chat requests to `/v1/chat/completions`.
- Does not download GGUF files or start background servers automatically.
### `models/local_backend_config.py`
User-local backend settings stored under ignored `data/local_backends.yaml`.
- `LocalBackendConfig` stores llama.cpp server URL, OpenAI-compatible base URL, optional served
model name, GGUF path, mmproj path, context length, and GPU layers.
- `save_local_backend_config()` writes local-only settings without touching tracked model config.
- `build_llama_server_command()` returns the explicit command the user can run.
- `local_backend_summary()` reports file status and confirms no startup downloads or automatic model loads.
### `models/openai_compatible_service.py`
Local OpenAI-compatible chat client for LM Studio, vLLM-style servers, or similar local endpoints.
- Checks `/v1/models` for reachability.
- Sends text chat requests to `/v1/chat/completions`.
- Supports an optional served-model-name override for tools such as LM Studio.
- Returns visible unavailable/request-failed messages instead of crashing the Gradio callback.
- Does not call cloud APIs or download model weights.
### `models/llama_cpp_python_service.py`
Optional direct Python binding backend for GGUF inference.
- Checks whether `llama_cpp` is importable.
- Requires an explicit local GGUF path.
- Does not download model files.
- Provides text chat through `Llama.create_chat_completion()`.
- Vision support remains routed through llama-server until mmproj/image serialization is wired.
### `models/minicpm_vision.py`
Optional MiniCPM vision backend.
- Checks whether the `transformers` package is available.
- Lazy-loads `AutoProcessor` and `AutoModelForImageTextToText` only when selected.
- Formats image/text messages for image-text-to-text generation.
- Maps thinking mode into the prompt template.
- Provides a video support plan for future local frame sampling.
### `models/sglang_runner.py`
SGLang local server planner and OpenAI-compatible chat client.
- Builds an explicit `python -m sglang.launch_server` command.
- Includes MiniCPM tool parser configuration.
- Checks `/health`, sends chat requests to `/v1/chat/completions`, and can request `/shutdown`.
- Does not install SGLang, start a process, download model weights, or load a model on app startup.
### `models/vllm_runner.py`
vLLM local server planner and OpenAI-compatible chat client.
- Builds explicit `vllm serve <model>` command plans.
- Checks `/health`, parses Prometheus-style `/metrics`, and sends chat requests to
`/v1/chat/completions`.
- Logs parsed benchmark metrics through `TrackingClient`.
- Does not install vLLM, start a process, download model weights, or load a model on app startup.
### `models/transformers_text.py`
Optional Transformers text backend.
- Checks whether the `transformers` package is installed.
- Lazy-loads `AutoTokenizer` and `AutoModelForCausalLM` only when the backend is selected.
- Reads `trust_remote_code`, device map, dtype, max token, and temperature settings from explicit config.
- Provides a simple token-list streaming helper for future Gradio streaming wiring.
- Does not download model weights on startup.
### `models/service_factory.py`
Creates the selected backend service for the UI.
- `TEXT_SERVICE_REGISTRY` registers available text backend factories.
- `VISION_SERVICE_REGISTRY` registers available vision backend factories.
- `create_text_service()` chooses placeholder, llama.cpp, llama-cpp-python, Ollama,
OpenAI-compatible, SGLang, or Transformers text service.
- `create_vision_service()` chooses placeholder, llama.cpp, llama-cpp-python, Ollama, or
Transformers MiniCPM vision service.
- `backend_statuses()` reports current backend availability.
- llama.cpp, llama-cpp-python, and OpenAI-compatible services read ignored local backend settings
when selected.
### `ui/chat_tab.py`
Builds the text chat tab.
- Shows text models from the catalog.
- Displays selected model metadata.
- Calls the selected backend service.
- Emits inference request and response events.
### `ui/vision_tab.py`
Builds the vision tab.
- Shows vision models from the catalog.
- Accepts an image and prompt.
- Calls the selected backend service.
- Emits inference request and response events.
### `ui/dataset_tab.py`
Local dataset preview surface.
- Previews local CSV, JSONL, and NDJSON files.
- Previews Hugging Face datasets when the optional external `datasets` package is installed.
- Shows source, row count, columns, and sample rows.
- Calculates basic local dataset statistics.
- Emits dataset loaded events.
Future behavior:
- Serve dataset tools through the selected MCP path.
### `ui/train_tab.py`
Training planning and local evaluation surface.
- Builds a LoRA dry-run training plan without launching training.
- Builds a non-executing LoRA trainer request with dependency status.
- Shows SWIFT/LLaMA-Factory vision fine-tuning plan.
- Shows checkpoint output path, validation status, and hardware notes.
- Runs local base-vs-tuned evaluation from newline-separated response text.
- Shows exact-match summary and a qualitative eval table.
- Logs tuned evaluation reports to `data/eval_results.jsonl`.
Future behavior:
- Start LoRA training.
- Show loss and metrics.
- Write Trackio traces.
### `ui/vllm_tab.py`
vLLM local serving planner.
- Builds explicit `vllm serve` command plans.
- Checks local vLLM `/health`.
- Fetches and parses `/metrics`.
- Logs vLLM benchmark metrics through local JSONL/Trackio fallback tracking.
- Does not install vLLM, start a process, download models, or load weights on startup.
### `ui/export_tab.py`
GGUF export planning surface.
- Selects a configured model and quantization.
- Shows official GGUF download command plans when the model has GGUF metadata.
- Shows local HF-to-GGUF conversion and llama.cpp quantization command plans.
- Lists files already present under the selected export directory.
- Exposes existing exported files through a Gradio download output.
- Does not execute downloads, conversion, or quantization.
Future behavior:
- Execute downloads and conversions after explicit user action.
### `ui/notes_tab.py`
Field notes implementation.
- Saves prompt, model response, correction, and tags to `data/field_notes.csv`.
- Captures optional image path, video path, and a use-for-training flag.
- Exports corrected notes to JSONL.
- Exports local Hugging Face Dataset-style files under `data/hf_field_notes`.
- Imports uncertain OCR predictions for human correction.
- Exports corrected OCR rows to JSONL.
- Emits field note saved events.
Future behavior:
- Push corrected notes to a remote Hugging Face Dataset after login.
- Feed notes into fine-tuning.
### `ui/traces_tab.py`
Local trace and tracking preview.
- Shows manual trace event previews.
- Shows recent local app events.
- Shows JSONL trace rows and tracking status.
- Exports local traces to `exports/traces.jsonl`.
- Calls Trackio only when the optional package is installed and enabled.
### `ui/agent_tab.py`
Local non-autonomous agent mode.
- Drafts a research-plan-implement-verify trace.
- Saves agent traces to `data/agent_traces.jsonl`.
- Exports trace JSONL and local HF Dataset-style trace files.
- Does not execute shell commands, commit, push, deploy, download models, or call external services.
### `ui/status_tab.py`
Shows configured models and backend metadata.
- Helps verify model-size compliance and backend status.
- Provides local llama.cpp settings, GGUF/mmproj file pickers, and command generation.
- Provides LM Studio/OpenAI-compatible base URL, optional model-name storage, and reachability check.
- Provides SGLang command planning, health check, and shutdown request controls.
### `datasets/field_notes.py`
Field note data model and CSV store.
- `FieldNote` captures prompt, response, correction, tags, and timestamp.
- `FieldNote` also captures optional image/video paths and a training inclusion flag.
- `FieldNoteStore.save()` persists notes to `data/field_notes.csv`.
- `FieldNoteStore.list_notes()` filters by correction, tag, and training inclusion.
- `FieldNoteStore.export_jsonl()` writes training-ready JSONL.
- `FieldNoteStore.export_hf_dataset()` writes local HF Dataset-style files.
- `SQLiteFieldNoteStore` stores and lists notes in SQLite for larger correction loops.
### `datasets/loader.py`
Dataset preview and statistics helpers.
- `preview_local_dataset()` previews CSV, JSONL, and NDJSON files.
- `dataset_statistics()` reports row count, column count, names, and non-empty counts.
- `preview_huggingface_dataset()` optionally uses the external Hugging Face `datasets` package.
### `datasets/synthetic.py`
Deterministic local synthetic data helpers.
- `generate_synthetic_examples()` creates local prompt/response/correction examples.
- `validate_synthetic_example()` checks schema requirements.
- `quality_filter_examples()` removes incomplete or low-value examples.
- `augment_examples()` creates deterministic variants for workflow testing.
- `export_synthetic_jsonl()` writes JSONL without external services.
### `datasets/ocr.py`
Local OCR correction helpers.
- `OCRPrediction` stores source path, predicted text, confidence, and optional page.
- `load_ocr_predictions()` loads local `.csv`, `.jsonl`, and `.ndjson` prediction files.
- `uncertain_predictions()` filters rows at or below a confidence threshold or with empty text.
- `import_uncertain_predictions()` creates Field Notes correction tasks for uncertain rows.
- `export_corrected_ocr_notes()` writes corrected OCR examples to JSONL for evaluation or training.
- `ocr_import_summary()` previews uncertain rows for the Field Notes tab.
### `mcp_tools/tools.py`
Local MCP-style tools.
- `dataset_stats_tool()` returns local dataset statistics.
- `hf_dataset_preview_tool()` previews Hugging Face datasets when optional dependencies exist.
- `safe_calculator_tool()` evaluates numeric arithmetic only.
- `model_inference_tool()` routes text prompts through the selected model service.
- `tool_registry()` returns the local tool map for a future MCP endpoint.
### `mcp_tools/vindex_tool.py`
Non-executing VINDEX integration boundary.
- Defines the eight VINDEX PRD methods and their local FastAPI paths.
- `build_vindex_call_plan()` validates method names and builds endpoint/payload plans.
- Caps `star_spread.n_neighbors` at 5 and `calibrated_edit.causal_window` at 3 based on the PRD
safety notes.
- `vindex_dependency_report()` checks whether the optional `vindex` package or local health
endpoint is available.
- `vindex_verification_report()` combines dependency status with a safe call plan and keeps
execution disabled until the local VINDEX install is verified.
### `mcp_tools/bridge.py`
Gradio-native MCP bridge metadata and local invocation helper.
- `MCP_PATH` documents `/gradio_api/mcp/sse`.
- `mcp_manifest()` returns the selected mode, path, and tool definitions.
- `invoke_mcp_tool()` verifies local tool invocation by name.
### `agent/runner.py`
Deterministic local agent trace runner.
- `AGENT_SYSTEM_PROMPT` defines the agent behavior contract.
- `run_agent_loop()` produces research, plan, implement, and verify trace steps.
- `run_paper_to_code_loop()` produces paper-to-code research, plan, implement, and verify trace steps.
- `default_safety_gates()` lists the non-autonomous safety requirements.
- `save_agent_trace()` appends traces to JSONL.
- `export_agent_traces()` exports trace JSONL.
- `export_agent_traces_hf_dataset()` writes local HF Dataset-style trace files.
- The runner can call safe local tools, but it is not autonomous.
### `core/file_exports.py`
Shared export helper.
- `copy_text_file_or_empty()` copies a text artifact to an export path or creates an empty one.
### `training/export.py`
Non-executing GGUF export planning.
- `detect_llama_cpp_tools()` checks `llama-server`, `llama-cli`, and `llama-quantize`.
- `build_export_plan()` creates explicit download, conversion, and quantization command plans.
- `list_exported_files()` lists generated/local export files.
- `ExportPlan.as_dict()` marks that commands are not executed and no startup downloads occur.
### `training/evaluation.py`
Local deterministic evaluation helpers.
- `default_prompt_cases()` returns a small built-in prompt test set.
- `load_prompt_cases()` loads prompt/expected pairs from JSONL.
- `evaluate_responses()` computes exact-match rows and a qualitative table.
- `perplexity_from_losses()` computes perplexity from explicit negative log likelihood values.
- `compare_base_vs_tuned()` reports exact-match delta.
- `log_eval_report()` appends JSONL evaluation results.
### `training/lora_trainer.py`
Non-executing LoRA trainer request builder.
- `lora_dependency_report()` reports PEFT, TRL, Transformers, and Torch availability.
- `build_lora_training_request()` combines the training plan with dependency status and a command
preview.
- `vision_finetuning_plan()` documents SWIFT/LLaMA-Factory as the future MiniCPM-V fine-tuning path.
- Keeps `execute_training` false until dependencies, hardware, and dataset schema are approved.
### `training/reward_eval.py`
Deterministic local reward-style evaluation helpers.
- `RewardEvaluator.evaluate()` scores supplied responses with transparent lexical heuristics.
- `best_of_n()` selects the highest-scoring candidate without model calls.
- `create_dpo_pairs()` creates chosen/rejected pairs for DPO-style datasets.
- `eval_lora_vs_base()` compares base and LoRA response rewards.
### `training/planner.py`
Non-executing LoRA training planner.
- `load_training_config()` reads LoRA and training settings from `config/training.yaml`.
- `build_training_plan()` creates a dry-run plan with checkpoint output path.
- `validate_training_plan()` checks dataset existence and numeric training settings.
- `training_hardware_notes()` documents practical local hardware expectations.
### `tracking/trackio_client.py`
Tracking client with JSONL fallback.
- `load_tracking_config()` reads Trackio settings from `config/training.yaml`.
- `TrackingClient.init()` starts Trackio only when enabled and installed.
- `TrackingClient.log()` always writes local JSONL and optionally forwards to Trackio.
- `TrackingClient.finish()` closes optional Trackio state.
- `export_traces()` copies local traces to `exports/traces.jsonl`.
- `read_trace_rows()` returns recent local trace rows for the UI.
### `core/events.py`
Small event bus reserved for future cross-module events.
- `EventType` names app events.
- `UI_ERROR` records visible tab-level failures.
- `Event` carries event data.
- `EventBus` registers handlers and emits events.
### `core/app_state.py`
Shared local app state.
- `AppState.emit()` records events, logs them, and dispatches them through `EventBus`.
- `AppState.emit()` also writes trace events through `TrackingClient`.
- `AppState.recent_events()` returns local trace previews for the Traces tab.
- `emit_inference_response()` records shared response metadata.
### `core/tab_feedback.py`
Formats tab status text and emits `ui_error` events for visible tab-level failures.
### `ui/progress.py`
Defines the shared Gradio progress mode used by tab button callbacks.
### `core/app_logging.py`
Lightweight logging setup.
- `configure_app_logging()` configures compact process logging once.
### `core/registry.py`
Generic registry helper.
- `Registry.register(name, item)` stores a service.
- `Registry.get(name)` retrieves a service.
- `Registry.list()` lists registered services.
## Current Design Rule
The app must not download model weights on startup. Model loading should happen only after the
user chooses a backend/model and clicks an explicit action.