# Architecture The project is intentionally small at first. The PRD describes a large workbench; this repo starts with the smallest version that can grow into it. ## High-Level Flow ```text app.py loads config/models.yaml configures lightweight logging builds Gradio tabs passes model catalog to UI modules ui/* defines each Gradio tab calls service classes emits local app events for inference, datasets, and field notes uses shared progress settings for callback loading indicators agent/* holds deterministic local agent planning and trace export helpers models/* holds model catalog, local backend config, and inference services datasets/* stores dataset, synthetic data, and correction-loop helpers mcp_tools/* holds local tool functions, VINDEX call planning, and Gradio-native MCP bridge metadata config/* holds model and training settings training/* holds non-executing training, LoRA request, evaluation, and export planning helpers tracking/* holds local JSONL tracing and optional Trackio integration deployment/* holds Hugging Face Space deployment planning and validation helpers plant/* holds the first reference domain app built from the template can run standalone with python -m plant.app --no-model keeps heavy model dependencies optional core/* shared app state, event, logging, and registry helpers ``` ## Files And Classes ### `app.py` Builds and launches the Gradio app. - `build_app()` creates the Gradio `Blocks` app. - Loads the model catalog from `config/models.yaml`. - Registers the current UI tabs. - `APP_CSS` defines compact responsive layout rules for app width, mobile padding, scrollable tabs, and button touch targets. ### `plant/app.py` Standalone Plant Discovery reference app built from the template. - `build_app(no_model=True)` creates a Gradio app without loading model weights. - Loads `plant/models.yaml`. - Builds a local species index. - Reuses `datasets.field_notes.FieldNoteStore` for corrections. - Uses `DemoPlantVisionService` for screenshots/tests or `PlantVisionService` for OpenBMB MiniCPM-V zero-shot and fine-tuned adapter inference. ### `plant/plant_service.py` Domain service and schema for Plant Discovery. - `PlantID` is the structured output schema. - `DemoPlantVisionService` provides deterministic no-model results. - `PlantVisionService` lazy-loads optional MiniCPM-V dependencies only during identification. - `PlantVisionService.from_config(..., "plant_vlm_finetuned")` can load a PEFT adapter after a real adapter repo is configured. - `extract_json_object()` and `parse_plant_response()` make model JSON output testable. ### `plant/training.py` Non-executing training planner for Plant Discovery. - `build_plant_training_plan()` returns SWIFT and LLaMA-Factory command previews. - `plant_training_dependency_report()` reports optional training dependency availability. - `write_llamafactory_dataset_info()` writes a dataset-info preview for LLaMA-Factory workflows. - Training is never started by the Gradio UI or script. ### `plant/plant_loader.py` Domain data and export helpers for Plant Discovery. - `PlantRecord` normalizes plant examples into training rows. - `LocalFolderLoader` maps species folders to image metadata. - `SpeciesIndexBuilder` builds a no-network species index with demo fallback. - `FieldNotesPlantExporter` exports corrected field notes to plant training JSONL. ### `plant/plant_tab.py` Focused Gradio UI for Plant Discovery. - Identify tab uploads images and renders a safe escaped result card. - Field Guide tab searches the species index. - Corrections tab saves and exports training-ready corrections. - Stats tab summarizes species and correction counts. - Training is represented as a non-executing plan, not a subprocess. ### `plant/plant_tools.py` Optional local/MCP tools for Plant Discovery. - Pure functions can be tested without an MCP server. - `build_mcp_server()` imports `mcp` only when explicitly requested. - Tools expose identify, species search, correction save/export, stats, and training plan. ### `models/model_catalog.py` Reads model configuration and turns it into typed Python objects. - `ModelInfo` describes one configured model. - `load_model_catalog(path)` reads YAML and returns all configured models. - `model_choices(catalog, model_type)` filters models for a UI dropdown. - `model_summary(model)` returns display metadata for the Gradio JSON panel. - `backend_capabilities` maps each model to supported local backend capabilities. ### `models/placeholder_service.py` Deterministic placeholder model service used before real inference is wired. - `PlaceholderModelService.chat()` returns a deterministic text response. - `PlaceholderModelService.vision_chat()` returns a deterministic image/prompt response. This file should be replaced or complemented by real services such as: - `ollama_service.py` - `llama_cpp_service.py` - `openai_compatible_service.py` - `sglang_runner.py` - `minicpm_vision.py` - `transformers_text.py` - `sglang_service.py` ### `models/base.py` Defines service contracts and backend status records. - `BackendStatus` describes whether a backend is available. - `TextModelService` is the text chat protocol. - `VisionModelService` is the vision chat protocol. ### `models/ollama_service.py` Ollama-backed local inference client. - Checks whether `ollama` is installed and reachable. - Sends text and vision chat requests to `http://127.0.0.1:11434/api/chat`. - Lists locally available Ollama models through `/api/tags`. - Builds explicit `ollama pull ` commands for the Status tab. - Does not pull or download models automatically. ### `models/llama_cpp_service.py` llama.cpp HTTP client for local GGUF inference. - Checks whether `llama-server` is installed and reachable. - Builds explicit `llama-server -m ` commands. - Supports `--mmproj ` command metadata for multimodal models. - Sends text chat requests to `/v1/chat/completions`. - Does not download GGUF files or start background servers automatically. ### `models/local_backend_config.py` User-local backend settings stored under ignored `data/local_backends.yaml`. - `LocalBackendConfig` stores llama.cpp server URL, OpenAI-compatible base URL, optional served model name, GGUF path, mmproj path, context length, and GPU layers. - `save_local_backend_config()` writes local-only settings without touching tracked model config. - `build_llama_server_command()` returns the explicit command the user can run. - `local_backend_summary()` reports file status and confirms no startup downloads or automatic model loads. ### `models/openai_compatible_service.py` Local OpenAI-compatible chat client for LM Studio, vLLM-style servers, or similar local endpoints. - Checks `/v1/models` for reachability. - Sends text chat requests to `/v1/chat/completions`. - Supports an optional served-model-name override for tools such as LM Studio. - Returns visible unavailable/request-failed messages instead of crashing the Gradio callback. - Does not call cloud APIs or download model weights. ### `models/llama_cpp_python_service.py` Optional direct Python binding backend for GGUF inference. - Checks whether `llama_cpp` is importable. - Requires an explicit local GGUF path. - Does not download model files. - Provides text chat through `Llama.create_chat_completion()`. - Vision support remains routed through llama-server until mmproj/image serialization is wired. ### `models/minicpm_vision.py` Optional MiniCPM vision backend. - Checks whether the `transformers` package is available. - Lazy-loads `AutoProcessor` and `AutoModelForImageTextToText` only when selected. - Formats image/text messages for image-text-to-text generation. - Maps thinking mode into the prompt template. - Provides a video support plan for future local frame sampling. ### `models/sglang_runner.py` SGLang local server planner and OpenAI-compatible chat client. - Builds an explicit `python -m sglang.launch_server` command. - Includes MiniCPM tool parser configuration. - Checks `/health`, sends chat requests to `/v1/chat/completions`, and can request `/shutdown`. - Does not install SGLang, start a process, download model weights, or load a model on app startup. ### `models/vllm_runner.py` vLLM local server planner and OpenAI-compatible chat client. - Builds explicit `vllm serve ` command plans. - Checks `/health`, parses Prometheus-style `/metrics`, and sends chat requests to `/v1/chat/completions`. - Logs parsed benchmark metrics through `TrackingClient`. - Does not install vLLM, start a process, download model weights, or load a model on app startup. ### `models/transformers_text.py` Optional Transformers text backend. - Checks whether the `transformers` package is installed. - Lazy-loads `AutoTokenizer` and `AutoModelForCausalLM` only when the backend is selected. - Reads `trust_remote_code`, device map, dtype, max token, and temperature settings from explicit config. - Provides a simple token-list streaming helper for future Gradio streaming wiring. - Does not download model weights on startup. ### `models/service_factory.py` Creates the selected backend service for the UI. - `TEXT_SERVICE_REGISTRY` registers available text backend factories. - `VISION_SERVICE_REGISTRY` registers available vision backend factories. - `create_text_service()` chooses placeholder, llama.cpp, llama-cpp-python, Ollama, OpenAI-compatible, SGLang, or Transformers text service. - `create_vision_service()` chooses placeholder, llama.cpp, llama-cpp-python, Ollama, or Transformers MiniCPM vision service. - `backend_statuses()` reports current backend availability. - llama.cpp, llama-cpp-python, and OpenAI-compatible services read ignored local backend settings when selected. ### `ui/chat_tab.py` Builds the text chat tab. - Shows text models from the catalog. - Displays selected model metadata. - Calls the selected backend service. - Emits inference request and response events. ### `ui/vision_tab.py` Builds the vision tab. - Shows vision models from the catalog. - Accepts an image and prompt. - Calls the selected backend service. - Emits inference request and response events. ### `ui/dataset_tab.py` Local dataset preview surface. - Previews local CSV, JSONL, and NDJSON files. - Previews Hugging Face datasets when the optional external `datasets` package is installed. - Shows source, row count, columns, and sample rows. - Calculates basic local dataset statistics. - Emits dataset loaded events. Future behavior: - Serve dataset tools through the selected MCP path. ### `ui/train_tab.py` Training planning and local evaluation surface. - Builds a LoRA dry-run training plan without launching training. - Builds a non-executing LoRA trainer request with dependency status. - Shows SWIFT/LLaMA-Factory vision fine-tuning plan. - Shows checkpoint output path, validation status, and hardware notes. - Runs local base-vs-tuned evaluation from newline-separated response text. - Shows exact-match summary and a qualitative eval table. - Logs tuned evaluation reports to `data/eval_results.jsonl`. Future behavior: - Start LoRA training. - Show loss and metrics. - Write Trackio traces. ### `ui/vllm_tab.py` vLLM local serving planner. - Builds explicit `vllm serve` command plans. - Checks local vLLM `/health`. - Fetches and parses `/metrics`. - Logs vLLM benchmark metrics through local JSONL/Trackio fallback tracking. - Does not install vLLM, start a process, download models, or load weights on startup. ### `ui/export_tab.py` GGUF export planning surface. - Selects a configured model and quantization. - Shows official GGUF download command plans when the model has GGUF metadata. - Shows local HF-to-GGUF conversion and llama.cpp quantization command plans. - Lists files already present under the selected export directory. - Exposes existing exported files through a Gradio download output. - Does not execute downloads, conversion, or quantization. Future behavior: - Execute downloads and conversions after explicit user action. ### `ui/notes_tab.py` Field notes implementation. - Saves prompt, model response, correction, and tags to `data/field_notes.csv`. - Captures optional image path, video path, and a use-for-training flag. - Exports corrected notes to JSONL. - Exports local Hugging Face Dataset-style files under `data/hf_field_notes`. - Imports uncertain OCR predictions for human correction. - Exports corrected OCR rows to JSONL. - Emits field note saved events. Future behavior: - Push corrected notes to a remote Hugging Face Dataset after login. - Feed notes into fine-tuning. ### `ui/traces_tab.py` Local trace and tracking preview. - Shows manual trace event previews. - Shows recent local app events. - Shows JSONL trace rows and tracking status. - Exports local traces to `exports/traces.jsonl`. - Calls Trackio only when the optional package is installed and enabled. ### `ui/agent_tab.py` Local non-autonomous agent mode. - Drafts a research-plan-implement-verify trace. - Saves agent traces to `data/agent_traces.jsonl`. - Exports trace JSONL and local HF Dataset-style trace files. - Does not execute shell commands, commit, push, deploy, download models, or call external services. ### `ui/status_tab.py` Shows configured models and backend metadata. - Helps verify model-size compliance and backend status. - Provides local llama.cpp settings, GGUF/mmproj file pickers, and command generation. - Provides LM Studio/OpenAI-compatible base URL, optional model-name storage, and reachability check. - Provides SGLang command planning, health check, and shutdown request controls. ### `datasets/field_notes.py` Field note data model and CSV store. - `FieldNote` captures prompt, response, correction, tags, and timestamp. - `FieldNote` also captures optional image/video paths and a training inclusion flag. - `FieldNoteStore.save()` persists notes to `data/field_notes.csv`. - `FieldNoteStore.list_notes()` filters by correction, tag, and training inclusion. - `FieldNoteStore.export_jsonl()` writes training-ready JSONL. - `FieldNoteStore.export_hf_dataset()` writes local HF Dataset-style files. - `SQLiteFieldNoteStore` stores and lists notes in SQLite for larger correction loops. ### `datasets/loader.py` Dataset preview and statistics helpers. - `preview_local_dataset()` previews CSV, JSONL, and NDJSON files. - `dataset_statistics()` reports row count, column count, names, and non-empty counts. - `preview_huggingface_dataset()` optionally uses the external Hugging Face `datasets` package. ### `datasets/synthetic.py` Deterministic local synthetic data helpers. - `generate_synthetic_examples()` creates local prompt/response/correction examples. - `validate_synthetic_example()` checks schema requirements. - `quality_filter_examples()` removes incomplete or low-value examples. - `augment_examples()` creates deterministic variants for workflow testing. - `export_synthetic_jsonl()` writes JSONL without external services. ### `datasets/ocr.py` Local OCR correction helpers. - `OCRPrediction` stores source path, predicted text, confidence, and optional page. - `load_ocr_predictions()` loads local `.csv`, `.jsonl`, and `.ndjson` prediction files. - `uncertain_predictions()` filters rows at or below a confidence threshold or with empty text. - `import_uncertain_predictions()` creates Field Notes correction tasks for uncertain rows. - `export_corrected_ocr_notes()` writes corrected OCR examples to JSONL for evaluation or training. - `ocr_import_summary()` previews uncertain rows for the Field Notes tab. ### `mcp_tools/tools.py` Local MCP-style tools. - `dataset_stats_tool()` returns local dataset statistics. - `hf_dataset_preview_tool()` previews Hugging Face datasets when optional dependencies exist. - `safe_calculator_tool()` evaluates numeric arithmetic only. - `model_inference_tool()` routes text prompts through the selected model service. - `tool_registry()` returns the local tool map for a future MCP endpoint. ### `mcp_tools/vindex_tool.py` Non-executing VINDEX integration boundary. - Defines the eight VINDEX PRD methods and their local FastAPI paths. - `build_vindex_call_plan()` validates method names and builds endpoint/payload plans. - Caps `star_spread.n_neighbors` at 5 and `calibrated_edit.causal_window` at 3 based on the PRD safety notes. - `vindex_dependency_report()` checks whether the optional `vindex` package or local health endpoint is available. - `vindex_verification_report()` combines dependency status with a safe call plan and keeps execution disabled until the local VINDEX install is verified. ### `mcp_tools/bridge.py` Gradio-native MCP bridge metadata and local invocation helper. - `MCP_PATH` documents `/gradio_api/mcp/sse`. - `mcp_manifest()` returns the selected mode, path, and tool definitions. - `invoke_mcp_tool()` verifies local tool invocation by name. ### `agent/runner.py` Deterministic local agent trace runner. - `AGENT_SYSTEM_PROMPT` defines the agent behavior contract. - `run_agent_loop()` produces research, plan, implement, and verify trace steps. - `run_paper_to_code_loop()` produces paper-to-code research, plan, implement, and verify trace steps. - `default_safety_gates()` lists the non-autonomous safety requirements. - `save_agent_trace()` appends traces to JSONL. - `export_agent_traces()` exports trace JSONL. - `export_agent_traces_hf_dataset()` writes local HF Dataset-style trace files. - The runner can call safe local tools, but it is not autonomous. ### `core/file_exports.py` Shared export helper. - `copy_text_file_or_empty()` copies a text artifact to an export path or creates an empty one. ### `training/export.py` Non-executing GGUF export planning. - `detect_llama_cpp_tools()` checks `llama-server`, `llama-cli`, and `llama-quantize`. - `build_export_plan()` creates explicit download, conversion, and quantization command plans. - `list_exported_files()` lists generated/local export files. - `ExportPlan.as_dict()` marks that commands are not executed and no startup downloads occur. ### `training/evaluation.py` Local deterministic evaluation helpers. - `default_prompt_cases()` returns a small built-in prompt test set. - `load_prompt_cases()` loads prompt/expected pairs from JSONL. - `evaluate_responses()` computes exact-match rows and a qualitative table. - `perplexity_from_losses()` computes perplexity from explicit negative log likelihood values. - `compare_base_vs_tuned()` reports exact-match delta. - `log_eval_report()` appends JSONL evaluation results. ### `training/lora_trainer.py` Non-executing LoRA trainer request builder. - `lora_dependency_report()` reports PEFT, TRL, Transformers, and Torch availability. - `build_lora_training_request()` combines the training plan with dependency status and a command preview. - `vision_finetuning_plan()` documents SWIFT/LLaMA-Factory as the future MiniCPM-V fine-tuning path. - Keeps `execute_training` false until dependencies, hardware, and dataset schema are approved. ### `training/reward_eval.py` Deterministic local reward-style evaluation helpers. - `RewardEvaluator.evaluate()` scores supplied responses with transparent lexical heuristics. - `best_of_n()` selects the highest-scoring candidate without model calls. - `create_dpo_pairs()` creates chosen/rejected pairs for DPO-style datasets. - `eval_lora_vs_base()` compares base and LoRA response rewards. ### `training/planner.py` Non-executing LoRA training planner. - `load_training_config()` reads LoRA and training settings from `config/training.yaml`. - `build_training_plan()` creates a dry-run plan with checkpoint output path. - `validate_training_plan()` checks dataset existence and numeric training settings. - `training_hardware_notes()` documents practical local hardware expectations. ### `tracking/trackio_client.py` Tracking client with JSONL fallback. - `load_tracking_config()` reads Trackio settings from `config/training.yaml`. - `TrackingClient.init()` starts Trackio only when enabled and installed. - `TrackingClient.log()` always writes local JSONL and optionally forwards to Trackio. - `TrackingClient.finish()` closes optional Trackio state. - `export_traces()` copies local traces to `exports/traces.jsonl`. - `read_trace_rows()` returns recent local trace rows for the UI. ### `core/events.py` Small event bus reserved for future cross-module events. - `EventType` names app events. - `UI_ERROR` records visible tab-level failures. - `Event` carries event data. - `EventBus` registers handlers and emits events. ### `core/app_state.py` Shared local app state. - `AppState.emit()` records events, logs them, and dispatches them through `EventBus`. - `AppState.emit()` also writes trace events through `TrackingClient`. - `AppState.recent_events()` returns local trace previews for the Traces tab. - `emit_inference_response()` records shared response metadata. ### `core/tab_feedback.py` Formats tab status text and emits `ui_error` events for visible tab-level failures. ### `ui/progress.py` Defines the shared Gradio progress mode used by tab button callbacks. ### `core/app_logging.py` Lightweight logging setup. - `configure_app_logging()` configures compact process logging once. ### `core/registry.py` Generic registry helper. - `Registry.register(name, item)` stores a service. - `Registry.get(name)` retrieves a service. - `Registry.list()` lists registered services. ## Current Design Rule The app must not download model weights on startup. Model loading should happen only after the user chooses a backend/model and clicks an explicit action.