Spaces:

build-small-hackathon
/

workbench

Running on Zero

App Files Files Community

workbench / docs /ARCHITECTURE.md

GitHub Actions

Initial ZeroGPU deployment with spaces shim

7f9dfed 15 days ago

preview code

Raw

History Blame Contribute Delete

21.7 kB

	# Architecture

	The project is intentionally small at first. The PRD describes a large workbench; this repo starts
	with the smallest version that can grow into it.

	## High-Level Flow

	```text
	app.py
	loads config/models.yaml
	configures lightweight logging
	builds Gradio tabs
	passes model catalog to UI modules

	ui/*
	defines each Gradio tab
	calls service classes
	emits local app events for inference, datasets, and field notes
	uses shared progress settings for callback loading indicators

	agent/*
	holds deterministic local agent planning and trace export helpers

	models/*
	holds model catalog, local backend config, and inference services

	datasets/*
	stores dataset, synthetic data, and correction-loop helpers

	mcp_tools/*
	holds local tool functions, VINDEX call planning, and Gradio-native MCP bridge metadata

	config/*
	holds model and training settings

	training/*
	holds non-executing training, LoRA request, evaluation, and export planning helpers

	tracking/*
	holds local JSONL tracing and optional Trackio integration

	deployment/*
	holds Hugging Face Space deployment planning and validation helpers

	plant/*
	holds the first reference domain app built from the template
	can run standalone with python -m plant.app --no-model
	keeps heavy model dependencies optional

	core/*
	shared app state, event, logging, and registry helpers
	```

	## Files And Classes

	### `app.py`

	Builds and launches the Gradio app.

	- `build_app()` creates the Gradio `Blocks` app.
	- Loads the model catalog from `config/models.yaml`.
	- Registers the current UI tabs.
	- `APP_CSS` defines compact responsive layout rules for app width, mobile padding, scrollable tabs,
	and button touch targets.

	### `plant/app.py`

	Standalone Plant Discovery reference app built from the template.

	- `build_app(no_model=True)` creates a Gradio app without loading model weights.
	- Loads `plant/models.yaml`.
	- Builds a local species index.
	- Reuses `datasets.field_notes.FieldNoteStore` for corrections.
	- Uses `DemoPlantVisionService` for screenshots/tests or `PlantVisionService` for OpenBMB
	MiniCPM-V zero-shot and fine-tuned adapter inference.

	### `plant/plant_service.py`

	Domain service and schema for Plant Discovery.

	- `PlantID` is the structured output schema.
	- `DemoPlantVisionService` provides deterministic no-model results.
	- `PlantVisionService` lazy-loads optional MiniCPM-V dependencies only during identification.
	- `PlantVisionService.from_config(..., "plant_vlm_finetuned")` can load a PEFT adapter after a real
	adapter repo is configured.
	- `extract_json_object()` and `parse_plant_response()` make model JSON output testable.

	### `plant/training.py`

	Non-executing training planner for Plant Discovery.

	- `build_plant_training_plan()` returns SWIFT and LLaMA-Factory command previews.
	- `plant_training_dependency_report()` reports optional training dependency availability.
	- `write_llamafactory_dataset_info()` writes a dataset-info preview for LLaMA-Factory workflows.
	- Training is never started by the Gradio UI or script.

	### `plant/plant_loader.py`

	Domain data and export helpers for Plant Discovery.

	- `PlantRecord` normalizes plant examples into training rows.
	- `LocalFolderLoader` maps species folders to image metadata.
	- `SpeciesIndexBuilder` builds a no-network species index with demo fallback.
	- `FieldNotesPlantExporter` exports corrected field notes to plant training JSONL.

	### `plant/plant_tab.py`

	Focused Gradio UI for Plant Discovery.

	- Identify tab uploads images and renders a safe escaped result card.
	- Field Guide tab searches the species index.
	- Corrections tab saves and exports training-ready corrections.
	- Stats tab summarizes species and correction counts.
	- Training is represented as a non-executing plan, not a subprocess.

	### `plant/plant_tools.py`

	Optional local/MCP tools for Plant Discovery.

	- Pure functions can be tested without an MCP server.
	- `build_mcp_server()` imports `mcp` only when explicitly requested.
	- Tools expose identify, species search, correction save/export, stats, and training plan.

	### `models/model_catalog.py`

	Reads model configuration and turns it into typed Python objects.

	- `ModelInfo` describes one configured model.
	- `load_model_catalog(path)` reads YAML and returns all configured models.
	- `model_choices(catalog, model_type)` filters models for a UI dropdown.
	- `model_summary(model)` returns display metadata for the Gradio JSON panel.
	- `backend_capabilities` maps each model to supported local backend capabilities.

	### `models/placeholder_service.py`

	Deterministic placeholder model service used before real inference is wired.

	- `PlaceholderModelService.chat()` returns a deterministic text response.
	- `PlaceholderModelService.vision_chat()` returns a deterministic image/prompt response.

	This file should be replaced or complemented by real services such as:

	- `ollama_service.py`
	- `llama_cpp_service.py`
	- `openai_compatible_service.py`
	- `sglang_runner.py`
	- `minicpm_vision.py`
	- `transformers_text.py`
	- `sglang_service.py`

	### `models/base.py`

	Defines service contracts and backend status records.

	- `BackendStatus` describes whether a backend is available.
	- `TextModelService` is the text chat protocol.
	- `VisionModelService` is the vision chat protocol.

	### `models/ollama_service.py`

	Ollama-backed local inference client.

	- Checks whether `ollama` is installed and reachable.
	- Sends text and vision chat requests to `http://127.0.0.1:11434/api/chat`.
	- Lists locally available Ollama models through `/api/tags`.
	- Builds explicit `ollama pull <model>` commands for the Status tab.
	- Does not pull or download models automatically.

	### `models/llama_cpp_service.py`

	llama.cpp HTTP client for local GGUF inference.

	- Checks whether `llama-server` is installed and reachable.
	- Builds explicit `llama-server -m <model.gguf>` commands.
	- Supports `--mmproj <mmproj.gguf>` command metadata for multimodal models.
	- Sends text chat requests to `/v1/chat/completions`.
	- Does not download GGUF files or start background servers automatically.

	### `models/local_backend_config.py`

	User-local backend settings stored under ignored `data/local_backends.yaml`.

	- `LocalBackendConfig` stores llama.cpp server URL, OpenAI-compatible base URL, optional served
	model name, GGUF path, mmproj path, context length, and GPU layers.
	- `save_local_backend_config()` writes local-only settings without touching tracked model config.
	- `build_llama_server_command()` returns the explicit command the user can run.
	- `local_backend_summary()` reports file status and confirms no startup downloads or automatic model loads.

	### `models/openai_compatible_service.py`

	Local OpenAI-compatible chat client for LM Studio, vLLM-style servers, or similar local endpoints.

	- Checks `/v1/models` for reachability.
	- Sends text chat requests to `/v1/chat/completions`.
	- Supports an optional served-model-name override for tools such as LM Studio.
	- Returns visible unavailable/request-failed messages instead of crashing the Gradio callback.
	- Does not call cloud APIs or download model weights.

	### `models/llama_cpp_python_service.py`

	Optional direct Python binding backend for GGUF inference.

	- Checks whether `llama_cpp` is importable.
	- Requires an explicit local GGUF path.
	- Does not download model files.
	- Provides text chat through `Llama.create_chat_completion()`.
	- Vision support remains routed through llama-server until mmproj/image serialization is wired.

	### `models/minicpm_vision.py`

	Optional MiniCPM vision backend.

	- Checks whether the `transformers` package is available.
	- Lazy-loads `AutoProcessor` and `AutoModelForImageTextToText` only when selected.
	- Formats image/text messages for image-text-to-text generation.
	- Maps thinking mode into the prompt template.
	- Provides a video support plan for future local frame sampling.

	### `models/sglang_runner.py`

	SGLang local server planner and OpenAI-compatible chat client.

	- Builds an explicit `python -m sglang.launch_server` command.
	- Includes MiniCPM tool parser configuration.
	- Checks `/health`, sends chat requests to `/v1/chat/completions`, and can request `/shutdown`.
	- Does not install SGLang, start a process, download model weights, or load a model on app startup.

	### `models/vllm_runner.py`

	vLLM local server planner and OpenAI-compatible chat client.

	- Builds explicit `vllm serve <model>` command plans.
	- Checks `/health`, parses Prometheus-style `/metrics`, and sends chat requests to
	`/v1/chat/completions`.
	- Logs parsed benchmark metrics through `TrackingClient`.
	- Does not install vLLM, start a process, download model weights, or load a model on app startup.

	### `models/transformers_text.py`

	Optional Transformers text backend.

	- Checks whether the `transformers` package is installed.
	- Lazy-loads `AutoTokenizer` and `AutoModelForCausalLM` only when the backend is selected.
	- Reads `trust_remote_code`, device map, dtype, max token, and temperature settings from explicit config.
	- Provides a simple token-list streaming helper for future Gradio streaming wiring.
	- Does not download model weights on startup.

	### `models/service_factory.py`

	Creates the selected backend service for the UI.

	- `TEXT_SERVICE_REGISTRY` registers available text backend factories.
	- `VISION_SERVICE_REGISTRY` registers available vision backend factories.
	- `create_text_service()` chooses placeholder, llama.cpp, llama-cpp-python, Ollama,
	OpenAI-compatible, SGLang, or Transformers text service.
	- `create_vision_service()` chooses placeholder, llama.cpp, llama-cpp-python, Ollama, or
	Transformers MiniCPM vision service.
	- `backend_statuses()` reports current backend availability.
	- llama.cpp, llama-cpp-python, and OpenAI-compatible services read ignored local backend settings
	when selected.

	### `ui/chat_tab.py`

	Builds the text chat tab.

	- Shows text models from the catalog.
	- Displays selected model metadata.
	- Calls the selected backend service.
	- Emits inference request and response events.

	### `ui/vision_tab.py`

	Builds the vision tab.

	- Shows vision models from the catalog.
	- Accepts an image and prompt.
	- Calls the selected backend service.
	- Emits inference request and response events.

	### `ui/dataset_tab.py`

	Local dataset preview surface.

	- Previews local CSV, JSONL, and NDJSON files.
	- Previews Hugging Face datasets when the optional external `datasets` package is installed.
	- Shows source, row count, columns, and sample rows.
	- Calculates basic local dataset statistics.
	- Emits dataset loaded events.

	Future behavior:

	- Serve dataset tools through the selected MCP path.

	### `ui/train_tab.py`

	Training planning and local evaluation surface.

	- Builds a LoRA dry-run training plan without launching training.
	- Builds a non-executing LoRA trainer request with dependency status.
	- Shows SWIFT/LLaMA-Factory vision fine-tuning plan.
	- Shows checkpoint output path, validation status, and hardware notes.
	- Runs local base-vs-tuned evaluation from newline-separated response text.
	- Shows exact-match summary and a qualitative eval table.
	- Logs tuned evaluation reports to `data/eval_results.jsonl`.

	Future behavior:

	- Start LoRA training.
	- Show loss and metrics.
	- Write Trackio traces.

	### `ui/vllm_tab.py`

	vLLM local serving planner.

	- Builds explicit `vllm serve` command plans.
	- Checks local vLLM `/health`.
	- Fetches and parses `/metrics`.
	- Logs vLLM benchmark metrics through local JSONL/Trackio fallback tracking.
	- Does not install vLLM, start a process, download models, or load weights on startup.

	### `ui/export_tab.py`

	GGUF export planning surface.

	- Selects a configured model and quantization.
	- Shows official GGUF download command plans when the model has GGUF metadata.
	- Shows local HF-to-GGUF conversion and llama.cpp quantization command plans.
	- Lists files already present under the selected export directory.
	- Exposes existing exported files through a Gradio download output.
	- Does not execute downloads, conversion, or quantization.

	Future behavior:

	- Execute downloads and conversions after explicit user action.

	### `ui/notes_tab.py`

	Field notes implementation.

	- Saves prompt, model response, correction, and tags to `data/field_notes.csv`.
	- Captures optional image path, video path, and a use-for-training flag.
	- Exports corrected notes to JSONL.
	- Exports local Hugging Face Dataset-style files under `data/hf_field_notes`.
	- Imports uncertain OCR predictions for human correction.
	- Exports corrected OCR rows to JSONL.
	- Emits field note saved events.

	Future behavior:

	- Push corrected notes to a remote Hugging Face Dataset after login.
	- Feed notes into fine-tuning.

	### `ui/traces_tab.py`

	Local trace and tracking preview.

	- Shows manual trace event previews.
	- Shows recent local app events.
	- Shows JSONL trace rows and tracking status.
	- Exports local traces to `exports/traces.jsonl`.
	- Calls Trackio only when the optional package is installed and enabled.

	### `ui/agent_tab.py`

	Local non-autonomous agent mode.

	- Drafts a research-plan-implement-verify trace.
	- Saves agent traces to `data/agent_traces.jsonl`.
	- Exports trace JSONL and local HF Dataset-style trace files.
	- Does not execute shell commands, commit, push, deploy, download models, or call external services.

	### `ui/status_tab.py`

	Shows configured models and backend metadata.

	- Helps verify model-size compliance and backend status.
	- Provides local llama.cpp settings, GGUF/mmproj file pickers, and command generation.
	- Provides LM Studio/OpenAI-compatible base URL, optional model-name storage, and reachability check.
	- Provides SGLang command planning, health check, and shutdown request controls.

	### `datasets/field_notes.py`

	Field note data model and CSV store.

	- `FieldNote` captures prompt, response, correction, tags, and timestamp.
	- `FieldNote` also captures optional image/video paths and a training inclusion flag.
	- `FieldNoteStore.save()` persists notes to `data/field_notes.csv`.
	- `FieldNoteStore.list_notes()` filters by correction, tag, and training inclusion.
	- `FieldNoteStore.export_jsonl()` writes training-ready JSONL.
	- `FieldNoteStore.export_hf_dataset()` writes local HF Dataset-style files.
	- `SQLiteFieldNoteStore` stores and lists notes in SQLite for larger correction loops.

	### `datasets/loader.py`

	Dataset preview and statistics helpers.

	- `preview_local_dataset()` previews CSV, JSONL, and NDJSON files.
	- `dataset_statistics()` reports row count, column count, names, and non-empty counts.
	- `preview_huggingface_dataset()` optionally uses the external Hugging Face `datasets` package.

	### `datasets/synthetic.py`

	Deterministic local synthetic data helpers.

	- `generate_synthetic_examples()` creates local prompt/response/correction examples.
	- `validate_synthetic_example()` checks schema requirements.
	- `quality_filter_examples()` removes incomplete or low-value examples.
	- `augment_examples()` creates deterministic variants for workflow testing.
	- `export_synthetic_jsonl()` writes JSONL without external services.

	### `datasets/ocr.py`

	Local OCR correction helpers.

	- `OCRPrediction` stores source path, predicted text, confidence, and optional page.
	- `load_ocr_predictions()` loads local `.csv`, `.jsonl`, and `.ndjson` prediction files.
	- `uncertain_predictions()` filters rows at or below a confidence threshold or with empty text.
	- `import_uncertain_predictions()` creates Field Notes correction tasks for uncertain rows.
	- `export_corrected_ocr_notes()` writes corrected OCR examples to JSONL for evaluation or training.
	- `ocr_import_summary()` previews uncertain rows for the Field Notes tab.

	### `mcp_tools/tools.py`

	Local MCP-style tools.

	- `dataset_stats_tool()` returns local dataset statistics.
	- `hf_dataset_preview_tool()` previews Hugging Face datasets when optional dependencies exist.
	- `safe_calculator_tool()` evaluates numeric arithmetic only.
	- `model_inference_tool()` routes text prompts through the selected model service.
	- `tool_registry()` returns the local tool map for a future MCP endpoint.

	### `mcp_tools/vindex_tool.py`

	Non-executing VINDEX integration boundary.

	- Defines the eight VINDEX PRD methods and their local FastAPI paths.
	- `build_vindex_call_plan()` validates method names and builds endpoint/payload plans.
	- Caps `star_spread.n_neighbors` at 5 and `calibrated_edit.causal_window` at 3 based on the PRD
	safety notes.
	- `vindex_dependency_report()` checks whether the optional `vindex` package or local health
	endpoint is available.
	- `vindex_verification_report()` combines dependency status with a safe call plan and keeps
	execution disabled until the local VINDEX install is verified.

	### `mcp_tools/bridge.py`

	Gradio-native MCP bridge metadata and local invocation helper.

	- `MCP_PATH` documents `/gradio_api/mcp/sse`.
	- `mcp_manifest()` returns the selected mode, path, and tool definitions.
	- `invoke_mcp_tool()` verifies local tool invocation by name.

	### `agent/runner.py`

	Deterministic local agent trace runner.

	- `AGENT_SYSTEM_PROMPT` defines the agent behavior contract.
	- `run_agent_loop()` produces research, plan, implement, and verify trace steps.
	- `run_paper_to_code_loop()` produces paper-to-code research, plan, implement, and verify trace steps.
	- `default_safety_gates()` lists the non-autonomous safety requirements.
	- `save_agent_trace()` appends traces to JSONL.
	- `export_agent_traces()` exports trace JSONL.
	- `export_agent_traces_hf_dataset()` writes local HF Dataset-style trace files.
	- The runner can call safe local tools, but it is not autonomous.

	### `core/file_exports.py`

	Shared export helper.

	- `copy_text_file_or_empty()` copies a text artifact to an export path or creates an empty one.

	### `training/export.py`

	Non-executing GGUF export planning.

	- `detect_llama_cpp_tools()` checks `llama-server`, `llama-cli`, and `llama-quantize`.
	- `build_export_plan()` creates explicit download, conversion, and quantization command plans.
	- `list_exported_files()` lists generated/local export files.
	- `ExportPlan.as_dict()` marks that commands are not executed and no startup downloads occur.

	### `training/evaluation.py`

	Local deterministic evaluation helpers.

	- `default_prompt_cases()` returns a small built-in prompt test set.
	- `load_prompt_cases()` loads prompt/expected pairs from JSONL.
	- `evaluate_responses()` computes exact-match rows and a qualitative table.
	- `perplexity_from_losses()` computes perplexity from explicit negative log likelihood values.
	- `compare_base_vs_tuned()` reports exact-match delta.
	- `log_eval_report()` appends JSONL evaluation results.

	### `training/lora_trainer.py`

	Non-executing LoRA trainer request builder.

	- `lora_dependency_report()` reports PEFT, TRL, Transformers, and Torch availability.
	- `build_lora_training_request()` combines the training plan with dependency status and a command
	preview.
	- `vision_finetuning_plan()` documents SWIFT/LLaMA-Factory as the future MiniCPM-V fine-tuning path.
	- Keeps `execute_training` false until dependencies, hardware, and dataset schema are approved.

	### `training/reward_eval.py`

	Deterministic local reward-style evaluation helpers.

	- `RewardEvaluator.evaluate()` scores supplied responses with transparent lexical heuristics.
	- `best_of_n()` selects the highest-scoring candidate without model calls.
	- `create_dpo_pairs()` creates chosen/rejected pairs for DPO-style datasets.
	- `eval_lora_vs_base()` compares base and LoRA response rewards.

	### `training/planner.py`

	Non-executing LoRA training planner.

	- `load_training_config()` reads LoRA and training settings from `config/training.yaml`.
	- `build_training_plan()` creates a dry-run plan with checkpoint output path.
	- `validate_training_plan()` checks dataset existence and numeric training settings.
	- `training_hardware_notes()` documents practical local hardware expectations.

	### `tracking/trackio_client.py`

	Tracking client with JSONL fallback.

	- `load_tracking_config()` reads Trackio settings from `config/training.yaml`.
	- `TrackingClient.init()` starts Trackio only when enabled and installed.
	- `TrackingClient.log()` always writes local JSONL and optionally forwards to Trackio.
	- `TrackingClient.finish()` closes optional Trackio state.
	- `export_traces()` copies local traces to `exports/traces.jsonl`.
	- `read_trace_rows()` returns recent local trace rows for the UI.

	### `core/events.py`

	Small event bus reserved for future cross-module events.

	- `EventType` names app events.
	- `UI_ERROR` records visible tab-level failures.
	- `Event` carries event data.
	- `EventBus` registers handlers and emits events.

	### `core/app_state.py`

	Shared local app state.

	- `AppState.emit()` records events, logs them, and dispatches them through `EventBus`.
	- `AppState.emit()` also writes trace events through `TrackingClient`.
	- `AppState.recent_events()` returns local trace previews for the Traces tab.
	- `emit_inference_response()` records shared response metadata.

	### `core/tab_feedback.py`

	Formats tab status text and emits `ui_error` events for visible tab-level failures.

	### `ui/progress.py`

	Defines the shared Gradio progress mode used by tab button callbacks.

	### `core/app_logging.py`

	Lightweight logging setup.

	- `configure_app_logging()` configures compact process logging once.

	### `core/registry.py`

	Generic registry helper.

	- `Registry.register(name, item)` stores a service.
	- `Registry.get(name)` retrieves a service.
	- `Registry.list()` lists registered services.

	## Current Design Rule

	The app must not download model weights on startup. Model loading should happen only after the
	user chooses a backend/model and clicks an explicit action.