lesson-agent-dev / README.md
MSG
Feat/last sprintos (#23)
28543d3
|
Raw
History Blame Contribute Delete
10.5 kB
---
title: Lesson Agent
emoji: πŸ“š
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: "6.16.0"
app_file: app.py
python_version: "3.12"
pinned: false
license: apache-2.0
tags:
- track:backyard
- track:wood
- sponsor:openbmb
- sponsor:openai
- sponsor:nvidia
- sponsor:modal
- achievement:offgrid
- achievement:welltuned
- achievement:offbrand
- achievement:llama
- achievement:sharing
- achievement:fieldnotes
- build-small-hackathon
- backyard-ai
- modal
- tiny-titan
- best-agent
- best-demo
- openbmb
- sharing-is-caring
- off-the-grid
- off-brand
- field-notes
- well-tuned
- llama-champion
---
# Lesson Agent
**Backyard AI** Gradio Space for the [Build Small Hackathon](https://huggingface.co/build-small-hackathon).
A local skill-based agent helps a teacher you know turn a **topic + grade level** into a downloadable **PowerPoint** β€” powered by a small transformers model (`MiniCPM5-1B` by default), no cloud LLM API.
See **[USAGE.md](USAGE.md)** for local run, Gradio SDK / ZeroGPU Space deployment, and Docker (later).
**Demo video:** [https://www.youtube.com/watch?v=bwtOiZvJ-7k](https://www.youtube.com/watch?v=bwtOiZvJ-7k)
**Blog post:** [Small Models, Bounded Jobs](https://huggingface.co/blog/build-small-hackathon/lessonagent-opennotebook) β€” Hugging Face Build Small Hackathon write-up
**X post:** [https://x.com/MSG_Encrypted/status/2066570320861921748](https://x.com/MSG_Encrypted/status/2066570320861921748)
**Github:** [https://github.com/MSghais/small-model-hackathon/](https://github.com/MSghais/small-model-hackathon/)
## Prerequisites
- [uv](https://docs.astral.sh/uv/)
- Python 3.12
## Quick start
```bash
uv sync --all-packages
cp .env.example .env # optional: edit model settings
# Run Gradio locally
uv run --package gradio-space python -m gradio_space.app
```
Open [http://localhost:7860](http://localhost:7860).
- **Lesson slides** β€” topic, grade, slide count β†’ downloadable PowerPoint
- **Research Agent** β€” scrape/index sources into MemRAG, then ask questions offline with citations
### Studio UI (Off Brand track)
The default landing page is a **custom AI Studio workspace** at `/` β€” not default Gradio chrome. It uses **Gradio 6 Server mode** (`gradio.Server`): Material 3 layout, sidebar + workspace (Research β†’ Slides β†’ Language lessons), and `@server.api` endpoints wired to the same Python backends as Classic.
- **`/`** β€” Studio UI (ingest sources, generate slides, **Language lessons** multilingual coach)
- **`/classic`** β€” full Gradio Blocks app (TeacherVoice, EchoCoach pitch analysis, settings, Chat debug)
See [apps/gradio-space/README.md](apps/gradio-space/README.md) for API names and a 2-minute judge demo script.
### Modal + Fine-tuning track (Well-Tuned)
Cloud GPU **train β†’ eval β†’ gate β†’ publish** for a skill-matrix of QLoRA adapters on `openbmb/MiniCPM5-1B` β€” no local CUDA required. Each job in [`research/modal/experiments.yaml`](research/modal/experiments.yaml) (math, science, coding, reasoning, teaching, …) fine-tunes with [`research/finetune.py`](research/finetune.py), benchmarks with `slm-lm-eval`, gates on per-skill `goals`, and publishes passing adapters to the Hub.
- **Modal (partner track)** β€” `modal run` / warm GPU worker, Volume artifacts, optional [Modal Notebook](research/notebook/minicpm5-modal-finetune.ipynb)
- **Well-Tuned badge** β€” before/after lm-eval per skill + gated Hub publish (`MSGEncrypted/minicpm5-1b-<skill>-lora`)
Full runbook: [`research/modal/README.md`](research/modal/README.md) Β· agent loop: [`research/modal/SERVER.md`](research/modal/SERVER.md) Β· local research overview: [`research/USAGE.md`](research/USAGE.md)
```bash
uv sync --group modal
modal setup && modal secret create huggingface HF_TOKEN=<token>
modal run research/modal/server_app.py --ping # health check
modal run research/modal/server_app.py --job math-lora --max-steps 20 --no-publish # cheap smoke
modal run research/modal/server_app.py --pipeline # full sweep: baselines β†’ train β†’ eval β†’ gate β†’ publish
```
Pull a passing adapter into the Space: `modal volume get slm-finetune math-lora ./models/finetuned/minicpm5-1b-lora`, then set `ACTIVE_MODEL=minicpm5-1b-lesson-lora`.
### Llama track (Llama Champion + Off-the-Grid)
The same OpenBMB **MiniCPM-V 4.6** model runs on **llama.cpp** via the [`minicpm-v-4.6-gguf`](models.yaml) preset β€” GGUF weights from [`openbmb/MiniCPM-V-4.6-gguf`](https://huggingface.co/openbmb/MiniCPM-V-4.6-gguf) (~529 MB Q4_K_M). No cloud LLM API; inference stays fully local through [`libs/inference/src/inference/llama_cpp.py`](libs/inference/src/inference/llama_cpp.py).
| Preset | Backend | Use case |
| ------ | ------- | -------- |
| `minicpm-v-4.6` | transformers | Full VLM (image/video) via Hugging Face |
| `minicpm-v-4.6-gguf` | llama.cpp | **Llama Champion** badge; text on all tabs today |
**Space (judges):** pin the GGUF preset β€” no runtime switching for visitors.
```bash
ACTIVE_MODEL=minicpm-v-4.6-gguf
ALLOW_MODEL_SWITCH=false
```
**Local dev:** switch backends at runtime without restarting.
```bash
ALLOW_MODEL_SWITCH=true
ACTIVE_MODEL=minicpm-v-4.6 # transformers startup default
# Settings or Chat β†’ select minicpm-v-4.6-gguf for llama.cpp
```
Prefetch weights (optional):
```bash
uv run python scripts/download_model.py --preset minicpm-v-4.6-gguf
```
See [USAGE.md](USAGE.md) (section *Switching models locally*) for Classic and Studio UI details.
## How it works
1. **Skill** β€” `skills/education-pptx/SKILL.md` (Hermes / agentskills.io format)
2. **LLM** β€” local model drafts a JSON slide outline
3. **Tool** β€” `create_pptx` builds the file with `python-pptx`
4. **Trace** β€” JSON log saved under `outputs/traces/` for the Sharing is Caring badge
```text
apps/gradio-space/ # Gradio tabs (Lesson slides, Research Agent, Chat debug)
libs/agent/ # Skill agent runner, tools, trace recorder
libs/researchmind/ # Scraper, chunk/embed, MemRAG SQLite store, retrieval
libs/inference/ # Transformers + llama.cpp backends
skills/ # SKILL.md + references/ + scripts/ per task
research/ # Fine-tune and agentic evals (optional)
```
### ResearchMind (offline after ingest)
1. **Skills** β€” `skills/scrape-web`, `scrape-pdf`, `extract-content`, `research-mind`
2. **Ingest** β€” URL/PDF/DOCX or topic β†’ (optional LLM URL suggest + confirm, or auto search) β†’ chunk + embed β†’ SQLite
3. **Q&A** β€” local model + retrieved chunks with `[n]` citations (no network at chat time)
4. **Memory** β€” persists under `RESEARCHMIND_DATA_DIR` (default `outputs/researchmind`)
Optional research tooling (not required for the Space): see [research/USAGE.md](research/USAGE.md).
## Environment variables
| Variable | Default | Description |
| -------- | ------- | ----------- |
| `ACTIVE_MODEL` | `minicpm5-1b` | Preset key from `models.yaml` (use `minicpm-v-4.6-gguf` for Llama track) |
| `ALLOW_MODEL_SWITCH` | `false` | Set `true` locally to switch presets in Settings / Chat |
| `AGENT_OUTPUTS_DIR` | `/tmp/agent_outputs` | Generated `.pptx` files |
| `AGENT_TRACES_DIR` | `outputs/traces` | Agent trace JSON |
| `SKILLS_DIR` | `./skills` | Skill definitions root |
| `RESEARCHMIND_DATA_DIR` | `outputs/researchmind` | MemRAG DB and raw snapshots |
| `RESEARCHMIND_EMBED_MODEL` | `all-MiniLM-L6-v2` | Sentence embedding model |
| `RESEARCHMIND_AUTO_SEARCH` | `false` | Default auto DuckDuckGo ingest |
See [`.env.example`](.env.example) and [`models.yaml`](models.yaml) for model presets.
## Hugging Face Space deployment
1. Create a Space under [build-small-hackathon](https://huggingface.co/build-small-hackathon) with **Gradio** SDK (Blank template).
2. Link this repository β€” HF builds from root `app.py` + `requirements.txt` (README YAML above).
3. Hardware: **ZeroGPU** for burst GPU inference, or **GPU basic** for always-on GPU.
4. Set `ACTIVE_MODEL=minicpm5-1b` (or `minicpm-v-4.6-gguf` for [Llama track](#llama-track-llama-champion--off-the-grid)), `ALLOW_MODEL_SWITCH=false`, `RESEARCHMIND_DATA_DIR=/tmp/researchmind`.
A root `Dockerfile` is kept for a later **Docker SDK** deploy (flip README to `sdk: docker`). See [USAGE.md](USAGE.md).
## Hackathon tracks & checklist
| Track | What we ship |
| ----- | ------------ |
| **Backyard AI** (primary) | Lesson slide builder for a teacher you know β€” topic + grade β†’ downloadable `.pptx` |
| **Off Brand** | Custom Studio UI at `/` (Gradio 6 Server mode, not default Gradio chrome) |
| **Modal** (partner) | GPU `train β†’ eval β†’ gate β†’ publish` on [Modal](https://modal.com) β€” [`research/modal/`](research/modal/) |
| **Well-Tuned** (finetuning) | Skill-matrix QLoRA adapters on MiniCPM5-1B, lm-eval gates, Hub publish |
| **Llama Champion** | `minicpm-v-4.6-gguf` on llama.cpp β€” same OpenBMB VLM family, local GGUF inference |
- Space live under build-small-hackathon
- Demo video: [YouTube](https://www.youtube.com/watch?v=bwtOiZvJ-7k) β€” real user enters topic β†’ download `.pptx` β†’ show agent trace
- Blog post: [Small Models, Bounded Jobs](https://huggingface.co/blog/build-small-hackathon/lessonagent-opennotebook)
- Social post published: [X](https://x.com/MSG_Encrypted/status/2066570320861921748)
- Submission by **June 15, 2026**
### Badge targets
- **Best Agent** β€” skill loop + `create_pptx` tool
- **Tiny Titan** β€” MiniCPM5 1B (≀4B)
- **OpenBMB** β€” `openbmb/MiniCPM5-1B`
- **Sharing is Caring** β€” upload traces with `scripts/upload_trace.py`
- **Off-the-Grid** β€” local inference only (no cloud LLM API)
- **Llama Champion** β€” llama.cpp backend with [`openbmb/MiniCPM-V-4.6-gguf`](https://huggingface.co/openbmb/MiniCPM-V-4.6-gguf); see [Llama track](#llama-track-llama-champion--off-the-grid)
- **Well-Tuned** β€” per-skill QLoRA adapters trained + gated + published via the [Modal + Fine-tuning track](#modal--fine-tuning-track-well-tuned)
- **Modal** β€” same pipeline; see [`research/modal/README.md`](research/modal/README.md)
## Agent trace upload
```bash
uv run python scripts/upload_trace.py --repo-id YOUR_USER/build-small-agent-traces
```
## Demo video script
1. Introduce the teacher and the problem (building a 5-slide lesson takes 30+ minutes).
2. Open **Lesson slides**, enter topic + grade, click **Generate**.
3. Show outline preview and download the `.pptx`.
4. Expand the agent trace JSON β€” local model, no cloud API.