Spaces:

MSGEncrypted
/

lesson-agent-dev

Sleeping

App Files Files Community

lesson-agent-dev / README.md

MSG

Feat/last sprintos (#23)

28543d3 10 days ago

preview code

Raw

History Blame Contribute Delete

10.5 kB

	---
	title: Lesson Agent
	emoji: 📚
	colorFrom: blue
	colorTo: green
	sdk: gradio
	sdk_version: "6.16.0"
	app_file: app.py
	python_version: "3.12"
	pinned: false
	license: apache-2.0
	tags:
	- track:backyard
	- track:wood
	- sponsor:openbmb
	- sponsor:openai
	- sponsor:nvidia
	- sponsor:modal
	- achievement:offgrid
	- achievement:welltuned
	- achievement:offbrand
	- achievement:llama
	- achievement:sharing
	- achievement:fieldnotes
	- build-small-hackathon
	- backyard-ai
	- modal
	- tiny-titan
	- best-agent
	- best-demo
	- openbmb
	- sharing-is-caring
	- off-the-grid
	- off-brand
	- field-notes
	- well-tuned
	- llama-champion

	---



	# Lesson Agent

	Backyard AI Gradio Space for the [Build Small Hackathon](https://huggingface.co/build-small-hackathon).

	A local skill-based agent helps a teacher you know turn a topic + grade level into a downloadable PowerPoint — powered by a small transformers model (`MiniCPM5-1B` by default), no cloud LLM API.

	See [USAGE.md](USAGE.md) for local run, Gradio SDK / ZeroGPU Space deployment, and Docker (later).

	Demo video: [https://www.youtube.com/watch?v=bwtOiZvJ-7k](https://www.youtube.com/watch?v=bwtOiZvJ-7k)

	Blog post: [Small Models, Bounded Jobs](https://huggingface.co/blog/build-small-hackathon/lessonagent-opennotebook) — Hugging Face Build Small Hackathon write-up

	X post: [https://x.com/MSG_Encrypted/status/2066570320861921748](https://x.com/MSG_Encrypted/status/2066570320861921748)


	Github: [https://github.com/MSghais/small-model-hackathon/](https://github.com/MSghais/small-model-hackathon/)

	## Prerequisites

	- [uv](https://docs.astral.sh/uv/)
	- Python 3.12

	## Quick start

	```bash
	uv sync --all-packages
	cp .env.example .env # optional: edit model settings

	# Run Gradio locally
	uv run --package gradio-space python -m gradio_space.app
	```

	Open [http://localhost:7860](http://localhost:7860).

	- Lesson slides — topic, grade, slide count → downloadable PowerPoint
	- Research Agent — scrape/index sources into MemRAG, then ask questions offline with citations

	### Studio UI (Off Brand track)

	The default landing page is a custom AI Studio workspace at `/` — not default Gradio chrome. It uses Gradio 6 Server mode (`gradio.Server`): Material 3 layout, sidebar + workspace (Research → Slides → Language lessons), and `@server.api` endpoints wired to the same Python backends as Classic.

	- `/` — Studio UI (ingest sources, generate slides, Language lessons multilingual coach)
	- `/classic` — full Gradio Blocks app (TeacherVoice, EchoCoach pitch analysis, settings, Chat debug)

	See [apps/gradio-space/README.md](apps/gradio-space/README.md) for API names and a 2-minute judge demo script.

	### Modal + Fine-tuning track (Well-Tuned)

	Cloud GPU train → eval → gate → publish for a skill-matrix of QLoRA adapters on `openbmb/MiniCPM5-1B` — no local CUDA required. Each job in [`research/modal/experiments.yaml`](research/modal/experiments.yaml) (math, science, coding, reasoning, teaching, …) fine-tunes with [`research/finetune.py`](research/finetune.py), benchmarks with `slm-lm-eval`, gates on per-skill `goals`, and publishes passing adapters to the Hub.

	- Modal (partner track) — `modal run` / warm GPU worker, Volume artifacts, optional [Modal Notebook](research/notebook/minicpm5-modal-finetune.ipynb)
	- Well-Tuned badge — before/after lm-eval per skill + gated Hub publish (`MSGEncrypted/minicpm5-1b-<skill>-lora`)

	Full runbook: [`research/modal/README.md`](research/modal/README.md) · agent loop: [`research/modal/SERVER.md`](research/modal/SERVER.md) · local research overview: [`research/USAGE.md`](research/USAGE.md)

	```bash
	uv sync --group modal
	modal setup && modal secret create huggingface HF_TOKEN=<token>

	modal run research/modal/server_app.py --ping # health check
	modal run research/modal/server_app.py --job math-lora --max-steps 20 --no-publish # cheap smoke
	modal run research/modal/server_app.py --pipeline # full sweep: baselines → train → eval → gate → publish
	```

	Pull a passing adapter into the Space: `modal volume get slm-finetune math-lora ./models/finetuned/minicpm5-1b-lora`, then set `ACTIVE_MODEL=minicpm5-1b-lesson-lora`.

	### Llama track (Llama Champion + Off-the-Grid)

	The same OpenBMB MiniCPM-V 4.6 model runs on llama.cpp via the [`minicpm-v-4.6-gguf`](models.yaml) preset — GGUF weights from [`openbmb/MiniCPM-V-4.6-gguf`](https://huggingface.co/openbmb/MiniCPM-V-4.6-gguf) (~529 MB Q4_K_M). No cloud LLM API; inference stays fully local through [`libs/inference/src/inference/llama_cpp.py`](libs/inference/src/inference/llama_cpp.py).

	\| Preset \| Backend \| Use case \|
	\| ------ \| ------- \| -------- \|
	\| `minicpm-v-4.6` \| transformers \| Full VLM (image/video) via Hugging Face \|
	\| `minicpm-v-4.6-gguf` \| llama.cpp \| Llama Champion badge; text on all tabs today \|

	Space (judges): pin the GGUF preset — no runtime switching for visitors.

	```bash
	ACTIVE_MODEL=minicpm-v-4.6-gguf
	ALLOW_MODEL_SWITCH=false
	```

	Local dev: switch backends at runtime without restarting.

	```bash
	ALLOW_MODEL_SWITCH=true
	ACTIVE_MODEL=minicpm-v-4.6 # transformers startup default
	# Settings or Chat → select minicpm-v-4.6-gguf for llama.cpp
	```

	Prefetch weights (optional):

	```bash
	uv run python scripts/download_model.py --preset minicpm-v-4.6-gguf
	```

	See [USAGE.md](USAGE.md) (section Switching models locally) for Classic and Studio UI details.

	## How it works

	1. Skill — `skills/education-pptx/SKILL.md` (Hermes / agentskills.io format)
	2. LLM — local model drafts a JSON slide outline
	3. Tool — `create_pptx` builds the file with `python-pptx`
	4. Trace — JSON log saved under `outputs/traces/` for the Sharing is Caring badge

	```text
	apps/gradio-space/ # Gradio tabs (Lesson slides, Research Agent, Chat debug)
	libs/agent/ # Skill agent runner, tools, trace recorder
	libs/researchmind/ # Scraper, chunk/embed, MemRAG SQLite store, retrieval
	libs/inference/ # Transformers + llama.cpp backends
	skills/ # SKILL.md + references/ + scripts/ per task
	research/ # Fine-tune and agentic evals (optional)
	```

	### ResearchMind (offline after ingest)

	1. Skills — `skills/scrape-web`, `scrape-pdf`, `extract-content`, `research-mind`
	2. Ingest — URL/PDF/DOCX or topic → (optional LLM URL suggest + confirm, or auto search) → chunk + embed → SQLite
	3. Q&A — local model + retrieved chunks with `[n]` citations (no network at chat time)
	4. Memory — persists under `RESEARCHMIND_DATA_DIR` (default `outputs/researchmind`)

	Optional research tooling (not required for the Space): see [research/USAGE.md](research/USAGE.md).

	## Environment variables

	\| Variable \| Default \| Description \|
	\| -------- \| ------- \| ----------- \|
	\| `ACTIVE_MODEL` \| `minicpm5-1b` \| Preset key from `models.yaml` (use `minicpm-v-4.6-gguf` for Llama track) \|
	\| `ALLOW_MODEL_SWITCH` \| `false` \| Set `true` locally to switch presets in Settings / Chat \|
	\| `AGENT_OUTPUTS_DIR` \| `/tmp/agent_outputs` \| Generated `.pptx` files \|
	\| `AGENT_TRACES_DIR` \| `outputs/traces` \| Agent trace JSON \|
	\| `SKILLS_DIR` \| `./skills` \| Skill definitions root \|
	\| `RESEARCHMIND_DATA_DIR` \| `outputs/researchmind` \| MemRAG DB and raw snapshots \|
	\| `RESEARCHMIND_EMBED_MODEL` \| `all-MiniLM-L6-v2` \| Sentence embedding model \|
	\| `RESEARCHMIND_AUTO_SEARCH` \| `false` \| Default auto DuckDuckGo ingest \|

	See [`.env.example`](.env.example) and [`models.yaml`](models.yaml) for model presets.

	## Hugging Face Space deployment

	1. Create a Space under [build-small-hackathon](https://huggingface.co/build-small-hackathon) with Gradio SDK (Blank template).
	2. Link this repository — HF builds from root `app.py` + `requirements.txt` (README YAML above).
	3. Hardware: ZeroGPU for burst GPU inference, or GPU basic for always-on GPU.
	4. Set `ACTIVE_MODEL=minicpm5-1b` (or `minicpm-v-4.6-gguf` for [Llama track](#llama-track-llama-champion--off-the-grid)), `ALLOW_MODEL_SWITCH=false`, `RESEARCHMIND_DATA_DIR=/tmp/researchmind`.

	A root `Dockerfile` is kept for a later Docker SDK deploy (flip README to `sdk: docker`). See [USAGE.md](USAGE.md).

	## Hackathon tracks & checklist

	\| Track \| What we ship \|
	\| ----- \| ------------ \|
	\| Backyard AI (primary) \| Lesson slide builder for a teacher you know — topic + grade → downloadable `.pptx` \|
	\| Off Brand \| Custom Studio UI at `/` (Gradio 6 Server mode, not default Gradio chrome) \|
	\| Modal (partner) \| GPU `train → eval → gate → publish` on [Modal](https://modal.com) — [`research/modal/`](research/modal/) \|
	\| Well-Tuned (finetuning) \| Skill-matrix QLoRA adapters on MiniCPM5-1B, lm-eval gates, Hub publish \|
	\| Llama Champion \| `minicpm-v-4.6-gguf` on llama.cpp — same OpenBMB VLM family, local GGUF inference \|

	- Space live under build-small-hackathon
	- Demo video: [YouTube](https://www.youtube.com/watch?v=bwtOiZvJ-7k) — real user enters topic → download `.pptx` → show agent trace
	- Blog post: [Small Models, Bounded Jobs](https://huggingface.co/blog/build-small-hackathon/lessonagent-opennotebook)
	- Social post published: [X](https://x.com/MSG_Encrypted/status/2066570320861921748)
	- Submission by June 15, 2026

	### Badge targets

	- Best Agent — skill loop + `create_pptx` tool
	- Tiny Titan — MiniCPM5 1B (≤4B)
	- OpenBMB — `openbmb/MiniCPM5-1B`
	- Sharing is Caring — upload traces with `scripts/upload_trace.py`
	- Off-the-Grid — local inference only (no cloud LLM API)
	- Llama Champion — llama.cpp backend with [`openbmb/MiniCPM-V-4.6-gguf`](https://huggingface.co/openbmb/MiniCPM-V-4.6-gguf); see [Llama track](#llama-track-llama-champion--off-the-grid)
	- Well-Tuned — per-skill QLoRA adapters trained + gated + published via the [Modal + Fine-tuning track](#modal--fine-tuning-track-well-tuned)
	- Modal — same pipeline; see [`research/modal/README.md`](research/modal/README.md)

	## Agent trace upload

	```bash
	uv run python scripts/upload_trace.py --repo-id YOUR_USER/build-small-agent-traces
	```

	## Demo video script

	1. Introduce the teacher and the problem (building a 5-slide lesson takes 30+ minutes).
	2. Open Lesson slides, enter topic + grade, click Generate.
	3. Show outline preview and download the `.pptx`.
	4. Expand the agent trace JSON — local model, no cloud API.