metadata

title: Lesson Agent
emoji: 📚
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 6.16.0
app_file: app.py
python_version: '3.12'
pinned: false
license: apache-2.0
tags:
  - track:backyard
  - track:wood
  - sponsor:openbmb
  - sponsor:openai
  - sponsor:nvidia
  - sponsor:modal
  - achievement:offgrid
  - achievement:welltuned
  - achievement:offbrand
  - achievement:llama
  - achievement:sharing
  - achievement:fieldnotes
  - build-small-hackathon
  - backyard-ai
  - modal
  - tiny-titan
  - best-agent
  - best-demo
  - openbmb
  - sharing-is-caring
  - off-the-grid
  - off-brand
  - field-notes
  - well-tuned
  - llama-champion

Lesson Agent

Backyard AI Gradio Space for the Build Small Hackathon.

A local skill-based agent helps a teacher you know turn a topic and grade level into a downloadable PowerPoint. It runs on a small transformers model (MiniCPM5-1B by default) and calls no cloud LLM API.

See USAGE.md for running locally, deploying to a Gradio SDK / ZeroGPU Space, and the planned Docker deployment.

Demo video: https://www.youtube.com/watch?v=bwtOiZvJ-7k

Blog post: Small Models, Bounded Jobs — Hugging Face Build Small Hackathon write-up

X post: https://x.com/MSG_Encrypted/status/2066570320861921748

GitHub: https://github.com/MSghais/small-model-hackathon/

Prerequisites

uv
Python 3.12

Quick start

uv sync --all-packages
cp .env.example .env   # optional: edit model settings

# Run Gradio locally
uv run --package gradio-space python -m gradio_space.app

Open http://localhost:7860.

Lesson slides — topic, grade, slide count → downloadable PowerPoint
Research Agent — scrape/index sources into MemRAG, then ask questions offline with citations

Studio UI (Off Brand track)

The default landing page is a custom AI Studio workspace at / — not default Gradio chrome. It uses Gradio 6 Server mode (gradio.Server): Material 3 layout, sidebar + workspace (Research → Slides → Language lessons), and @server.api endpoints wired to the same Python backends as Classic.

/ — Studio UI (ingest sources, generate slides, Language lessons multilingual coach)
/classic — full Gradio Blocks app (TeacherVoice, EchoCoach pitch analysis, settings, Chat debug)

See apps/gradio-space/README.md for API names and a 2-minute judge demo script.

Modal + Fine-tuning track (Well-Tuned)

Cloud GPU train → eval → gate → publish for a skill-matrix of QLoRA adapters on openbmb/MiniCPM5-1B — no local CUDA required. Each job in research/modal/experiments.yaml (math, science, coding, reasoning, teaching, …) fine-tunes with research/finetune.py, benchmarks with slm-lm-eval, gates on per-skill goals, and publishes passing adapters to the Hub.

Modal (partner track) — modal run / warm GPU worker, Volume artifacts, optional Modal Notebook
Well-Tuned badge — before/after lm-eval per skill + gated Hub publish (MSGEncrypted/minicpm5-1b-<skill>-lora)
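
The gate step can be pictured with a small sketch. This is not the code in research/modal/; it assumes a gate that compares before/after benchmark scores against a per-skill minimum gain. The names `EvalResult`, `passes_gate`, and `adapters_to_publish` are hypothetical, and the real goal format in research/modal/experiments.yaml may differ.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    """Before/after benchmark scores for one skill (hypothetical shape)."""
    skill: str
    baseline: float   # score of the base model before fine-tuning
    finetuned: float  # score of the same model with the adapter applied


def passes_gate(result: EvalResult, min_gain: float) -> bool:
    """Pass only if the adapter beats the baseline by at least min_gain."""
    return (result.finetuned - result.baseline) >= min_gain


def adapters_to_publish(results: list[EvalResult], goals: dict[str, float]) -> list[str]:
    """Return the skills whose adapters clear their per-skill goal.

    Skills without a configured goal are never published (an assumption
    made here so that a missing goal fails closed).
    """
    return [
        r.skill
        for r in results
        if r.skill in goals and passes_gate(r, goals[r.skill])
    ]
```

Failing closed on a missing goal keeps an unconfigured job from reaching the Hub by accident; a pipeline that prefers to publish any non-regressing adapter would relax that check.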

Full runbook: research/modal/README.md · agent loop: research/modal/SERVER.md · local research overview: research/USAGE.md

uv sync --group modal
modal setup && modal secret create huggingface HF_TOKEN=<token>

modal run research/modal/server_app.py --ping                       # health check
modal run research/modal/server_app.py --job math-lora --max-steps 20 --no-publish   # cheap smoke
modal run research/modal/server_app.py --pipeline                   # full sweep: baselines → train → eval → gate → publish

Pull a passing adapter into the Space: modal volume get slm-finetune math-lora ./models/finetuned/minicpm5-1b-lora, then set ACTIVE_MODEL=minicpm5-1b-lesson-lora.

Llama track (Llama Champion + Off-the-Grid)

The OpenBMB MiniCPM-V 4.6 model served by the transformers preset also runs on llama.cpp via the minicpm-v-4.6-gguf preset, using GGUF weights from openbmb/MiniCPM-V-4.6-gguf (~529 MB, Q4_K_M). Inference stays fully local through libs/inference/src/inference/llama_cpp.py, with no cloud LLM API.

Preset	Backend	Use case
`minicpm-v-4.6`	transformers	Full VLM (image/video) via Hugging Face
`minicpm-v-4.6-gguf`	llama.cpp	Llama Champion badge; text on all tabs today

Space (judges): pin the GGUF preset — no runtime switching for visitors.

ACTIVE_MODEL=minicpm-v-4.6-gguf
ALLOW_MODEL_SWITCH=false

Local dev: switch backends at runtime without restarting.

ALLOW_MODEL_SWITCH=true
ACTIVE_MODEL=minicpm-v-4.6          # transformers startup default
# Settings or Chat → select minicpm-v-4.6-gguf for llama.cpp

Prefetch weights (optional):

uv run python scripts/download_model.py --preset minicpm-v-4.6-gguf

See USAGE.md (section Switching models locally) for Classic and Studio UI details.
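
How the two variables interact can be sketched as follows. This is an illustration only, not the project's loader: the `PRESETS` table and `resolve_preset` are hypothetical names, and the table lists only the two presets from the table above rather than the contents of models.yaml.

```python
# Hypothetical preset table mirroring the two rows documented above.
PRESETS = {
    "minicpm-v-4.6": "transformers",
    "minicpm-v-4.6-gguf": "llama.cpp",
}


def resolve_preset(active_model, allow_switch, requested=None):
    """Pick the preset to load and return (preset, backend).

    active_model -- value of ACTIVE_MODEL (the startup default)
    allow_switch -- value of ALLOW_MODEL_SWITCH, already parsed to a bool
    requested    -- preset chosen in the UI, or None if nothing was chosen
    """
    chosen = active_model
    # A UI request is honoured only when runtime switching is enabled.
    if allow_switch and requested is not None:
        chosen = requested
    if chosen not in PRESETS:
        raise KeyError(f"unknown preset: {chosen}")
    return chosen, PRESETS[chosen]
```

With `allow_switch=False`, as on the judged Space, a request for another preset is ignored and the pinned GGUF preset stays active.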

How it works

Skill — skills/education-pptx/SKILL.md (Hermes / agentskills.io format)
LLM — local model drafts a JSON slide outline
Tool — create_pptx builds the file with python-pptx
Trace — JSON log saved under outputs/traces/ for the Sharing is Caring badge

apps/gradio-space/   # Gradio tabs (Lesson slides, Research Agent, Chat debug)
libs/agent/          # Skill agent runner, tools, trace recorder
libs/researchmind/   # Scraper, chunk/embed, MemRAG SQLite store, retrieval
libs/inference/      # Transformers + llama.cpp backends
skills/              # SKILL.md + references/ + scripts/ per task
research/            # Fine-tune and agentic evals (optional)
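
The LLM and Trace steps above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the code in libs/agent/: the outline schema (a `slides` list of objects with `title` and `bullets`), the trace record layout, and the function names `parse_outline` and `write_trace` are all hypothetical. The actual file is built by the create_pptx tool with python-pptx, which is left out here.

```python
import json
import time
from pathlib import Path


def parse_outline(raw: str, max_slides: int) -> list[dict]:
    """Validate the model's JSON outline and return normalized slides."""
    data = json.loads(raw)
    slides = data.get("slides") if isinstance(data, dict) else None
    if not isinstance(slides, list) or not slides:
        raise ValueError("outline must contain a non-empty 'slides' list")
    normalized = []
    for slide in slides[:max_slides]:  # drop slides beyond the requested count
        if not isinstance(slide, dict):
            raise ValueError("every slide must be a JSON object")
        title = str(slide.get("title", "")).strip()
        if not title:
            raise ValueError("every slide needs a title")
        # Keep only non-empty bullets, trimmed of surrounding whitespace.
        bullets = [str(b).strip() for b in slide.get("bullets", []) if str(b).strip()]
        normalized.append({"title": title, "bullets": bullets})
    return normalized


def write_trace(trace_dir: Path, topic: str, slides: list[dict]) -> Path:
    """Save a small JSON trace of the run and return its path."""
    trace_dir.mkdir(parents=True, exist_ok=True)
    path = trace_dir / f"trace-{int(time.time() * 1000)}.json"
    record = {
        "topic": topic,
        "steps": [
            {"step": "llm_outline", "slide_count": len(slides)},
            {"step": "tool_call", "tool": "create_pptx"},
        ],
    }
    path.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return path
```

Validating the outline before calling the tool matters with a 1B model, since malformed or overlong JSON is a likely failure mode; rejecting it early lets the agent retry instead of producing a broken deck.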

ResearchMind (offline after ingest)

Skills — skills/scrape-web, scrape-pdf, extract-content, research-mind
Ingest — URL/PDF/DOCX or topic → (optional LLM URL suggest + confirm, or auto search) → chunk + embed → SQLite
Q&A — local model + retrieved chunks with [n] citations (no network at chat time)
Memory — persists under RESEARCHMIND_DATA_DIR (default outputs/researchmind)
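
The ingest and Q&A flow above can be approximated in a few lines. This sketch is not libs/researchmind/: it swaps the sentence-embedding retrieval for plain word overlap so that it runs with the standard library only, and the table layout and function names (`chunk_text`, `ingest`, `retrieve`, `format_context`) are hypothetical.

```python
import re
import sqlite3


def chunk_text(text: str, max_words: int = 50) -> list[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def ingest(conn: sqlite3.Connection, source: str, text: str) -> None:
    """Chunk a document and store each chunk with its source."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS chunks "
        "(id INTEGER PRIMARY KEY, source TEXT, body TEXT)"
    )
    conn.executemany(
        "INSERT INTO chunks (source, body) VALUES (?, ?)",
        [(source, chunk) for chunk in chunk_text(text)],
    )
    conn.commit()


def _tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(conn: sqlite3.Connection, question: str, k: int = 3) -> list[tuple[str, str]]:
    """Return up to k (source, chunk) pairs ranked by shared words.

    Word overlap is a stand-in for embedding similarity in this sketch.
    """
    query = _tokens(question)
    rows = conn.execute("SELECT source, body FROM chunks ORDER BY id").fetchall()
    scored = [(len(query & _tokens(body)), source, body) for source, body in rows]
    scored = [item for item in scored if item[0] > 0]
    scored.sort(key=lambda item: -item[0])  # stable: ties keep insertion order
    return [(source, body) for _, source, body in scored[:k]]


def format_context(hits: list[tuple[str, str]]) -> str:
    """Number the retrieved chunks so the model can cite them as [n]."""
    return "\n".join(
        f"[{n}] {body} (source: {source})" for n, (source, body) in enumerate(hits, 1)
    )
```

Because retrieval reads only the local SQLite file, nothing in this path needs the network at question time, which matches the offline-after-ingest behaviour described above.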

Optional research tooling (not required for the Space): see research/USAGE.md.

Environment variables

Variable	Default	Description
`ACTIVE_MODEL`	`minicpm5-1b`	Preset key from `models.yaml` (use `minicpm-v-4.6-gguf` for Llama track)
`ALLOW_MODEL_SWITCH`	`false`	Set `true` locally to switch presets in Settings / Chat
`AGENT_OUTPUTS_DIR`	`/tmp/agent_outputs`	Generated `.pptx` files
`AGENT_TRACES_DIR`	`outputs/traces`	Agent trace JSON
`SKILLS_DIR`	`./skills`	Skill definitions root
`RESEARCHMIND_DATA_DIR`	`outputs/researchmind`	MemRAG DB and raw snapshots
`RESEARCHMIND_EMBED_MODEL`	`all-MiniLM-L6-v2`	Sentence embedding model
`RESEARCHMIND_AUTO_SEARCH`	`false`	Whether ingest runs an automatic DuckDuckGo search by default

See .env.example and models.yaml for model presets.

Hugging Face Space deployment

Create a Space under build-small-hackathon with Gradio SDK (Blank template).
Link this repository — HF builds from root app.py + requirements.txt (README YAML above).
Hardware: ZeroGPU for burst GPU inference, or GPU basic for always-on GPU.
Set ACTIVE_MODEL=minicpm5-1b (or minicpm-v-4.6-gguf for Llama track), ALLOW_MODEL_SWITCH=false, RESEARCHMIND_DATA_DIR=/tmp/researchmind.

A root Dockerfile is kept for a later Docker SDK deployment (switch the README metadata to sdk: docker). See USAGE.md.

Hackathon tracks & checklist

Track	What we ship
Backyard AI (primary)	Lesson slide builder for a teacher you know — topic + grade → downloadable `.pptx`
Off Brand	Custom Studio UI at `/` (Gradio 6 Server mode, not default Gradio chrome)
Modal (partner)	GPU `train → eval → gate → publish` on Modal — `research/modal/`
Well-Tuned (finetuning)	Skill-matrix QLoRA adapters on MiniCPM5-1B, lm-eval gates, Hub publish
Llama Champion	`minicpm-v-4.6-gguf` on llama.cpp — same OpenBMB VLM family, local GGUF inference

Space live under build-small-hackathon
Demo video: YouTube — real user enters topic → download .pptx → show agent trace
Blog post: Small Models, Bounded Jobs
Social post published: X
Submission by June 15, 2026

Badge targets

Best Agent — skill loop + create_pptx tool
Tiny Titan — MiniCPM5 1B (≤4B)
OpenBMB — openbmb/MiniCPM5-1B
Sharing is Caring — upload traces with scripts/upload_trace.py
Off-the-Grid — local inference only (no cloud LLM API)
Llama Champion — llama.cpp backend with openbmb/MiniCPM-V-4.6-gguf; see Llama track
Well-Tuned — per-skill QLoRA adapters trained + gated + published via the Modal + Fine-tuning track
Modal — same pipeline; see research/modal/README.md

Agent trace upload

uv run python scripts/upload_trace.py --repo-id YOUR_USER/build-small-agent-traces

Demo video script

Introduce the teacher and the problem (building a 5-slide lesson takes 30+ minutes).
Open Lesson slides, enter topic + grade, click Generate.
Show outline preview and download the .pptx.
Expand the agent trace JSON — local model, no cloud API.