title: Lesson Agent
emoji: 📚
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 6.16.0
app_file: app.py
python_version: '3.12'
pinned: false
license: apache-2.0
tags:
  - track:backyard
  - track:wood
  - sponsor:openbmb
  - sponsor:openai
  - sponsor:nvidia
  - sponsor:modal
  - achievement:offgrid
  - achievement:welltuned
  - achievement:offbrand
  - achievement:llama
  - achievement:sharing
  - achievement:fieldnotes
  - build-small-hackathon
  - backyard-ai
  - modal
  - tiny-titan
  - best-agent
  - best-demo
  - openbmb
  - sharing-is-caring
  - off-the-grid
  - off-brand
  - field-notes
  - well-tuned
  - llama-champion

Lesson Agent

Backyard AI Gradio Space for the Build Small Hackathon.

A local skill-based agent helps a teacher you know turn a topic and grade level into a downloadable PowerPoint. It runs on a small transformers model (MiniCPM5-1B by default), with no cloud LLM API.

See USAGE.md for local run, Gradio SDK / ZeroGPU Space deployment, and Docker (later).

Demo video: https://www.youtube.com/watch?v=bwtOiZvJ-7k

Blog post: Small Models, Bounded Jobs (Hugging Face Build Small Hackathon write-up)

X post: https://x.com/MSG_Encrypted/status/2066570320861921748

GitHub: https://github.com/MSghais/small-model-hackathon/

Prerequisites

  • uv
  • Python 3.12

Quick start

uv sync --all-packages
cp .env.example .env   # optional: edit model settings

# Run Gradio locally
uv run --package gradio-space python -m gradio_space.app

Open http://localhost:7860.

  • Lesson slides: topic, grade, slide count → downloadable PowerPoint
  • Research Agent: scrape/index sources into MemRAG, then ask questions offline with citations

Studio UI (Off Brand track)

The default landing page is a custom AI Studio workspace at /, not the default Gradio chrome. It uses Gradio 6 Server mode (gradio.Server): a Material 3 layout, a sidebar plus workspace (Research → Slides → Language lessons), and @server.api endpoints wired to the same Python backends as the Classic app.

  • /: Studio UI (ingest sources, generate slides, Language lessons multilingual coach)
  • /classic: full Gradio Blocks app (TeacherVoice, EchoCoach pitch analysis, settings, Chat debug)

See apps/gradio-space/README.md for API names and a 2-minute judge demo script.

Modal + Fine-tuning track (Well-Tuned)

A cloud GPU pipeline (train → eval → gate → publish) builds a skill matrix of QLoRA adapters on openbmb/MiniCPM5-1B, with no local CUDA required. Each job in research/modal/experiments.yaml (math, science, coding, reasoning, teaching, …) fine-tunes with research/finetune.py, benchmarks with slm-lm-eval, gates on per-skill goals, and publishes passing adapters to the Hub.

  • Modal (partner track): modal run / warm GPU worker, Volume artifacts, optional Modal Notebook
  • Well-Tuned badge: before/after lm-eval per skill + gated Hub publish (MSGEncrypted/minicpm5-1b-<skill>-lora)
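
The gate step can be pictured as a per-skill check of the fine-tuned score against a goal and against the baseline. The sketch below only illustrates that idea and is not the logic in research/modal: the `SkillResult` type, the `passes_gate` and `adapters_to_publish` functions, and the example scores are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SkillResult:
    """Hypothetical before/after benchmark scores for one skill adapter."""
    skill: str
    baseline: float  # score of the base model
    tuned: float     # score with the adapter applied
    goal: float      # minimum tuned score required to publish


def passes_gate(result: SkillResult, min_gain: float = 0.0) -> bool:
    """Pass only if the adapter meets its goal and does not regress the baseline."""
    meets_goal = result.tuned >= result.goal
    improves = (result.tuned - result.baseline) >= min_gain
    return meets_goal and improves


def adapters_to_publish(results: list[SkillResult]) -> list[str]:
    """Return the skills whose adapters cleared the gate, in input order."""
    return [r.skill for r in results if passes_gate(r)]


# Made-up numbers for illustration only.
example = [
    SkillResult("math", baseline=0.20, tuned=0.31, goal=0.25),
    SkillResult("science", baseline=0.40, tuned=0.38, goal=0.35),
]
print(adapters_to_publish(example))  # ['math']
```

Here the science adapter meets its goal but scores below the baseline, so only the math adapter would be published under these assumed rules.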

Full runbook: research/modal/README.md · agent loop: research/modal/SERVER.md · local research overview: research/USAGE.md

uv sync --group modal
modal setup && modal secret create huggingface HF_TOKEN=<token>

modal run research/modal/server_app.py --ping                       # health check
modal run research/modal/server_app.py --job math-lora --max-steps 20 --no-publish   # cheap smoke
modal run research/modal/server_app.py --pipeline                   # full sweep: baselines → train → eval → gate → publish

Pull a passing adapter into the Space: modal volume get slm-finetune math-lora ./models/finetuned/minicpm5-1b-lora, then set ACTIVE_MODEL=minicpm5-1b-lesson-lora.

Llama track (Llama Champion + Off-the-Grid)

The same OpenBMB MiniCPM-V 4.6 model runs on llama.cpp via the minicpm-v-4.6-gguf preset, using GGUF weights from openbmb/MiniCPM-V-4.6-gguf (~529 MB Q4_K_M). There is no cloud LLM API; inference stays fully local through libs/inference/src/inference/llama_cpp.py.

| Preset | Backend | Use case |
| --- | --- | --- |
| minicpm-v-4.6 | transformers | Full VLM (image/video) via Hugging Face |
| minicpm-v-4.6-gguf | llama.cpp | Llama Champion badge; text on all tabs today |

Space (judges): pin the GGUF preset, with no runtime switching for visitors.

ACTIVE_MODEL=minicpm-v-4.6-gguf
ALLOW_MODEL_SWITCH=false

Local dev: switch backends at runtime without restarting.

ALLOW_MODEL_SWITCH=true
ACTIVE_MODEL=minicpm-v-4.6          # transformers startup default
# Settings or Chat → select minicpm-v-4.6-gguf for llama.cpp

Prefetch weights (optional):

uv run python scripts/download_model.py --preset minicpm-v-4.6-gguf

See USAGE.md (section Switching models locally) for Classic and Studio UI details.

How it works

  1. Skill: skills/education-pptx/SKILL.md (Hermes / agentskills.io format)
  2. LLM: local model drafts a JSON slide outline
  3. Tool: create_pptx builds the file with python-pptx
  4. Trace: JSON log saved under outputs/traces/ for the Sharing is Caring badge

apps/gradio-space/   # Gradio tabs (Lesson slides, Research Agent, Chat debug)
libs/agent/          # Skill agent runner, tools, trace recorder
libs/researchmind/   # Scraper, chunk/embed, MemRAG SQLite store, retrieval
libs/inference/      # Transformers + llama.cpp backends
skills/              # SKILL.md + references/ + scripts/ per task
research/            # Fine-tune and agentic evals (optional)
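
The four "How it works" steps can be sketched as one function that asks the model for an outline, validates it, calls the tool, and writes a trace. This is a rough illustration using only the standard library. The outline shape (`{"slides": [{"title", "bullets"}]}`), the trace fields, and the names `parse_outline` and `run_lesson_skill` are assumptions, and the model and the create_pptx tool are passed in as plain callables rather than imported from libs/agent.

```python
import json
import time
from pathlib import Path
from typing import Callable


def parse_outline(raw: str, slide_count: int) -> list[dict]:
    """Validate the model's JSON outline (assumed shape, see lead-in)."""
    slides = json.loads(raw)["slides"]
    if len(slides) != slide_count:
        raise ValueError(f"expected {slide_count} slides, got {len(slides)}")
    for slide in slides:
        if not isinstance(slide.get("title"), str) or not isinstance(slide.get("bullets"), list):
            raise ValueError("each slide needs a string title and a list of bullets")
    return slides


def run_lesson_skill(
    topic: str,
    grade: str,
    slide_count: int,
    generate: Callable[[str], str],           # stand-in for the local model
    build_deck: Callable[[list[dict]], str],  # stand-in for the create_pptx tool
    traces_dir: Path,
) -> dict:
    """Run prompt -> outline -> tool, then save a JSON trace of both steps."""
    prompt = f"Outline {slide_count} slides on '{topic}' for grade {grade} as JSON."
    raw = generate(prompt)
    slides = parse_outline(raw, slide_count)
    deck_path = build_deck(slides)
    trace = {
        "timestamp": time.time(),
        "input": {"topic": topic, "grade": grade, "slide_count": slide_count},
        "steps": [
            {"type": "llm", "prompt": prompt, "output": raw},
            {"type": "tool", "name": "create_pptx", "result": deck_path},
        ],
    }
    traces_dir.mkdir(parents=True, exist_ok=True)
    trace_path = traces_dir / f"trace-{int(trace['timestamp'])}.json"
    trace_path.write_text(json.dumps(trace, indent=2), encoding="utf-8")
    return trace
```

Validating the outline before calling the tool keeps a malformed model response from producing a broken deck, and the failure is raised before anything is written to disk.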

ResearchMind (offline after ingest)

  1. Skills: skills/scrape-web, scrape-pdf, extract-content, research-mind
  2. Ingest: URL/PDF/DOCX or topic → (optional LLM URL suggest + confirm, or auto search) → chunk + embed → SQLite
  3. Q&A: local model + retrieved chunks with [n] citations (no network at chat time)
  4. Memory: persists under RESEARCHMIND_DATA_DIR (default outputs/researchmind)
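
The ingest and Q&A steps above follow a common chunk, store, retrieve, cite pattern. The sketch below shows that pattern with the standard library only. It is not the libs/researchmind code: the table schema and function names are hypothetical, and a toy bag-of-words vector (recomputed at query time) stands in for the sentence embedding model the project uses.

```python
import math
import re
import sqlite3
from collections import Counter


def chunk_text(text: str, max_words: int = 40) -> list[str]:
    """Split text into fixed-size word windows."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a stand-in for a real sentence embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def ingest(db: sqlite3.Connection, source: str, text: str) -> int:
    """Chunk a document and persist the chunks; returns the chunk count."""
    db.execute("CREATE TABLE IF NOT EXISTS chunks (id INTEGER PRIMARY KEY, source TEXT, body TEXT)")
    chunks = chunk_text(text)
    db.executemany("INSERT INTO chunks (source, body) VALUES (?, ?)", [(source, c) for c in chunks])
    db.commit()
    return len(chunks)


def retrieve(db: sqlite3.Connection, question: str, k: int = 3) -> list[tuple[str, str]]:
    """Return the k (source, body) rows most similar to the question."""
    query = embed(question)
    rows = db.execute("SELECT source, body FROM chunks").fetchall()
    return sorted(rows, key=lambda r: cosine(query, embed(r[1])), reverse=True)[:k]


def build_context(hits: list[tuple[str, str]]) -> str:
    """Number the retrieved chunks so the model can cite them as [n]."""
    return "\n".join(f"[{n}] {body} (source: {source})" for n, (source, body) in enumerate(hits, start=1))


db = sqlite3.connect(":memory:")
ingest(db, "bio.pdf", "Plants use photosynthesis to turn sunlight into sugar.")
ingest(db, "geo.html", "Volcanoes erupt when magma reaches the surface.")
print(build_context(retrieve(db, "How does photosynthesis work?", k=1)))
```

Because retrieval reads only from the SQLite file, nothing in the question-answering path needs network access once ingest has finished.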

Optional research tooling (not required for the Space): see research/USAGE.md.

Environment variables

| Variable | Default | Description |
| --- | --- | --- |
| ACTIVE_MODEL | minicpm5-1b | Preset key from models.yaml (use minicpm-v-4.6-gguf for Llama track) |
| ALLOW_MODEL_SWITCH | false | Set true locally to switch presets in Settings / Chat |
| AGENT_OUTPUTS_DIR | /tmp/agent_outputs | Generated .pptx files |
| AGENT_TRACES_DIR | outputs/traces | Agent trace JSON |
| SKILLS_DIR | ./skills | Skill definitions root |
| RESEARCHMIND_DATA_DIR | outputs/researchmind | MemRAG DB and raw snapshots |
| RESEARCHMIND_EMBED_MODEL | all-MiniLM-L6-v2 | Sentence embedding model |
| RESEARCHMIND_AUTO_SEARCH | false | Default auto DuckDuckGo ingest |

See .env.example and models.yaml for model presets.
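
For reference, the variables and defaults listed above could be read into one typed object as follows. This is a sketch only: the `Settings` dataclass and `load_settings` function are hypothetical, and the project may load its configuration differently.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    active_model: str
    allow_model_switch: bool
    agent_outputs_dir: Path
    agent_traces_dir: Path
    skills_dir: Path
    researchmind_data_dir: Path
    researchmind_embed_model: str
    researchmind_auto_search: bool


def _flag(value: str) -> bool:
    """Interpret common truthy strings from an environment variable."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from env, falling back to the defaults listed in this README."""
    return Settings(
        active_model=env.get("ACTIVE_MODEL", "minicpm5-1b"),
        allow_model_switch=_flag(env.get("ALLOW_MODEL_SWITCH", "false")),
        agent_outputs_dir=Path(env.get("AGENT_OUTPUTS_DIR", "/tmp/agent_outputs")),
        agent_traces_dir=Path(env.get("AGENT_TRACES_DIR", "outputs/traces")),
        skills_dir=Path(env.get("SKILLS_DIR", "./skills")),
        researchmind_data_dir=Path(env.get("RESEARCHMIND_DATA_DIR", "outputs/researchmind")),
        researchmind_embed_model=env.get("RESEARCHMIND_EMBED_MODEL", "all-MiniLM-L6-v2"),
        researchmind_auto_search=_flag(env.get("RESEARCHMIND_AUTO_SEARCH", "false")),
    )


# Passing a plain dict instead of os.environ keeps the example deterministic.
settings = load_settings({"ALLOW_MODEL_SWITCH": "true"})
print(settings.active_model, settings.allow_model_switch)
```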

Hugging Face Space deployment

  1. Create a Space under build-small-hackathon with Gradio SDK (Blank template).
  2. Link this repository. HF builds from root app.py + requirements.txt (README YAML above).
  3. Hardware: ZeroGPU for burst GPU inference, or GPU basic for always-on GPU.
  4. Set ACTIVE_MODEL=minicpm5-1b (or minicpm-v-4.6-gguf for Llama track), ALLOW_MODEL_SWITCH=false, RESEARCHMIND_DATA_DIR=/tmp/researchmind.

A root Dockerfile is kept for a later Docker SDK deploy (flip README to sdk: docker). See USAGE.md.

Hackathon tracks & checklist

| Track | What we ship |
| --- | --- |
| Backyard AI (primary) | Lesson slide builder for a teacher you know: topic + grade → downloadable .pptx |
| Off Brand | Custom Studio UI at / (Gradio 6 Server mode, not default Gradio chrome) |
| Modal (partner) | GPU train → eval → gate → publish on Modal, in research/modal/ |
| Well-Tuned (finetuning) | Skill-matrix QLoRA adapters on MiniCPM5-1B, lm-eval gates, Hub publish |
| Llama Champion | minicpm-v-4.6-gguf on llama.cpp: same OpenBMB VLM family, local GGUF inference |

  • Space live under build-small-hackathon
  • Demo video: YouTube (real user enters topic → download .pptx → show agent trace)
  • Blog post: Small Models, Bounded Jobs
  • Social post published: X
  • Submission by June 15, 2026

Badge targets

  • Best Agent: skill loop + create_pptx tool
  • Tiny Titan: MiniCPM5 1B (≤4B)
  • OpenBMB: openbmb/MiniCPM5-1B
  • Sharing is Caring: upload traces with scripts/upload_trace.py
  • Off-the-Grid: local inference only (no cloud LLM API)
  • Llama Champion: llama.cpp backend with openbmb/MiniCPM-V-4.6-gguf; see Llama track
  • Well-Tuned: per-skill QLoRA adapters trained + gated + published via the Modal + Fine-tuning track
  • Modal: same pipeline; see research/modal/README.md

Agent trace upload

uv run python scripts/upload_trace.py --repo-id YOUR_USER/build-small-agent-traces

Demo video script

  1. Introduce the teacher and the problem (building a 5-slide lesson takes 30+ minutes).
  2. Open Lesson slides, enter topic + grade, click Generate.
  3. Show outline preview and download the .pptx.
  4. Expand the agent trace JSON: local model, no cloud API.