# FormScout: Starter Kit & Resource Pack

Companion to `FormScout-FMS-Spec.md` and `FormScout-Build-Prompt.md`. Every link below was checked. Read §1 first: some items are time-sensitive and will block the build if you leave them late.

---

## 1. Do this NOW (before the hack window; some take hours to clear)
- [ ] **Request access to the gated Meta checkpoints today.** Both are gated on Hugging Face and approval isn't instant:
  - SAM 3 / SAM 3.1: request access on the SAM 3 repos (you need the latest code for the 3.1 checkpoints).
  - SAM 3D Body: `facebook/sam-3d-body-dinov3` and `facebook/sam-3d-body-vith` both require an access request, then an authenticated download. **Note:** data/checkpoints are blocked in sanctioned jurisdictions. That shouldn't affect Slovakia, but verify.
- [ ] **Put your HF token in the Space secrets** so the Space can pull the gated weights at build time.
- [ ] **Check licenses before you commit to a model** (this affects whether you can even submit):
  - Qwen3-VL-8B / Qwen3-VL-Embedding-8B / Qwen3.6: **Apache-2.0** (clean).
  - SAM 3 / SAM 3.1 / SAM 3D Body: **SAM License** (not Apache; read the terms, because there are use restrictions).
  - Ultralytics YOLO26: historically **AGPL-3.0** (open-sourcing obligations; commercial license exists). Verify on the model/repo and make sure an AGPL dependency is OK for your submission. If it's a problem, RTMPose/ViTPose are alternatives.
  - pyskl / MMAction2: Apache-2.0.
  - KIMORE / UI-PRMD: academic/research terms; check before redistributing anything derived.
- [ ] **Confirm the param-counting rule in the Discord AMA.** Specifically: (a) is it summed across the pipeline or per-model? (b) do **frozen** base models count? (c) does a LoRA adapter's base count? Your ~18B config is safe under the strict reading either way, but get it on record.
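
As a rough illustration of the token step in the checklist above, the sketch below reads the token from an environment variable and passes it to `hf_hub_download`. The secret name `HF_TOKEN` and the helper names are assumptions for illustration, not something the spec prescribes; the checkpoint filename is left as a parameter because it has to come from the model card.

```python
import os


def get_hf_token(env=os.environ):
    """Return the Hugging Face token from the environment, or fail loudly."""
    # Assumption: the Space secret is named HF_TOKEN. Use whatever name you set.
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError("HF_TOKEN is not set; add it under the Space's secrets.")
    return token


def download_gated(repo_id, filename, env=os.environ):
    """Download one file from a gated repo using the token above."""
    # Imported lazily so the token check still runs where huggingface_hub is absent.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=repo_id, filename=filename, token=get_hf_token(env))
```

Calling `get_hf_token()` once at startup surfaces a missing secret immediately instead of partway through a model load.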

---

## 2. Literature package

### 2.1 The framing that wins: "evaluate like an FMS reliability study"

The single most credible move in your writeup: evaluate FormScout the way the clinical literature evaluates human FMS raters. Treat the model as a *second rater* and report **weighted Cohen's κ** and **ICC** against the physio, the exact metrics the reliability papers use. That instantly makes your results legible to any sports-medicine reader and is far more honest than a vanity accuracy number.

| Resource | What it gives you | Link |
|---|---|---|
| Physiopedia: FMS | Clean overview of the 7 tests + 0–21 scoring | https://www.physio-pedia.com/Functional_Movement_Screen_(FMS) |
| FMS reliability study (JOSPT 2012) | The ICC/κ numbers and method you'll mirror in your eval | https://www.jospt.org/doi/10.2519/jospt.2012.3838 |
| FMS in elite youth soccer (PMC) | Per-test scores, asymmetries, clearing-test order | https://pmc.ncbi.nlm.nih.gov/articles/PMC5675373/ |
| Clinician's guide to FMS scoring | Per-test 3/2/1 criteria in plain language (rubric source) | https://meloqdevices.com/blogs/meloq-updates/functional-movement-screening |

> **Honesty anchor for the blog post:** the popular "≤14 → injury risk" cutoff has weak/mixed predictive validity. Sell standardization, asymmetry detection, and a repeatable baseline, not prediction.
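
To make the "second rater" framing concrete, here is a minimal pure-Python sketch of weighted Cohen's κ for 0–3 scores. The function name and the default of quadratic weights are assumptions; check which weighting the reliability study you mirror actually reports before quoting numbers side by side.

```python
def weighted_kappa(rater_a, rater_b, n_categories=4, power=2):
    """Weighted Cohen's kappa for integer scores in 0..n_categories-1.

    power=1 gives linear weights, power=2 gives quadratic weights.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty score lists")
    k = n_categories
    if k < 2:
        raise ValueError("need at least two categories")
    if any(not 0 <= s < k for s in list(rater_a) + list(rater_b)):
        raise ValueError("scores must lie in 0..n_categories-1")
    n = len(rater_a)
    # Observed joint counts and the two raters' marginal counts.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    disagreement_observed = 0.0
    disagreement_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = (abs(i - j) / (k - 1)) ** power
            disagreement_observed += weight * observed[i][j] / n
            disagreement_expected += weight * row[i] * col[j] / (n * n)
    if disagreement_expected == 0:
        # Both raters used one identical category; kappa is undefined.
        return float("nan")
    return 1.0 - disagreement_observed / disagreement_expected
```

With the physio's scores as `rater_a` and the model's as `rater_b`, the result can be reported next to the human inter-rater figures.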

### 2.2 Action Quality Assessment: surveys & living lists

| Resource | Why | Link |
|---|---|---|
| *A Decade of AQA* (survey, 2025, 200+ papers, PRISMA) | The map of the whole field; start here | https://arxiv.org/abs/2502.02817 · code: https://github.com/HaoYin116/Survey_of_AQA |
| *Comprehensive Survey of AQA: Method & Benchmark* (2024) | Taxonomy by modality (video / **skeleton** / multimodal) + unified benchmark | https://arxiv.org/abs/2412.11149 · page: https://zhoukanglei.github.io/AQA-Survey |
| Awesome-AQA (ZhouKanglei) | Curated, **has a Medical-Care/rehab section**: your closest analogues | https://github.com/ZhouKanglei/Awesome-AQA |
| Awesome-AQA (Lyman-Smoker) | Second list; catches papers the other misses (FLEX, ExAct, etc.) | https://github.com/Lyman-Smoker/Awesome-AQA |

### 2.3 Skeleton-based scoring: the methods your scoring head will borrow from

| Paper | Relevance to FormScout | Link |
|---|---|---|
| ST-GCN (original) | The graph-over-skeleton + temporal-conv backbone | https://github.com/open-mmlab/mmaction2/blob/main/configs/skeleton/stgcn/README.md |
| AQA via Hierarchical **Pose-guided** Multi-Stage Contrastive Regression (TIP 2025) | Pose-guided + contrastive regression with few labels; close to your setup | https://arxiv.org/abs/2501.03674 |
| Attention-guided Movement **Quality** Assessment + skeletal augmentation (UI-PRMD/KIMORE) | Transformer MQA on clinician-scored rehab data; **augmentation recipe for tiny sets** | https://arxiv.org/pdf/2204.07840 |
| SSL-Rehab: self-supervised 3D skeleton + **LoRA** fine-tune (KIMORE/UI-PRMD) | Pretrain→LoRA recipe for small clinical datasets (uses your LoRA muscle) | https://www.sciencedirect.com/science/article/abs/pii/S1077314224003564 |
| Skeleton-based AQA w/ anomaly-aware DTW (Sensors 2025) | DTW alignment + anomaly scoring; cheap, label-light baseline | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12693942/ |
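
For a feel of what a DTW-style baseline involves, the sketch below implements plain dynamic time warping between two sequences of per-frame feature vectors (for example, a few joint angles per frame). This is the classic algorithm only, not the anomaly-aware variant from the Sensors paper, and the function name is hypothetical.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Cumulative Euclidean cost of the best monotonic alignment of two sequences.

    Each sequence is a list of frames; each frame is a list of floats of equal length.
    """
    n, m = len(seq_a), len(seq_b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    inf = float("inf")
    # cost[i][j] = best cost aligning the first i frames of a with the first j of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

Comparing a new rep against a reference rep the physio scored 3 gives a crude, label-light distance that can sanity-check the learned head.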

---

## 3. Models & tooling (verified)

| Component | Repo / card | Params | License | Gated? |
|---|---|---:|---|---|
| YOLO26-Pose | https://docs.ultralytics.com/tasks/pose | <0.1B | AGPL-3.0* | no |
| SAM 3.1 | https://github.com/facebookresearch/sam3 | ~0.85B | SAM License | **yes** |
| SAM 3D Body | https://github.com/facebookresearch/sam-3d-body · https://huggingface.co/facebook/sam-3d-body-dinov3 | sub-1B† | SAM License | **yes** |
| ST-GCN++ / PoseConv3D | https://github.com/kennymckormick/pyskl | ~0.01–0.05B | Apache-2.0 | no |
| Qwen3-VL-8B-Instruct | https://huggingface.co/Qwen/Qwen3-VL-8B-Instruct | 8B | Apache-2.0 | no |
| Qwen3-VL-Embedding-8B | https://huggingface.co/Qwen/Qwen3-VL-Embedding-8B (GGUF: dam2452/...-GGUF) | 8B | Apache-2.0 | no |
| Qwen3.6-27B (alt brain) | https://huggingface.co/unsloth/Qwen3.6-27B-GGUF | 27B | Apache-2.0 | no |

\* Verify the current YOLO26 license. † Two variants (`dinov3`, `vith`); confirm the exact count on the card. The budget impact is small either way. SAM 3 itself is 848M.
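
The parameter budget can be checked with simple arithmetic. The sketch below sums the table's figures using upper bounds where the table gives a range; which components make up the "~18B config" is an assumption here (everything except the 27B alternative), and the `frozen` set in the usage note is purely hypothetical until the AMA clarifies the rule.

```python
# Sizes in billions of parameters, taken from the table above (upper bounds for ranges).
PIPELINE_PARAMS_B = {
    "YOLO26-Pose": 0.1,             # table says "<0.1B"
    "SAM 3.1": 0.85,
    "SAM 3D Body": 1.0,             # table says "sub-1B"; confirm on the card
    "ST-GCN++ / PoseConv3D": 0.05,  # upper end of ~0.01-0.05B
    "Qwen3-VL-8B-Instruct": 8.0,
    "Qwen3-VL-Embedding-8B": 8.0,
}


def total_params_b(components, frozen=(), count_frozen=True):
    """Sum component sizes; optionally skip the ones listed as frozen."""
    return sum(
        size
        for name, size in components.items()
        if count_frozen or name not in frozen
    )
```

Under the strict reading (`count_frozen=True`) this gives about 18B. Passing `count_frozen=False` with a `frozen` set shows how much the total would drop if frozen bases turn out not to count.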

**Useful extras:** SAM 3D Body uses a Momentum Human Rig (MHR) that separates skeleton from soft-tissue shape, which is convenient for clean joint-angle extraction. The repo ships a notebook combining SAM 3D Body + SAM 3D Objects in one frame of reference. SAM 3D Body demo: https://www.aidemos.meta.com/segment-anything/editor/convert-body-to-3d
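
Joint-angle extraction itself needs nothing model-specific once 3D joint positions are available. The sketch below computes the angle at a joint from three generic `(x, y, z)` points; it makes no assumption about the MHR output format, and the function name is illustrative.

```python
import math


def joint_angle_deg(a, b, c):
    """Angle in degrees at point b, formed by the segments b->a and b->c."""
    ba = [a[i] - b[i] for i in range(3)]
    bc = [c[i] - b[i] for i in range(3)]
    norm = math.hypot(*ba) * math.hypot(*bc)
    if norm == 0:
        raise ValueError("a or c coincides with b; the angle is undefined")
    cosine = sum(x * y for x, y in zip(ba, bc)) / norm
    # Clamp to guard against floating-point drift just outside [-1, 1].
    cosine = max(-1.0, min(1.0, cosine))
    return math.degrees(math.acos(cosine))
```

For a knee angle, for instance, `a`, `b`, and `c` would be the hip, knee, and ankle positions from whichever skeleton layout you end up using.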

---

## 4. Datasets for transfer / pretraining

You have a couple of labeled clips. Pretrain on clinician-scored movement-quality data first, then few-shot fine-tune. These are the most transferable to FMS (ranked by relevance):

| Dataset | Why it's the closest analogue | Link |
|---|---|---|
| **KIMORE** | Clinician **scores** of low-back-pain rehab exercises (trunk control, multi-plane); same "score movement quality" task as FMS; partially overlaps Deep Squat / Rotary Stability / TSPU mechanics | https://www.researchgate.net/publication/333791841 (search "KIMORE dataset") |
| **UI-PRMD** | 10 rehab movements, correct vs. incorrect executions; standard MQA benchmark, pairs with KIMORE | search "UI-PRMD University of Idaho Physical Rehabilitation Movements" |
| **Fitness-AQA** | Real gym **squat/deadlift form errors**; directly relevant to Deep Squat compensations | https://github.com/ParitoshParmar/MTL-AQA (links Fitness-AQA) |
| **FLEX** | Large multi-modal fitness AQA dataset | via Lyman-Smoker/Awesome-AQA |
| **MTL-AQA / AQA-7 / FineFS** | General sports AQA for backbone pretraining (diving, skating) | https://github.com/ParitoshParmar/MTL-AQA |

**FMS-specific public video data is scarce**, so don't expect a drop-in set. Your physio's clips are the gold; everything above is for pretraining the temporal backbone so it learns movement structure before it ever sees an FMS label.

---

## 5. Build & deploy tooling

| Need | Link |
|---|---|
| Gradio docs (v6) | https://www.gradio.app/docs |
| `gradio.Server`: custom frontend + Gradio backend (Off-Brand badge) | https://www.gradio.app/guides/server-mode · blog: https://huggingface.co/blog/introducing-gradio-server |
| Gradio AI coding-assistant skill | `gradio skills add --claude` (PyPI: https://pypi.org/project/gradio/) |
| Gradio changelog (confirm `gr.Walkthrough`, `gr.Navbar`, `gr.Video.playback_position`) | https://www.gradio.app/changelog |
| HF Spaces ZeroGPU (`@spaces.GPU`) | https://huggingface.co/docs/hub/spaces-zerogpu |
| llama.cpp | https://github.com/ggml-org/llama.cpp |
| pyskl (ST-GCN++/PoseConv3D, custom-video tutorial incl. diving48) | https://github.com/kennymckormick/pyskl |
| MMAction2 (broader video understanding) | https://github.com/open-mmlab/mmaction2 |
| Hackathon's own trailheads (ML Intern, Gradio guides) | https://github.com/huggingface/ml-intern |

> **Hackathon-specific gotcha already seen in the org:** another team's Space hit `libcudart.so.12` errors and had to swap llama.cpp for transformers + `spaces.GPU`. Plan for it: isolate the llama.cpp build (CPU-only or pinned-CUDA) and keep a transformers fallback. For the scoring head, a small hand-rolled ST-GCN may deploy more cleanly on a Space than the full MMAction2/pyskl stack, so prototype with pyskl and ship lean.
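
One way to keep that fallback cheap is to probe the backends at startup and pick the first one that actually imports. The sketch below assumes the llama.cpp Python bindings are importable as `llama_cpp` (the `llama-cpp-python` package); the function name and the injectable `importer` argument are illustrative.

```python
import importlib


def pick_llm_backend(preferred=("llama_cpp", "transformers"), importer=importlib.import_module):
    """Return the name of the first backend module that imports cleanly."""
    errors = {}
    for name in preferred:
        try:
            importer(name)
            return name
        except Exception as exc:
            # Deliberately broad: a missing package raises ImportError, while a
            # missing shared library (e.g. libcudart) can surface as OSError or
            # RuntimeError depending on how the binding loads it.
            errors[name] = repr(exc)
    raise RuntimeError(f"no usable LLM backend: {errors}")
```

Logging the collected errors when the fallback kicks in makes it obvious in the Space logs which backend failed and why.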

---

## 6. Two artifacts you probably haven't made yet

### 6.1 Data & capture protocol (highest-leverage non-code work)

With a tiny dataset, controlling *how* clips are captured beats any model tweak. Give the physio a one-pager:

- **Camera:** one fixed position, tripod, ~3 m back, lens at hip height, landscape, 1080p/30fps+. Same setup every session: this is what keeps the 3D reconstruction consistent and the longitudinal baseline meaningful.
- **Framing:** whole body in frame for the whole rep, including the dowel. Plain-ish background, even lighting, no backlight.
- **One athlete in frame** at scoring time (or note who to track). For bilateral tests, capture **both sides** and label each.
- **Label schema (CSV):** `clip_id, athlete_id, date, test_name, side(L/R/NA), score(0–3), pain(bool), compensation_notes(free text), camera_view, consent_on_file(bool)`.
- **One rep per clip** to start (simplest). If sessions are continuous, you'll need temporal segmentation first; flag it to the build agent at Phase 1.
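
The label schema above is small enough to validate on load. The sketch below parses the CSV, rejects out-of-range values, drops rows without consent, and summarizes one athlete's session. The column names follow the schema; the function names, the accepted boolean spellings, and the "lower side counts" rule for bilateral tests are assumptions to confirm against the rubric the physio uses.

```python
import csv
import io

COLUMNS = ["clip_id", "athlete_id", "date", "test_name", "side", "score",
           "pain", "compensation_notes", "camera_view", "consent_on_file"]


def _to_bool(text):
    value = text.strip().lower()
    if value in ("true", "1", "yes"):
        return True
    if value in ("false", "0", "no"):
        return False
    raise ValueError(f"not a boolean: {text!r}")


def load_labels(csv_text):
    """Parse and validate label rows; rows without consent on file are dropped."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames != COLUMNS:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    rows = []
    for raw in reader:
        row = dict(raw)
        row["score"] = int(raw["score"])
        row["pain"] = _to_bool(raw["pain"])
        row["consent_on_file"] = _to_bool(raw["consent_on_file"])
        if row["side"] not in ("L", "R", "NA"):
            raise ValueError(f"{row['clip_id']}: bad side {row['side']!r}")
        if not 0 <= row["score"] <= 3:
            raise ValueError(f"{row['clip_id']}: score out of range")
        if row["consent_on_file"]:
            rows.append(row)
    return rows


def summarize_session(rows):
    """Per-test score, asymmetric tests, and running total for ONE athlete session.

    Assumption: for bilateral tests the lower of the L/R scores counts.
    """
    by_test = {}
    for row in rows:
        by_test.setdefault(row["test_name"], {})[row["side"]] = row["score"]
    per_test = {test: min(sides.values()) for test, sides in by_test.items()}
    asymmetric = sorted(
        test for test, sides in by_test.items()
        if "L" in sides and "R" in sides and sides["L"] != sides["R"]
    )
    return per_test, asymmetric, sum(per_test.values())
```

The returned total only equals the 0–21 composite when all seven tests are present in the session; otherwise treat it as a partial sum.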

### 6.2 Evaluation plan

Define "good" before you train, given so few labels:

- **Primary:** Spearman ρ between predicted and physio scores (the AQA-standard metric), plus **exact-match** and **±1 accuracy** per test.
- **Clinical credibility:** **weighted Cohen's κ** and **ICC** of model-vs-physio, reported alongside the human inter-rater numbers from the JOSPT study, i.e. "how does FormScout compare to a second human rater?"
- **Asymmetry:** detection rate of L/R asymmetries the physio flagged (this is one of the FMS's most defensible outputs).
- **Validation:** leave-one-clip-out CV (you can't afford a held-out test split). Keep ≥1 clip the judge never sees for the demo.
- **Calibration:** report when the system says "low confidence / physio review" and show it's right to do so. A well-calibrated, humble tool reads as more trustworthy than a confident one.

---

## 7. Ethics, consent & data handling (EU / Slovakia)

You're filming identifiable athletes, possibly **minors** on a youth team. Treat the footage as biometric personal data under GDPR and handle it as a first-class concern, then say so in your submission (judges and physios both reward it):

- **Consent:** written consent from each athlete (and a parent/guardian for anyone under 18) before any footage is used. No consent → not in the dataset, not in the demo.
- **Data minimization & retention:** keep only what you need; don't persist raw clips on the Space beyond what's approved; document a retention/deletion policy. Prefer storing derived skeletons over raw video where possible.
- **Demo footage:** use a consenting adult (you, a teammate) for the public demo video rather than a minor athlete, even if you trained on team data privately.
- **Framing:** screening aid, not a medical device; pain/clearing tests always defer to the clinician; human-in-the-loop by design.

---

## 8. The transfer-learning recipe (ties it together)

1. **Backbone pretrain:** ST-GCN++ on a general skeleton-action set (NTU/Kinetics skeletons via pyskl) so it learns motion structure.
2. **Domain adapt:** continue on **KIMORE + UI-PRMD** (clinician-scored movement quality) so it learns *quality*, not just *what action*.
3. **Few-shot fine-tune:** **LoRA** on the physio's FMS clips with heavy augmentation (temporal jitter, **L↔R mirror** to double bilateral data, 3D camera-angle perturbation, joint noise). The SSL-Rehab paper (§2.3) is your blueprint and it's exactly your LoRA wheelhouse.
4. **Don't over-train the head:** let deterministic biomechanics carry the demo; the learned head and RAG are the refinement and the badges, not the foundation.
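
The augmentations named in step 3 can be sketched on plain Python lists before committing to a tensor library. The code below assumes each clip is a list of frames, each frame a list of `(x, y, z)` joints in COCO-17 order, with x as the left-right axis centered on the body midline and y as the vertical axis. Those layout assumptions, and all function names, are illustrative; adjust the joint pairs to whatever skeleton your 3D model actually emits.

```python
import math
import random

# Left/right index pairs in the COCO-17 keypoint order (assumed layout).
COCO17_LR_PAIRS = [(1, 2), (3, 4), (5, 6), (7, 8), (9, 10), (11, 12), (13, 14), (15, 16)]


def mirror_lr(sequence, pairs=COCO17_LR_PAIRS):
    """Mirror a clip left-to-right: negate x, then swap each left/right joint pair."""
    mirrored = []
    for frame in sequence:
        flipped = [(-x, y, z) for x, y, z in frame]
        for left, right in pairs:
            flipped[left], flipped[right] = flipped[right], flipped[left]
        mirrored.append(flipped)
    return mirrored


def mirror_side(side):
    """The clip's side label must flip together with the skeleton."""
    return {"L": "R", "R": "L"}.get(side, side)


def temporal_jitter(sequence, max_shift=2, rng=random):
    """Randomly trim up to max_shift frames from each end of the clip."""
    start = rng.randint(0, max_shift)
    end = len(sequence) - rng.randint(0, max_shift)
    return sequence[start:end] if end > start else list(sequence)


def rotate_about_vertical(sequence, degrees):
    """Rotate every joint about the y axis to mimic a small camera-angle change."""
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    return [[(c * x + s * z, y, -s * x + c * z) for x, y, z in frame] for frame in sequence]


def add_joint_noise(sequence, sigma=0.005, rng=random):
    """Add independent Gaussian noise to every coordinate."""
    return [
        [(x + rng.gauss(0, sigma), y + rng.gauss(0, sigma), z + rng.gauss(0, sigma))
         for x, y, z in frame]
        for frame in sequence
    ]
```

Passing a seeded `random.Random` instance as `rng` makes augmented batches reproducible, which helps when comparing runs on so few clips.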

---

## 9. Demo & submission storyboard (the "make it sing" 30%)

The submission needs a demo video + social post; "Show, Don't Tell" is a literal rule. A tight 60–90s cut:

1. **0–10s.** The problem: physio eyeballing a squat, scribbling a score. "Same player, two raters, two scores."
2. **10–35s.** Upload the clip to FormScout → skeleton overlay → 0–3 with the *deciding angle drawn on the frame* (`playback_position` jump). The "aha" shot.
3. **35–55s.** The scorecard: composite 0–21, the L/R asymmetry strip, a "low confidence → physio review" flag on a borderline case (honesty sells).
4. **55–75s.** The physio reacting / using it on a real player (the Backyard AI "they actually used it" proof).
5. **End card.** "Runs on a laptop. ~18B params. Screening aid, not a diagnosis." Link the Space, the published head, the agent trace, the blog.

Social post: lead with the overlay GIF + the asymmetry-detection angle; tag Gradio/HF; one line of honest framing.

---

*Built to give FormScout the best shot. The two things most teams underinvest in, the capture protocol (§6.1) and the honest, clinical-style evaluation (§6.2, §2.1), are exactly where this project can outclass flashier entries. Good luck.*