cn0303

Speed predictions with receipts: bandwidth roofline, real-runs chart, honest provenance

ee8ca43 verified about 15 hours ago

metadata

title: FitCheck
emoji: ✅
colorFrom: indigo
colorTo: green
sdk: gradio
sdk_version: 6.16.0
app_file: app.py
python_version: '3.12'
pinned: false
license: mit
short_description: Honest, plain answers about what AI your computer can run
models:
  - nvidia/NVIDIA-Nemotron-3-Nano-4B-BF16

FitCheck

What AI can your computer actually run? And the other way round: what computer do you need for the AI you want to run?

Tell FitCheck about your machine in plain words. It answers honestly — real models, real memory figures, real licenses, real copy-paste commands — from chatbots to object detection, image generation, speech, and robotics.

Why it's trustworthy

A deterministic engine does the math, not an AI. Verdicts come from a transparent rules engine over catalogue.json — 83 real models verified against the Hugging Face API. Nothing in the verdict can be hallucinated.
Model sizes are exact. For GGUF models the weights figure is the actual file size in bytes from the Hub — not a params-times-bits estimate. Chat memory uses each model's real architecture (GQA-aware), and every estimate includes a 0.58 GB safety buffer (the 95% load-success margin fitted from ~19,500 community measurements).
Provenance on every number. The UI says whether a figure is an exact file size, a vendor-published number, community-reported, or estimated.
Licenses up front. AGPL, non-commercial, and gated models are labelled on every card — before you build your project on one.
Speed estimates with receipts, not vibes. For LLMs, FitCheck predicts decode tokens/sec from your memory bandwidth (decode is bandwidth-bound) and shows where your machine lands among real community benchmark runs (LocalScore) on an interactive roofline chart. A learned predictor, following IBM's LLM-Pilot methodology (gradient boosting over hardware features, validated leave-one-accelerator-out), replaces the analytical estimate only if it beats that estimate on hardware it never saw; otherwise the labelled analytical baseline ships. Vision and diffusion models are compute-bound, not bandwidth-bound, so FitCheck gives them memory verdicts only rather than inventing speed numbers.
Conservative by design. Three plain bands (Runs great / Tight, but works / Won't fit) that would rather under-promise than over-promise.
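
The arithmetic behind the points above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code in engine/: the ModelSpec fields, the fp16 KV cache, the decimal-GB conversion, and the 0.80 "Runs great" cutoff are all hypothetical. Only the 0.58 GB buffer, the three band names, and the bandwidth-bound view of decode come from this README.

```python
from dataclasses import dataclass

SAFETY_BUFFER_GB = 0.58   # safety buffer quoted in this README
GREAT_HEADROOM = 0.80     # hypothetical cutoff, not FitCheck's real threshold


@dataclass
class ModelSpec:
    """Hypothetical record; field names are not catalogue.json's schema."""
    weights_bytes: int           # exact GGUF file size in bytes
    n_layers: int
    n_kv_heads: int              # GQA: usually fewer than the attention heads
    head_dim: int
    kv_bytes_per_value: int = 2  # assumes an fp16 KV cache


def kv_cache_bytes(m: ModelSpec, context_tokens: int) -> int:
    # Keys and values (the factor 2) for every layer, KV head, and token.
    return (2 * m.n_layers * m.n_kv_heads * m.head_dim
            * m.kv_bytes_per_value * context_tokens)


def memory_needed_gb(m: ModelSpec, context_tokens: int) -> float:
    # Weights + KV cache, converted to decimal GB, plus the safety buffer.
    total_bytes = m.weights_bytes + kv_cache_bytes(m, context_tokens)
    return total_bytes / 1e9 + SAFETY_BUFFER_GB


def band(needed_gb: float, available_gb: float) -> str:
    # Conservative banding: only call it great when there is real headroom.
    if needed_gb <= GREAT_HEADROOM * available_gb:
        return "Runs great"
    if needed_gb <= available_gb:
        return "Tight, but works"
    return "Won't fit"


def decode_tokens_per_sec_ceiling(m: ModelSpec, bandwidth_gb_s: float) -> float:
    # Roofline ceiling: each generated token streams all weights once,
    # so throughput cannot exceed bandwidth divided by the weights size.
    return bandwidth_gb_s * 1e9 / m.weights_bytes


example = ModelSpec(weights_bytes=4_000_000_000, n_layers=32,
                    n_kv_heads=8, head_dim=128)
needed = memory_needed_gb(example, context_tokens=4096)
print(f"{needed:.2f} GB needed -> {band(needed, available_gb=8.0)}")
print(f"<= {decode_tokens_per_sec_ceiling(example, bandwidth_gb_s=100.0):.0f} tokens/sec")
```

For this made-up 4 GB model with a 4096-token context, the sketch arrives at about 5.12 GB and a ceiling of 25 tokens/sec on a 100 GB/s machine. Real runs land below the ceiling, which is why plotting it against measured benchmark runs is useful.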

What's inside

The catalogue — scripts/curation.json (hand-picked models across LLM, vision-language, vision, image/video generation, speech, music, embeddings, forecasting) enriched by scripts/refresh_catalogue.py from public Hub endpoints into catalogue.json. Refreshed nightly; baked in at build time so the running app is fully offline.
The engine (engine/) — pure Python memory math and honest banding. Also answers the reverse question: minimum vs comfortable hardware tiers for a goal ("Help me pick one" mode).
The model brick (model_brick.py) — NVIDIA Nemotron 3 Nano 4B running in-Space on ZeroGPU (hybrid Mamba-2, accelerated by prebuilt hub kernels), explaining the engine's numbers in plain words. It never does the math; if it states a figure that isn't in the engine's facts, the gate logs it.
The frontend (static/) — hand-built HTML/CSS/JS, no framework, served by Gradio server mode (gr.Server). Optional extra: paste any Hugging Face model id and FitCheck walks its finetune/quantized lineage to a known base ("if the base runs, your finetune runs") — the one clearly-labelled online feature.
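
The gate described for the model brick could look something like the sketch below. It is a guess at the idea, not the code in model_brick.py: the function name, the logger name, the facts dictionary, and the 0.01 tolerance are all assumptions made for illustration.

```python
import logging
import re

log = logging.getLogger("fitcheck.gate")  # hypothetical logger name

NUMBER = re.compile(r"\d+(?:\.\d+)?")


def ungrounded_figures(explanation: str, facts: dict, tol: float = 0.01) -> list:
    """Return the numbers in `explanation` that match no value in `facts`."""
    allowed = [float(v) for v in facts.values()]
    flagged = []
    for token in NUMBER.findall(explanation):
        value = float(token)
        # A figure is grounded only if the engine produced (nearly) the same value.
        if not any(abs(value - a) <= tol for a in allowed):
            flagged.append(token)
            log.warning("explainer stated %s, which is not in the engine's facts", token)
    return flagged


facts = {"weights_gb": 2.5, "buffer_gb": 0.58, "total_gb": 3.08}
text = "The weights take 2.5 GB plus a 0.58 GB buffer, about 3.1 GB in total."
print(ungrounded_figures(text, facts))
```

Here the rounded "3.1" is flagged because the engine's total is 3.08. A plain regex like this would also flag digits inside model names, so a real gate would need to be more selective about what counts as a figure.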

Run it locally

python -m venv .venv
.venv\Scripts\activate
pip install -r requirements.txt
python app.py

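Open http://127.0.0.1:7860/ (add ?go for an instant sample result). The activate command shown is for Windows; on macOS or Linux, run source .venv/bin/activate instead. Locally the explainer reports that the model isn't loaded (it only loads on the Space); everything else works fully offline.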

Built for the Build Small hackathon (Backyard AI track).