---
title: FitCheck
emoji: ✅
colorFrom: indigo
colorTo: green
sdk: gradio
sdk_version: 6.16.0
app_file: app.py
python_version: "3.12"
pinned: false
license: mit
short_description: Honest, plain answers about what AI your computer can run
models:
  - nvidia/NVIDIA-Nemotron-3-Nano-4B-BF16
---

<!--
ZeroGPU is selected in the Space's Settings (the README can't set it). The
model brick (/api/ask) only loads the LLM when SPACES_ZERO_GPU is set, so
local `python app.py` stays instant.
-->

# FitCheck

**What AI can your computer actually run?** And the other way round: **what
computer do you need for the AI you want to run?**

Tell FitCheck about your machine in plain words. It answers honestly — real
models, real memory figures, real licenses, real copy-paste commands — from
chatbots to object detection, image generation, speech, and robotics.

## Why it's trustworthy

- **A deterministic engine does the math, not an AI.** Verdicts come from a
  transparent rules engine over `catalogue.json` — 83 real models verified
  against the Hugging Face API. Nothing in the verdict can be hallucinated.
- **Model sizes are exact.** For GGUF models the weights figure is the actual
  file size in bytes from the Hub — not a params-times-bits estimate. Chat
  memory uses each model's real architecture (GQA-aware), and every estimate
  includes a 0.58 GB safety buffer (the 95% load-success margin fitted from
  ~19,500 community measurements).
- **Provenance on every number.** The UI says whether a figure is an exact
  file size, a vendor-published number, community-reported, or estimated.
- **Licenses up front.** AGPL, non-commercial, and gated models are labelled
  on every card — before you build your project on one.
- **Speed estimates with receipts, not vibes.** For LLMs, FitCheck predicts
  decode tokens/sec from your memory bandwidth (decode is bandwidth-bound) and
  shows where your machine lands among **real community benchmark runs**
  ([LocalScore](https://www.localscore.ai)) on an interactive roofline chart.
  A learned predictor — following IBM's
  [LLM-Pilot methodology](https://arxiv.org/abs/2410.02425) (gradient boosting
  over hardware features, validated leave-one-accelerator-out) — replaces the
  analytical estimate **only if it beats it on hardware it never saw**;
  otherwise the labelled baseline ships. Vision and diffusion models are
  compute-bound, not bandwidth-bound, so they get memory verdicts only
  rather than made-up speed numbers.
- **Conservative by design.** Three plain bands (Runs great / Tight, but works
  / Won't fit) that would rather under-promise than over-promise.
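
The memory and speed points above come down to a few lines of arithmetic. The block below is a minimal sketch of that arithmetic, not FitCheck's engine code. The `ModelSpec` fields, the decimal-gigabyte convention, the 16-bit KV cache, the 80% cut-off for "Runs great", and the `efficiency` factor are all assumptions made for illustration. Only the 0.58 GB buffer and the three band names come from this README.

```python
from dataclasses import dataclass

SAFETY_BUFFER_GB = 0.58  # from the README: the 95% load-success margin
GB = 1e9                 # assumption: decimal gigabytes


@dataclass
class ModelSpec:
    """Hypothetical record; these field names are not catalogue.json's schema."""
    weights_bytes: int  # exact file size in bytes (e.g. a GGUF file on the Hub)
    n_layers: int
    n_kv_heads: int     # under GQA this is smaller than the query-head count
    head_dim: int


def kv_cache_bytes(m: ModelSpec, context_tokens: int, bytes_per_elem: int = 2) -> int:
    # One K and one V tensor per layer, sized by KV heads rather than query
    # heads; that is what makes the estimate GQA-aware.
    return 2 * m.n_layers * m.n_kv_heads * m.head_dim * context_tokens * bytes_per_elem


def estimate_memory_gb(m: ModelSpec, context_tokens: int) -> float:
    # Weights + KV cache + fixed safety buffer.
    return (m.weights_bytes + kv_cache_bytes(m, context_tokens)) / GB + SAFETY_BUFFER_GB


def band(needed_gb: float, available_gb: float, comfortable_fraction: float = 0.8) -> str:
    # Assumed thresholds: over capacity fails, under 80% of capacity is comfortable.
    if needed_gb > available_gb:
        return "Won't fit"
    if needed_gb <= comfortable_fraction * available_gb:
        return "Runs great"
    return "Tight, but works"


def decode_tokens_per_sec(weights_bytes: int, bandwidth_gb_s: float,
                          efficiency: float = 1.0) -> float:
    # Each decoded token streams the weights from memory once, so memory
    # bandwidth divided by weight size bounds decode speed from above.
    return efficiency * bandwidth_gb_s * GB / weights_bytes


if __name__ == "__main__":
    spec = ModelSpec(weights_bytes=4_000_000_000, n_layers=32, n_kv_heads=8, head_dim=128)
    needed = estimate_memory_gb(spec, context_tokens=4096)
    print(f"{needed:.2f} GB needed")                          # 5.12 GB needed
    print(band(needed, available_gb=8.0))                     # Runs great
    print(decode_tokens_per_sec(spec.weights_bytes, 100.0))   # 25.0
```

With `efficiency=1.0` the speed figure is a ceiling rather than a prediction. Real runs land below it, which is why the README pairs the estimate with measured community benchmarks.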

## What's inside

1. **The catalogue** — `scripts/curation.json` (hand-picked models across
   LLM, vision-language, vision, image/video generation, speech, music,
   embeddings, forecasting), which `scripts/refresh_catalogue.py` enriches
   from public Hub endpoints and writes to `catalogue.json`. The catalogue is
   refreshed nightly and baked in at build time, so the running app is fully
   offline.
2. **The engine** (`engine/`) — pure Python memory math and honest banding.
   Also answers the reverse question: minimum vs comfortable hardware tiers
   for a goal ("Help me pick one" mode).
3. **The model brick** (`model_brick.py`) — NVIDIA Nemotron 3 Nano 4B running
   in-Space on ZeroGPU (hybrid Mamba-2, accelerated by prebuilt hub kernels),
   explaining the engine's numbers in plain words. It never does the math; if
   it states a figure that isn't in the engine's facts, the gate logs it.
4. **The frontend** (`static/`) — hand-built HTML/CSS/JS, no framework, served
   by Gradio server mode (`gr.Server`). Optional extra: paste any Hugging Face
   model id and FitCheck walks its finetune/quantized lineage to a known base
   ("if the base runs, your finetune runs") — the one clearly-labelled online
   feature.
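
The gate described in item 3 can be pictured as a check that every number in the explainer's text also appears in the engine's facts. The sketch below is one hypothetical way to do that, not the code in `model_brick.py`. The function name, the logger name, the shape of `engine_facts`, and the tolerance are assumptions.

```python
import logging
import re

logger = logging.getLogger("fitcheck.gate")  # hypothetical logger name

# Matches plain integers and decimals such as "8" or "5.12".
NUMBER_RE = re.compile(r"\d+(?:\.\d+)?")


def unsupported_figures(explanation: str, engine_facts: dict[str, float],
                        tol: float = 0.01) -> list[float]:
    """Return numbers stated in the explanation that match no engine fact."""
    allowed = list(engine_facts.values())
    stated = [float(token) for token in NUMBER_RE.findall(explanation)]
    extras = [x for x in stated if not any(abs(x - a) <= tol for a in allowed)]
    for x in extras:
        # Log rather than block, mirroring "the gate logs it".
        logger.warning("Explanation states %s, which is not in the engine's facts", x)
    return extras


if __name__ == "__main__":
    facts = {"needed_gb": 5.12, "available_gb": 8.0}
    text = "It needs 5.12 GB and should reach 40 tokens per second."
    print(unsupported_figures(text, facts))  # [40.0]
```

A plain regex like this also picks up digits inside model names (the "4" in "4B", for example), so a real gate would need to whitelist or strip those first.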

## Run it locally

```
python -m venv .venv
.venv\Scripts\activate
pip install -r requirements.txt
python app.py
```

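Open http://127.0.0.1:7860/ (add `?go` for an instant sample result). The
activation line above is the Windows form; on macOS or Linux, run
`source .venv/bin/activate` instead. Run locally, the explainer reports that
the model isn't loaded, because it only loads on the Space. Everything else
works fully offline, apart from the optional model-id lineage lookup.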

Built for the [Build Small hackathon](https://huggingface.co/build-small-hackathon)
(Backyard AI track).