---
title: CurvOpt SmarterModels
emoji: 📊
colorFrom: red
colorTo: red
sdk: gradio
sdk_version: 6.6.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: Smarter Models, Smaller Footprint
---
# CurvOpt-LLM — Realtime Optimizer

**Curvature-guided mixed-precision optimization for LLMs. No retraining required.**

## What This Does
- Loads any Hugging Face causal LM
- Computes the diagonal Fisher information (curvature) per layer from real gradients
- Assigns FP32, FP16, or BF16 to each layer based on its sensitivity
- Rewrites and saves a deployable optimized model (downloadable ZIP)
- Reports electricity, CO₂, and water footprint savings

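The curvature and precision-assignment steps listed above can be sketched roughly as follows. This is a minimal pure-Python illustration, not the Space's actual code: the per-layer score (the mean squared gradient over calibration samples, an empirical diagonal Fisher estimate averaged within each layer), the two thresholds, the mapping of sensitivity bands to FP32 / BF16 / FP16, and the helper names `layer_fisher_scores` and `assign_precision` are all assumptions made for the example. The gradient values are toy numbers, not measurements.

```python
from statistics import fmean


def layer_fisher_scores(per_sample_grads):
    """Empirical diagonal Fisher estimate, averaged per layer.

    per_sample_grads: list with one entry per calibration sample; each entry
    maps a layer name to a flat list of gradients for that layer.
    Returns a dict mapping layer name -> mean of squared gradients.
    """
    scores = {}
    for name in per_sample_grads[0]:
        squared = [g * g for sample in per_sample_grads for g in sample[name]]
        scores[name] = fmean(squared)
    return scores


def assign_precision(scores, high=1e-2, low=1e-4):
    """Map each layer's score to a dtype label.

    Thresholds are hypothetical: most sensitive layers stay in fp32,
    the middle band gets bf16, and the least sensitive layers get fp16.
    """
    plan = {}
    for name, score in scores.items():
        if score >= high:
            plan[name] = "fp32"
        elif score >= low:
            plan[name] = "bf16"
        else:
            plan[name] = "fp16"
    return plan


# Toy gradients for two calibration samples and three layers (illustrative only).
toy_grads = [
    {"embed": [0.5, -0.5], "mlp": [0.03, 0.01], "head": [0.001, 0.002]},
    {"embed": [0.3, 0.1], "mlp": [0.02, -0.02], "head": [0.0, 0.001]},
]
scores = layer_fisher_scores(toy_grads)
plan = assign_precision(scores)
```

In a real run the gradients would come from backpropagating the language-modeling loss on the calibration samples; here they are hard-coded so the sketch stays self-contained.
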
## How to Use
1. Select a model from the dropdown (or enter a custom Hugging Face model ID)
2. Set the number of calibration samples (1–32) and the perplexity (PPL) tolerance
3. Click **Run Optimization**
4. Download the optimized model ZIP when done

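To make the PPL tolerance setting concrete, the sketch below shows one way such a check could work. It assumes the tolerance is a maximum relative increase in perplexity, expressed in percent; the README does not state how the Space interprets it, and the function names `perplexity` and `within_tolerance` are hypothetical.

```python
import math


def perplexity(token_nlls):
    """Perplexity = exp(mean negative log-likelihood per token), NLL in nats."""
    return math.exp(sum(token_nlls) / len(token_nlls))


def within_tolerance(base_ppl, opt_ppl, tolerance_pct):
    """Return True if the optimized model's perplexity rose by at most
    tolerance_pct percent over the baseline (assumed meaning of the setting)."""
    return opt_ppl <= base_ppl * (1.0 + tolerance_pct / 100.0)


# Example: a baseline perplexity of 10.0 with a 5% tolerance accepts 10.4
# and rejects 10.6.
accepted = within_tolerance(10.0, 10.4, 5.0)
rejected = within_tolerance(10.0, 10.6, 5.0)
```

Under this reading, a tighter tolerance would leave more layers in FP32, while a looser one allows more layers to drop to 16-bit formats.
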
## Supported Models
OPT family · GPT-2 family · Pythia · Phi · BLOOM · Mistral · Llama-2 · Qwen · Falcon · and any `AutoModelForCausalLM`-compatible model.

## Research
Based on Fisher information and Optimal Brain Damage (OBD) curvature analysis.
Novel contribution: per-request curvature-gated mixed precision with user intent feedback.

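For reference, the standard quantities behind this kind of analysis can be written as below. The diagonal Fisher entry and the OBD saliency are textbook definitions; using the Fisher diagonal in place of the Hessian diagonal, and summing saliencies into a per-layer score $S_\ell$, are assumptions about how the two could be combined, not a statement of what this Space computes.

```latex
% Diagonal Fisher information for parameter \theta_k
F_{kk} = \mathbb{E}_{x,\; y \sim p_\theta(\cdot \mid x)}
         \left[ \left( \frac{\partial \log p_\theta(y \mid x)}{\partial \theta_k} \right)^{2} \right]

% Optimal Brain Damage saliency, with h_{kk} the diagonal of the loss Hessian
s_k = \tfrac{1}{2}\, h_{kk}\, \theta_k^{2}

% Assumed approximation and per-layer aggregate (illustrative)
h_{kk} \approx F_{kk}, \qquad
S_\ell = \sum_{k \in \ell} \tfrac{1}{2}\, F_{kk}\, \theta_k^{2}
```

Layers with a larger $S_\ell$ are more sensitive to perturbation and are therefore the natural candidates to keep at higher precision.
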
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference