| title: SmolLM2 360M Instruct | |
| emoji: ๐ | |
| colorFrom: yellow | |
| colorTo: green | |
| sdk: gradio | |
| sdk_version: 6.9.0 | |
| app_file: app.py | |
| pinned: false | |
| license: apache-2.0 | |
| short_description: 'SmolLM2-360M-Instruct ' | |
| # SmolLM2 360M Instruct Demo | |
| This Space demonstrates the SmolLM2-360M-Instruct model with a CPU fallback mechanism. It is designed to run efficiently even on the Hugging Face Free Tier (2 vCPUs). | |
| ## Overview | |
| A minimal but production-ready LLM service with the following characteristics: | |
| * **Model:** SmolLM2-360M-Instruct (approx. 269 MB, Apache 2.0). | |
| * **Efficiency:** Runs on 2 vCPUs with a minimum of 2 GB RAM (the Hugging Face free tier provides up to 16 GB). | |
| * **Scalability:** Small enough for local training and testing. | |
| ## Related Project: SmolLM2-customs | |
| If you are interested in training small LLMs the lazy way, check out: | |
| [https://github.com/VolkanSah/SmolLM2-customs](https://github.com/VolkanSah/SmolLM2-customs) | |
| **Features of the custom implementation:** | |
| * **FastAPI:** OpenAI-compatible `/v1/chat/completions` endpoint. | |
| * **ADI (Anti-Dump Index):** Filters low-quality requests before they hit the model. | |
| * **HF Dataset Integration:** Logs every request for later analysis and fine-tuning. | |
| --- | |
| ## Deployment & Usage | |
| You do not need an API key for this public demo, but rate limits apply. | |
| ### How to run your own instance: | |
| 1. **Duplicate/Clone** this Space. | |
| 2. **Environment Variables:** To access gated models or private weights, add your Hugging Face access token to the Space's **Secrets** under one of the following names: | |
| * `HF_TOKEN` | |
| * `TEST_TOKEN` | |
| * `HUGGINGFACE_TOKEN` | |
| * `HF_API_TOKEN` | |
| The code checks all of these names when resolving the token, so older or custom key names keep working. | |
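The token lookup described above could be sketched roughly as follows. This is a minimal illustration, not the Space's actual code: the function name `resolve_token` and the order in which the names are checked are assumptions; only the four secret names come from the list above.

```python
import os

# Secret names from the list above. The precedence order is an assumption.
TOKEN_ENV_VARS = ("HF_TOKEN", "TEST_TOKEN", "HUGGINGFACE_TOKEN", "HF_API_TOKEN")


def resolve_token(env=None):
    """Return the first non-empty token found, or None if no secret is set.

    `env` defaults to os.environ; passing a plain dict makes testing easy.
    """
    env = os.environ if env is None else env
    for name in TOKEN_ENV_VARS:
        value = env.get(name, "").strip()
        if value:
            return value
    # No secret configured: callers can fall back to anonymous access.
    return None
```

Returning `None` instead of raising lets the public demo keep working without any secret configured.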
| ## Technical Details | |
| The inference pipeline uses `transformers` with `torch`. It automatically detects whether a GPU is available; if not, it falls back to CPU execution without breaking the Gradio interface. | |
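A device check of this kind might look like the sketch below. The helper name `pick_device` is hypothetical, and the `ImportError` guard is an addition for illustration; the Space's own `app.py` may structure this differently.

```python
def pick_device():
    """Return "cuda" when a GPU is usable, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Assumption for this sketch: without torch installed, report CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Running inference on: {device}")
```

The resulting string can then be passed to `model.to(device)` after loading the model, so the same code path serves both GPU and CPU hardware.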