---
title: Stack X Ultimate
emoji: 🔥
colorFrom: purple
colorTo: blue
sdk: gradio
sdk_version: 6.13.0
app_file: app.py
pinned: false
description: >-
  Open-source agentic model with tool calling. Deploy on your own GPU with no
  API costs.
tags:
  - agentic
  - tool-calling
  - llm
  - qwen
  - open-source
  - local-ai
license: apache-2.0
---
# Stack X Ultimate – Agentic Tool-Calling Model

<div align="center">

**Open-source agentic model that calls real tools. Deploy on your GPU – no API key required.**

[Base model: Qwen2.5-Coder-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-3B-Instruct)
[Qwen2.5-Coder on GitHub](https://github.com/QwenLM/Qwen2.5-Coder)
[License: Apache 2.0](LICENSE)

</div>

---

## What It Does

Stack X Ultimate is a fine-tune of Qwen2.5-Coder-3B-Instruct optimized for **agentic tool-calling workflows**:

- 🔢 **Calculator** – evaluates mathematical expressions
- 🕐 **Current Time** – returns the live UTC timestamp
- 📁 **File Search** – glob-based file discovery in directory trees
- ⚡ **Command Execution** – runs shell commands and returns structured output

The model decides *when* to call tools and *how* to interpret their results, mirroring how GPT-4 and Claude handle function calling, but running entirely on your own infrastructure.
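
Under the hood, each tool-calling turn is a parse–execute–respond loop: the model emits a structured call, the host runs the tool, and the result is fed back as a new message. The sketch below illustrates the host-side parsing step; it assumes the Qwen2.5-style convention of wrapping JSON calls in `<tool_call>` tags, so verify the exact syntax against the model's chat template before relying on it.

```python
import json
import re

# Hypothetical parser for Qwen2.5-style tool calls. The <tool_call> tag
# format is an assumption based on the Qwen2.5 chat template.
def parse_tool_calls(text: str) -> list[dict]:
    """Extract {"name": ..., "arguments": ...} payloads from model output."""
    matches = re.findall(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", text, re.DOTALL)
    return [json.loads(m) for m in matches]

reply = '<tool_call>{"name": "calculator", "arguments": {"expression": "1500 * 0.07 * 30"}}</tool_call>'
for call in parse_tool_calls(reply):
    print(call["name"], call["arguments"])
```

The host would then dispatch `call["name"]` to the matching tool function and append the result to the conversation as a tool message.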

---

## Quick Start

### Try in Browser
Use the chat interface above – try the pre-loaded examples or type your own query.

### Run Locally

```bash
# Clone the model
git lfs install
git clone https://huggingface.co/my-ai-stack/Stack-X-Ultimate

# Run with llama.cpp (the CLI binary is named llama-cli in current builds)
./llama-cli -m ./Stack-X-Ultimate/unsloth.Q4_K_M.gguf \
  -n 8192 \
  --ctx-size 8192 \
  -p "You are a helpful AI assistant with tool calling."

# Or with transformers
python3 -c "
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained('my-ai-stack/Stack-X-Ultimate')
tok = AutoTokenizer.from_pretrained('my-ai-stack/Stack-X-Ultimate')
print('Model loaded!')
"
```

---

## Architecture

| Component | Detail |
|---|---|
| Base Model | Qwen/Qwen2.5-Coder-3B-Instruct |
| Fine-tune Method | QLoRA 4-bit (r=32, all 7 target modules) |
| Training Data | 27,000+ agentic tool-call examples |
| Sequence Length | 8,192 tokens |
| VRAM Required | ~6 GB (Q4_K_M GGUF) |
| Min. Hardware | Single V100 16 GB or equivalent |
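
The table above can be summarized as a configuration sketch. Note the specific module names are an assumption: the table says "all 7 target modules" without naming them, and the seven standard Qwen2.5 linear layers (attention plus MLP projections) are the usual choice.

```python
# Sketch of the QLoRA setup implied by the architecture table.
# target_modules are assumed, not confirmed by this README.
qlora_config = {
    "r": 32,                 # LoRA rank, per the table
    "load_in_4bit": True,    # QLoRA: 4-bit quantized base weights
    "max_seq_length": 8192,  # matches the training sequence length
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention projections
        "gate_proj", "up_proj", "down_proj",     # MLP projections
    ],
}
print(len(qlora_config["target_modules"]))
```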

---

## Available Tools

| Tool | Description | Example |
|---|---|---|
| `calculator` | Evaluate math expressions | `1500 * 0.07 * 30` |
| `get_current_time` | Return current UTC time | – |
| `search_files` | Glob-based file search | `*.py` in `./src` |
| `run_command` | Execute shell commands | `git status` |
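
For illustration, the four tools could be implemented host-side roughly as below. This is a sketch, not the Space's actual implementation: the AST-based safe calculator (instead of raw `eval()`) and the 30-second command timeout are design assumptions.

```python
import ast
import datetime
import glob
import operator
import os
import subprocess

# Hypothetical implementations of the four tools listed above.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv,
        ast.Pow: operator.pow, ast.USub: operator.neg}

def calculator(expression: str) -> float:
    """Evaluate arithmetic safely by walking the AST (no arbitrary code)."""
    def ev(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp):
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.UnaryOp):
            return _OPS[type(node.op)](ev(node.operand))
        raise ValueError("unsupported expression")
    return ev(ast.parse(expression, mode="eval").body)

def get_current_time() -> str:
    """Return the current UTC timestamp in ISO 8601 form."""
    return datetime.datetime.now(datetime.timezone.utc).isoformat()

def search_files(pattern: str, root: str = ".") -> list[str]:
    """Recursive glob search under root, e.g. search_files('*.py', './src')."""
    return glob.glob(os.path.join(root, "**", pattern), recursive=True)

def run_command(command: str) -> dict:
    """Run a shell command and return structured output."""
    proc = subprocess.run(command, shell=True, capture_output=True,
                          text=True, timeout=30)
    return {"exit_code": proc.returncode, "stdout": proc.stdout, "stderr": proc.stderr}

# Dispatch table mapping tool names (as emitted by the model) to functions.
TOOLS = {"calculator": calculator, "get_current_time": get_current_time,
         "search_files": search_files, "run_command": run_command}

print(TOOLS["calculator"]("1500 * 0.07 * 30"))
```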

---

## Enterprise Deployment

Need tool integrations specific to your stack? Want to deploy inside your VPC?

💬 **[Contact Stack AI →](https://www.stack-ai.me/contact)**

- Custom tool integrations (APIs, databases, internal systems)
- VPC-isolated deployment (AWS / GCP / Azure)
- Air-gapped / on-prem installation
- Custom LoRA fine-tuning on your data

---

## License

Apache 2.0 – free for commercial and personal use.

---

*Stack AI – Sovereign AI Infrastructure. [huggingface.co/my-ai-stack](https://huggingface.co/my-ai-stack) · [stack-ai.me](https://www.stack-ai.me)*