Spaces:

my-ai-stack
/

Stack-X-Ultimate-Inference

Running

App Files Files Community

Walid Sobhi commited on 3 days ago

Commit

bf0e3a6

verified ·

1 Parent(s): 50d4214

Add Space README

Browse files

Files changed (1) hide show

README.md +113 -6

README.md CHANGED Viewed

@@ -1,12 +1,119 @@
 ---
-title: Stack X Ultimate Inference
-emoji: 🐠
-colorFrom: pink
-colorTo: indigo
 sdk: gradio
-sdk_version: 6.13.0
 app_file: app.py
 pinned: false
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

 ---
+title: Stack X Ultimate
+emoji: 🤖
+colorFrom: purple
+colorTo: blue
 sdk: gradio
+sdk_version: 5.16.0
 app_file: app.py
 pinned: false
+description: Open-source agentic model with tool calling. Deploy on your own GPU — no API costs.
+tags:
+- agentic
+- tool-calling
+- llm
+- qwen
+- open-source
+- local-ai
+license: apache-2.0
 ---
+# Stack X Ultimate — Agentic Tool-Calling Model
+<div align="center">
+**Open-source agentic model that calls real tools. Deploy on your GPU — no API key required.**
+[![Model](https://img.shields.io/badge/Model-Qwen2.5--Coder--3B--Instruct-blue)](https://huggingface.co/Qwen/Qwen2.5-Coder-3B-Instruct)
+[![Quantization](https://img.shields.io/badge/Quantization-QLoRA%204--bit-purple)](https://github.com/QwenLM/Qwen2.5-Coder)
+[![License](https://img.shields.io/badge/License-Apache%202.0-green)](LICENSE)
+</div>
+---
+## What It Does
+Stack X Ultimate is a fine-tuned Qwen2.5-Coder-3B-Instruct optimized for **agentic tool-calling workflows**:
+- 🔢 **Calculator** — evaluates mathematical expressions
+- 🕐 **Current Time** — returns live UTC timestamp
+- 📁 **File Search** — glob-based file discovery in directory trees
+- ⚡ **Command Execution** — runs shell commands and returns structured output
+The model decides *when* to call tools and *how* to interpret results — mirroring how GPT-4 and Claude handle function calling, but running entirely on your infrastructure.
+---
+## Quick Start
+### Try in Browser
+Use the chat interface above — try the pre-loaded examples or type your own query.
+### Run Locally
+```bash
+# Clone the model
+git lfs install
+git clone https://huggingface.co/my-ai-stack/Stack-X-Ultimate
+# Run with llama.cpp
+./llama.cpp -m ./Stack-X-Ultimate/unsloth.Q4_K_M.gguf \
+  -n 8192 \
+  --ctx-size 8192 \
+  -p "You are a helpful AI assistant with tool calling."
+# Or with transformers
+python3 -c "
+from transformers import AutoModelForCausalLM, AutoTokenizer
+model = AutoModelForCausalLM.from_pretrained('my-ai-stack/Stack-X-Ultimate')
+tok = AutoTokenizer.from_pretrained('my-ai-stack/Stack-X-Ultimate')
+print('Model loaded!')
+"
+```
+---
+## Architecture
+| Component | Detail |
+|---|---|
+| Base Model | Qwen/Qwen2.5-Coder-3B-Instruct |
+| Fine-tune Method | QLoRA 4-bit (r=32, all 7 target modules) |
+| Training Data | 27,000+ agentic tool-call examples |
+| Sequence Length | 8,192 tokens |
+| VRAM Required | ~6 GB (Q4_K_M GGUF) |
+| Min. Hardware | Single V100 16GB or equivalent |
+---
+## Available Tools
+| Tool | Description | Example |
+|---|---|---|
+| `calculator` | Evaluate math expressions | `1500 * 0.07 * 30` |
+| `get_current_time` | Return current UTC time | — |
+| `search_files` | Glob-based file search | `*.py` in `./src` |
+| `run_command` | Execute shell commands | `git status` |
+---
+## Enterprise Deployment
+Need tool integrations specific to your stack? Want to deploy inside your VPC?
+📬 **[Contact Stack AI →](https://www.stack-ai.me/contact)**
+- Custom tool integrations (APIs, databases, internal systems)
+- VPC-isolated deployment (AWS / GCP / Azure)
+- Air-gapped / on-prem installation
+- Custom LoRA fine-tuning on your data
+---
+## License
+Apache 2.0 — free for commercial and personal use.
+---
+*Stack AI — Sovereign AI Infrastructure. [huggingface.co/my-ai-stack](https://huggingface.co/my-ai-stack) · [stack-ai.me](https://www.stack-ai.me)*