	---
	title: Stack X Ultimate
	emoji: 🔥
	colorFrom: purple
	colorTo: blue
	sdk: gradio
	sdk_version: 6.13.0
	app_file: app.py
	pinned: false
	description: >-
	  Open-source agentic model with tool calling. Deploy on your own GPU with no
	  API costs.
	tags:
	- agentic
	- tool-calling
	- llm
	- qwen
	- open-source
	- local-ai
	license: apache-2.0
	---
	# Stack X Ultimate — Agentic Tool-Calling Model

	<div align="center">

	Open-source agentic model that calls real tools. Deploy on your GPU — no API key required.

	[![Model](https://img.shields.io/badge/Model-Qwen2.5--Coder--3B--Instruct-blue)](https://huggingface.co/Qwen/Qwen2.5-Coder-3B-Instruct)
	[![Quantization](https://img.shields.io/badge/Quantization-QLoRA%204--bit-purple)](https://github.com/QwenLM/Qwen2.5-Coder)
	[![License](https://img.shields.io/badge/License-Apache%202.0-green)](LICENSE)

	</div>

	---

	## What It Does

	Stack X Ultimate is a fine-tuned Qwen2.5-Coder-3B-Instruct optimized for agentic tool-calling workflows:

	- 🔢 Calculator — evaluates mathematical expressions
	- 🕐 Current Time — returns live UTC timestamp
	- 📁 File Search — glob-based file discovery in directory trees
	- ⚡ Command Execution — runs shell commands and returns structured output

	The model decides when to call a tool and how to interpret the result. This is the same function-calling pattern that GPT-4 and Claude use, but it runs entirely on your own infrastructure.
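
The host-side half of that loop can be sketched in a few lines of Python. This is a minimal sketch, not the Space's actual `app.py`: it assumes the model emits each call as JSON inside `<tool_call>...</tool_call>` tags with `name` and `arguments` keys (the README does not specify the wire format), and `extract_tool_calls` and `run_tool_calls` are hypothetical helper names.

```python
import json
import re

# ASSUMPTION: calls arrive as <tool_call>{"name": ..., "arguments": {...}}</tool_call>.
# Adjust the pattern to whatever format the fine-tuned model really produces.
TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def extract_tool_calls(model_output: str) -> list:
    """Return every well-formed tool call found in the model's text output."""
    calls = []
    for match in TOOL_CALL_RE.finditer(model_output):
        try:
            payload = json.loads(match.group(1).strip())
        except json.JSONDecodeError:
            continue  # skip malformed JSON instead of crashing the chat loop
        if isinstance(payload, dict) and "name" in payload:
            calls.append(payload)
    return calls


def run_tool_calls(model_output: str, tools: dict) -> list:
    """Dispatch each extracted call to the matching Python callable."""
    results = []
    for call in extract_tool_calls(model_output):
        name = call["name"]
        args = call.get("arguments") or {}
        fn = tools.get(name)
        if fn is None:
            results.append({"name": name, "error": "unknown tool"})
            continue
        try:
            results.append({"name": name, "result": fn(**args)})
        except Exception as exc:  # report the failure back to the model
            results.append({"name": name, "error": str(exc)})
    return results
```

The returned list would typically be serialized and appended to the conversation as a tool message so the model can read the results and write its final answer.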

	---

	## Quick Start

	### Try in Browser
	Use the chat interface above — try the pre-loaded examples or type your own query.

	### Run Locally

	```bash
	# Clone the model
	git lfs install
	git clone https://huggingface.co/my-ai-stack/Stack-X-Ultimate

	# Run with llama.cpp (the CLI binary is `llama-cli`; older builds call it `main`)
	./llama-cli -m ./Stack-X-Ultimate/unsloth.Q4_K_M.gguf \
	-n 8192 \
	--ctx-size 8192 \
	-p "You are a helpful AI assistant with tool calling."

	# Or with transformers
	python3 -c "
	from transformers import AutoModelForCausalLM, AutoTokenizer
	model = AutoModelForCausalLM.from_pretrained('my-ai-stack/Stack-X-Ultimate')
	tok = AutoTokenizer.from_pretrained('my-ai-stack/Stack-X-Ultimate')
	print('Model loaded!')
	"
	```

	---

	## Architecture

	| Component | Detail |
	|---|---|
	| Base Model | Qwen/Qwen2.5-Coder-3B-Instruct |
	| Fine-tune Method | QLoRA 4-bit (r=32, all 7 target modules) |
	| Training Data | 27,000+ agentic tool-call examples |
	| Sequence Length | 8,192 tokens |
	| VRAM Required | ~6 GB (Q4_K_M GGUF) |
	| Min. Hardware | Single V100 16 GB or equivalent |
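
To see where a VRAM figure like the one in the table comes from, a rough back-of-the-envelope estimate adds the quantized weights to the key/value cache for the full context. The sketch below is illustrative only: `estimate_memory_gb` is a hypothetical helper, and the example inputs (parameter count, average bits per weight for Q4_K_M, layer and KV-head counts) are assumptions that should be replaced with values from the model's `config.json` and the actual GGUF file.

```python
def estimate_memory_gb(n_params, bits_per_weight, n_layers, n_kv_heads,
                       head_dim, ctx_len, kv_bytes=2):
    """Return (weights_gb, kv_cache_gb) using 1 GB = 1e9 bytes."""
    # Quantized weights: every parameter costs bits_per_weight / 8 bytes.
    weight_bytes = n_params * bits_per_weight / 8
    # KV cache: one K and one V tensor per layer, each holding
    # n_kv_heads * head_dim values per token, kv_bytes per value.
    kv_cache_bytes = 2 * n_layers * n_kv_heads * head_dim * ctx_len * kv_bytes
    return weight_bytes / 1e9, kv_cache_bytes / 1e9


# ASSUMED inputs for illustration; check them against the real model files.
weights_gb, kv_gb = estimate_memory_gb(
    n_params=3.1e9,        # assumed parameter count
    bits_per_weight=4.85,  # assumed average for a Q4_K_M quantization
    n_layers=36,           # assumed
    n_kv_heads=2,          # assumed
    head_dim=128,          # assumed
    ctx_len=8192,          # sequence length from the table above
)
print(f"weights ~{weights_gb:.2f} GB, KV cache ~{kv_gb:.2f} GB")
```

The estimate leaves out activations, scratch buffers, and runtime overhead, so real usage is higher than the sum of these two numbers.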

	---

	## Available Tools

	| Tool | Description | Example |
	|---|---|---|
	| `calculator` | Evaluate math expressions | `1500 * 0.07 * 30` |
	| `get_current_time` | Return current UTC time | (no arguments) |
	| `search_files` | Glob-based file search | `*.py` in `./src` |
	| `run_command` | Execute shell commands | `git status` |
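
For readers who want to wire up equivalent tools in their own host process, here is one possible shape for the four functions. It is a sketch under stated assumptions rather than the code this Space ships: the function names match the table, but the parameter names (`expression`, `pattern`, `root`, `command`, `timeout`) and return shapes are hypothetical. The calculator walks the parsed expression instead of calling `eval`, and `run_command` avoids `shell=True`, because both receive model-generated input.

```python
import ast
import operator
import shlex
import subprocess
from datetime import datetime, timezone
from pathlib import Path

# Only these arithmetic operators are accepted by the calculator.
_BIN_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
            ast.Mult: operator.mul, ast.Div: operator.truediv,
            ast.Mod: operator.mod}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def calculator(expression: str):
    """Evaluate +, -, *, /, % on numeric literals without using eval()."""
    def _eval(node):
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError("unsupported expression")
    return _eval(ast.parse(expression, mode="eval").body)


def get_current_time() -> str:
    """Return the current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


def search_files(pattern: str, root: str = ".") -> list:
    """Recursively glob for pattern under root and return sorted paths."""
    return sorted(str(p) for p in Path(root).rglob(pattern))


def run_command(command: str, timeout: float = 30.0) -> dict:
    """Run a command without a shell and return structured output."""
    proc = subprocess.run(shlex.split(command), capture_output=True,
                          text=True, timeout=timeout)
    return {"returncode": proc.returncode,
            "stdout": proc.stdout, "stderr": proc.stderr}
```

Any deployment that exposes `run_command` to a model should also restrict which commands are allowed and run them in a sandbox; the timeout alone is not a security boundary.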

	---

	## Enterprise Deployment

	Need tool integrations specific to your stack? Want to deploy inside your VPC?

	📬 [Contact Stack AI →](https://www.stack-ai.me/contact)

	- Custom tool integrations (APIs, databases, internal systems)
	- VPC-isolated deployment (AWS / GCP / Azure)
	- Air-gapped / on-prem installation
	- Custom LoRA fine-tuning on your data

	---

	## License

	Apache 2.0 — free for commercial and personal use.

	---

	Stack AI — Sovereign AI Infrastructure. [huggingface.co/my-ai-stack](https://huggingface.co/my-ai-stack) · [stack-ai.me](https://www.stack-ai.me)