Stack 2.9 API Documentation
Complete API reference for integrating Stack 2.9 into your applications.
Table of Contents
- Overview
- Authentication
- Rate Limits
- REST Endpoints
- WebSocket Streaming
- Request/Response Formats
- Error Handling
- SDKs and Examples
Overview
Stack 2.9 provides an OpenAI-compatible API for seamless integration with existing tools and workflows.
Base URL
Production: https://api.stack2.9.openclaw.org/v1
Local: http://localhost:3000/v1
API Versioning
The current API version is v1. Version information is included in all responses.
{
"api_version": "1.0",
"deprecation_date": null
}
Authentication
API Key Authentication
Include your API key in the Authorization header:
curl -H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
https://api.stack2.9.openclaw.org/v1/chat/completions
Obtaining an API Key
- Self-hosted: Set
API_KEYenvironment variable - Cloud: Sign up at stack2.9.openclaw.org
Authentication Errors
| Status | Error Type | Description |
|---|---|---|
| 401 | invalid_api_key |
API key is missing or invalid |
| 403 | account_disabled |
Account has been disabled |
| 429 | rate_limit_exceeded |
Too many requests |
Rate Limits
Tier Limits
| Tier | Requests/min | Tokens/day | Concurrent | WebSocket |
|---|---|---|---|---|
| Free | 100 | 100,000 | 5 | ✅ |
| Pro | 1,000 | 10,000,000 | 20 | ✅ |
| Enterprise | Custom | Custom | Custom | ✅ |
Rate Limit Headers
Every response includes rate limit information:
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 1640995200
X-RateLimit-Used: 5
Handling Rate Limits
import time
import openai
client = openai.OpenAI(api_key="your-api-key")
for i in range(100):
try:
response = client.chat.completions.create(
model="qwen/qwen2.5-coder-32b",
messages=[{"role": "user", "content": "Hello"}]
)
except openai.RateLimitError:
time.sleep(60) # Wait 1 minute
continue
REST Endpoints
Chat Completions
Endpoint: POST /chat/completions
Generate chat completions with optional streaming.
Request Body
{
"model": "qwen/qwen2.5-coder-32b",
"messages": [
{
"role": "system",
"content": "You are a helpful coding assistant."
},
{
"role": "user",
"content": "Write a Python function to calculate Fibonacci numbers."
}
],
"temperature": 0.7,
"max_tokens": 1000,
"top_p": 1.0,
"frequency_penalty": 0.0,
"presence_penalty": 0.0,
"stream": false,
"stop": null,
"tools": [],
"tool_choice": "auto",
"user": "user-identifier"
}
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
model |
string | ✅ | - | Model identifier |
messages |
array | ✅ | - | Conversation messages |
temperature |
number | ❌ | 0.7 | Sampling temperature (0-2) |
max_tokens |
integer | ❌ | 1000 | Maximum tokens to generate |
top_p |
number | ❌ | 1.0 | Nucleus sampling |
frequency_penalty |
number | ❌ | 0.0 | Frequency penalty (-2 to 2) |
presence_penalty |
number | ❌ | 0.0 | Presence penalty (-2 to 2) |
stream |
boolean | ❌ | false | Enable streaming |
stop |
string/array | ❌ | null | Stop sequences |
tools |
array | ❌ | [] | Available tools |
tool_choice |
string | ❌ | "auto" | Tool selection strategy |
user |
string | ❌ | - | User identifier |
Response (Non-Streaming)
{
"id": "chatcmpl-1234567890",
"object": "chat.completion",
"created": 1234567890,
"model": "qwen/qwen2.5-coder-32b",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "def fibonacci(n):\n if n <= 1:\n return n\n return fibonacci(n-1) + fibonacci(n-2)"
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 25,
"completion_tokens": 35,
"total_tokens": 60
},
"system_fingerprint": "fp_1234567890"
}
Streaming Response Format
curl -X POST http://localhost:3000/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "qwen/qwen2.5-coder-32b", "messages": [{"role": "user", "content": "Hello"}], "stream": true}'
{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1234567890,"model":"qwen/qwen2.5-coder-32b","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}
{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1234567890,"model":"qwen/qwen2.5-coder-32b","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}
{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1234567890,"model":"qwen/qwen2.5-coder-32b","choices":[{"index":0,"delta":{"content":" How"},"finish_reason":null}]}
{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1234567890,"model":"qwen/qwen2.5-coder-32b","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}
Models List
Endpoint: GET /models
List available models.
Response
{
"object": "list",
"data": [
{
"id": "qwen/qwen2.5-coder-32b",
"object": "model",
"created": 1234567890,
"owned_by": "openclaw",
"permission": [],
"root": "qwen/qwen2.5-coder-32b",
"parent": null,
"context_window": 131072,
"Capabilities": {
"streaming": true,
"tools": true,
"voice": true
}
},
{
"id": "qwen/qwen2.5-coder-14b",
"object": "model",
"created": 1234567890,
"owned_by": "openclaw",
"permission": [],
"root": "qwen/qwen2.5-coder-14b",
"parent": null,
"context_window": 16384,
"capabilities": {
"streaming": true,
"tools": true,
"voice": false
}
}
]
}
Get Model
Endpoint: GET /models/{model_id}
Get details about a specific model.
curl http://localhost:3000/v1/models/qwen/qwen2.5-coder-32b
{
"id": "qwen/qwen2.5-coder-32b",
"object": "model",
"created": 1234567890,
"owned_by": "openclaw",
"context_window": 131072,
"capabilities": {
"streaming": true,
"tools": true,
"voice": true
}
}
WebSocket Streaming
Connection
const ws = new WebSocket('wss://api.stack2.9.openclaw.org/v1/ws/chat');
ws.onopen = () => {
ws.send(JSON.stringify({
type: 'start',
model: 'qwen/qwen2.5-coder-32b',
messages: [{role: 'user', content: 'Hello'}],
temperature: 0.7
}));
};
ws.onmessage = (event) => {
const data = JSON.parse(event.data);
console.log(data.content); // Streamed content
};
ws.onerror = (error) => {
console.error('WebSocket error:', error);
};
ws.onclose = () => {
console.log('Connection closed');
};
WebSocket Message Types
Client → Server
| Type | Description |
|---|---|
start |
Start a new chat session |
stop |
Stop current generation |
ping |
Keep-alive ping |
Server → Client
| Type | Description |
|---|---|
content |
Streamed content chunk |
tool_call |
Tool invocation request |
tool_result |
Tool execution result |
done |
Generation complete |
error |
Error occurred |
pong |
Keep-alive response |
Full WebSocket Example
class Stack29WebSocket {
constructor(apiKey, model = 'qwen/qwen2.5-coder-32b') {
this.apiKey = apiKey;
this.model = model;
this.ws = null;
}
connect() {
return new Promise((resolve, reject) => {
this.ws = new WebSocket('wss://api.stack2.9.openclaw.org/v1/ws/chat');
this.ws.onopen = () => resolve();
this.ws.onerror = (e) => reject(e);
});
}
async sendMessage(messages, onChunk, onComplete) {
await this.connect();
this.ws.onmessage = (event) => {
const data = JSON.parse(event.data);
if (data.type === 'content') {
onChunk(data.content);
} else if (data.type === 'done') {
onComplete(data);
this.ws.close();
} else if (data.type === 'error') {
console.error('Error:', data.message);
}
};
this.ws.send(JSON.stringify({
type: 'start',
model: this.model,
messages: messages,
temperature: 0.7
}));
}
}
// Usage
const client = new Stack29WebSocket('your-api-key');
client.sendMessage(
[{role: 'user', content: 'Write a hello world function'}],
(chunk) => process.stdout.write(chunk),
(final) => console.log('\n\nDone!')
);
Request/Response Formats
Message Format
{
"role": "user|assistant|system",
"content": "Message content",
"name": "optional-name",
"tool_calls": [],
"tool_call_id": "optional-id"
}
Tool Call Format
{
"type": "function",
"id": "call_abc123",
"function": {
"name": "execute_code",
"description": "Execute code in a sandboxed environment",
"parameters": {
"type": "object",
"properties": {
"code": {
"type": "string",
"description": "The code to execute"
},
"language": {
"type": "string",
"description": "Programming language"
}
},
"required": ["code", "language"]
}
}
}
Tool Call Response Format
{
"tool_call_id": "call_abc123",
"output": "Hello, World!",
"error": null,
"execution_time_ms": 150
}
Vision Support (Image Input)
{
"role": "user",
"content": [
{
"type": "text",
"text": "What does this code do?"
},
{
"type": "image_url",
"image_url": {
"url": "https://example.com/screenshot.png",
"detail": "low"
}
}
]
}
Error Handling
Error Response Format
{
"error": {
"message": "Invalid API key",
"type": "invalid_request_error",
"param": "authorization",
"code": 401
}
}
Error Codes
| Code | Type | HTTP Status | Description |
|---|---|---|---|
invalid_api_key |
Authentication | 401 | API key is invalid |
account_disabled |
Authentication | 403 | Account disabled |
rate_limit_exceeded |
Rate Limit | 429 | Too many requests |
context_length_exceeded |
Invalid Request | 400 | Context too long |
invalid_request |
Invalid Request | 400 | Malformed request |
model_not_found |
Invalid Request | 404 | Model doesn't exist |
tool_error |
Tool Error | 422 | Tool execution failed |
internal_error |
Server Error | 500 | Server-side error |
service_unavailable |
Server Error | 503 | Service temporarily down |
Handling Errors in Code
import openai
from openai import APIError, RateLimitError
client = openai.OpenAI(api_key="your-api-key")
try:
response = client.chat.completions.create(
model="qwen/qwen2.5-coder-32b",
messages=[{"role": "user", "content": "Hello"}]
)
except RateLimitError:
print("Rate limit exceeded. Please wait.")
# Implement backoff logic
except APIError as e:
print(f"API error: {e}")
# Handle error appropriately
try {
const response = await client.chat.completions.create({
model: 'qwen/qwen2.5-coder-32b',
messages: [{role: 'user', content: 'Hello'}]
});
} catch (error) {
if (error.status === 429) {
console.log('Rate limit exceeded');
} else if (error.status === 401) {
console.log('Invalid API key');
} else {
console.error('API error:', error.message);
}
}
SDKs and Examples
Python SDK
pip install openai
from openai import OpenAI
client = OpenAI(
api_key="your-api-key",
base_url="https://api.stack2.9.openclaw.org/v1"
)
# Non-streaming
response = client.chat.completions.create(
model="qwen/qwen2.5-coder-32b",
messages=[
{"role": "system", "content": "You are a coding assistant."},
{"role": "user", "content": "Write a Python class for a stack data structure."}
],
temperature=0.7,
max_tokens=1000
)
print(response.choices[0].message.content)
# Streaming
stream = client.chat.completions.create(
model="qwen/qwen2.5-coder-32b",
messages=[{"role": "user", "content": "Explain recursion"}],
stream=True
)
for chunk in stream:
if chunk.choices[0].delta.content:
print(chunk.choices[0].delta.content, end="", flush=True)
JavaScript/Node.js SDK
npm install openai
import OpenAI from 'openai';
const client = new OpenAI({
apiKey: 'your-api-key',
baseURL: 'https://api.stack2.9.openclaw.org/v1'
});
// Non-streaming
const response = await client.chat.completions.create({
model: 'qwen/qwen2.5-coder-32b',
messages: [
{role: 'system', content: 'You are a coding assistant.'},
{role: 'user', content: 'Write a Python class for a stack data structure.'}
],
temperature: 0.7,
max_tokens: 1000
});
console.log(response.choices[0].message.content);
// Streaming
const stream = await client.chat.completions.create({
model: 'qwen/qwen2.5-coder-32b',
messages: [{role: 'user', content: 'Explain recursion'}],
stream: true
});
for await (const chunk of stream) {
process.stdout.write(chunk.choices[0].delta.content || '');
}
cURL Examples
# Basic chat completion
curl -X POST https://api.stack2.9.openclaw.org/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "qwen/qwen2.5-coder-32b",
"messages": [{"role": "user", "content": "Hello"}]
}'
# Streaming completion
curl -X POST https://api.stack2.9.openclaw.org/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "qwen/qwen2.5-coder-32b", "messages": [{"role": "user", "content": "Hello"}], "stream": true}'
# With system prompt
curl -X POST https://api.stack2.9.openclaw.org/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "qwen/qwen2.5-coder-32b",
"messages": [
{"role": "system", "content": "You are an expert Python programmer."},
{"role": "user", "content": "Write a decorator that caches function results."}
]
}'
# With tools
curl -X POST https://api.stack2.9.openclaw.org/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "qwen/qwen2.5-coder-32b",
"messages": [{"role": "user", "content": "Create a file called hello.py"}],
"tools": [
{
"type": "function",
"function": {
"name": "write_file",
"description": "Write content to a file",
"parameters": {
"type": "object",
"properties": {
"path": {"type": "string"},
"content": {"type": "string"}
}
}
}
}
]
}'
OpenAI-Compatible Client Usage
Stack 2.9 is compatible with the OpenAI client library:
# Works with LangChain
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage
chat = ChatOpenAI(
model="qwen/qwen2.5-coder-32b",
openai_api_base="https://api.stack2.9.openclaw.org/v1",
openai_api_key="your-api-key"
)
response = chat([HumanMessage(content="Hello!")])
Webhooks
Configure webhooks for asynchronous events:
{
"webhook_url": "https://your-server.com/webhook",
"events": [
"tool_call.started",
"tool_call.completed",
"tool_call.failed",
"generation.done"
]
}
Webhook Payload
{
"event": "tool_call.completed",
"timestamp": "2026-04-01T12:00:00Z",
"data": {
"tool_call_id": "call_abc123",
"tool_name": "execute_code",
"execution_time_ms": 150,
"result": "Hello, World!"
}
}
Best Practices
1. Use Streaming for Better UX
Always use streaming for long-form content to provide real-time feedback to users.
2. Implement Proper Error Handling
Always handle rate limits and authentication errors gracefully with exponential backoff.
3. Cache Responses
Cache frequent queries to reduce API calls and improve response times.
from functools import lru_cache
@lru_cache(maxsize=100)
def cached_completion(prompt_hash):
# Implement caching logic
pass
4. Use Appropriate Temperature
| Task | Temperature |
|---|---|
| Code generation | 0.0 - 0.3 |
| Factual Q&A | 0.0 - 0.2 |
| Creative writing | 0.7 - 1.0 |
| brainstorming | 0.8 - 1.2 |
5. Monitor Token Usage
Track usage to stay within rate limits:
response = client.chat.completions.create(...)
print(f"Tokens used: {response.usage.total_tokens}")
Support
- Documentation: docs/index.html
- API Status: status.stack2.9.openclaw.org
- Issues: GitHub Issues
- Email: api@stack2.9.openclaw.org
API Version: 1.0
Last Updated: 2026-04-01
Status: Active