# Stack 2.9 API Documentation
Complete API reference for integrating Stack 2.9 into your applications.
## Table of Contents
- [Overview](#overview)
- [Authentication](#authentication)
- [Rate Limits](#rate-limits)
- [REST Endpoints](#rest-endpoints)
- [WebSocket Streaming](#websocket-streaming)
- [Request/Response Formats](#requestresponse-formats)
- [Error Handling](#error-handling)
- [SDKs and Examples](#sdks-and-examples)
---
## Overview
Stack 2.9 provides an OpenAI-compatible API for seamless integration with existing tools and workflows.
### Base URL
```
Production: https://api.stack2.9.openclaw.org/v1
Local: http://localhost:3000/v1
```
### API Versioning
The current API version is `v1`. Version information is included in all responses.
```json
{
  "api_version": "1.0",
  "deprecation_date": null
}
```
---
## Authentication
### API Key Authentication
Include your API key in the `Authorization` header:
```bash
curl -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  https://api.stack2.9.openclaw.org/v1/chat/completions
```
### Obtaining an API Key
1. **Self-hosted:** Set `API_KEY` environment variable
2. **Cloud:** Sign up at [stack2.9.openclaw.org](https://stack2.9.openclaw.org)
### Authentication Errors
| Status | Error Type | Description |
|--------|------------|-------------|
| 401 | `invalid_api_key` | API key is missing or invalid |
| 403 | `account_disabled` | Account has been disabled |
| 429 | `rate_limit_exceeded` | Too many requests |
---
## Rate Limits
### Tier Limits
| Tier | Requests/min | Tokens/day | Concurrent | WebSocket |
|------|-------------|------------|------------|-----------|
| **Free** | 100 | 100,000 | 5 | ✅ |
| **Pro** | 1,000 | 10,000,000 | 20 | ✅ |
| **Enterprise** | Custom | Custom | Custom | ✅ |
### Rate Limit Headers
Every response includes rate limit information:
```
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 1640995200
X-RateLimit-Used: 5
```
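As a minimal sketch of how these headers can drive client-side throttling, the helper below (a hypothetical function, not part of any SDK) computes how long to wait before the next request from the `X-RateLimit-Remaining` and `X-RateLimit-Reset` headers shown above:

```python
import time

def seconds_until_reset(headers: dict) -> float:
    """Seconds to wait before retrying, based on rate limit headers."""
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    if remaining > 0:
        return 0.0  # Budget left in this window; no need to wait
    # X-RateLimit-Reset is a Unix timestamp marking when the window resets
    reset_at = int(headers["X-RateLimit-Reset"])
    return max(0.0, reset_at - time.time())
```

Calling `time.sleep(seconds_until_reset(response.headers))` between requests keeps a client inside its window without hard-coding delays.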
### Handling Rate Limits
```python
import time
import openai

client = openai.OpenAI(api_key="your-api-key")

for i in range(100):
    try:
        response = client.chat.completions.create(
            model="qwen/qwen2.5-coder-32b",
            messages=[{"role": "user", "content": "Hello"}]
        )
    except openai.RateLimitError:
        time.sleep(60)  # Wait 1 minute before retrying
        continue
```
---
## REST Endpoints
### Chat Completions
**Endpoint:** `POST /chat/completions`
Generate chat completions with optional streaming.
#### Request Body
```json
{
  "model": "qwen/qwen2.5-coder-32b",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful coding assistant."
    },
    {
      "role": "user",
      "content": "Write a Python function to calculate Fibonacci numbers."
    }
  ],
  "temperature": 0.7,
  "max_tokens": 1000,
  "top_p": 1.0,
  "frequency_penalty": 0.0,
  "presence_penalty": 0.0,
  "stream": false,
  "stop": null,
  "tools": [],
  "tool_choice": "auto",
  "user": "user-identifier"
}
```
#### Parameters
| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `model` | string | ✅ | - | Model identifier |
| `messages` | array | ✅ | - | Conversation messages |
| `temperature` | number | ❌ | 0.7 | Sampling temperature (0-2) |
| `max_tokens` | integer | ❌ | 1000 | Maximum tokens to generate |
| `top_p` | number | ❌ | 1.0 | Nucleus sampling |
| `frequency_penalty` | number | ❌ | 0.0 | Frequency penalty (-2 to 2) |
| `presence_penalty` | number | ❌ | 0.0 | Presence penalty (-2 to 2) |
| `stream` | boolean | ❌ | false | Enable streaming |
| `stop` | string/array | ❌ | null | Stop sequences |
| `tools` | array | ❌ | [] | Available tools |
| `tool_choice` | string | ❌ | "auto" | Tool selection strategy |
| `user` | string | ❌ | - | User identifier |
#### Response (Non-Streaming)
```json
{
  "id": "chatcmpl-1234567890",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "qwen/qwen2.5-coder-32b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "def fibonacci(n):\n    if n <= 1:\n        return n\n    return fibonacci(n-1) + fibonacci(n-2)"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 35,
    "total_tokens": 60
  },
  "system_fingerprint": "fp_1234567890"
}
```
#### Streaming Response Format
```bash
curl -X POST http://localhost:3000/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "qwen/qwen2.5-coder-32b", "messages": [{"role": "user", "content": "Hello"}], "stream": true}'
```
```json
{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1234567890,"model":"qwen/qwen2.5-coder-32b","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}
{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1234567890,"model":"qwen/qwen2.5-coder-32b","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}
{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1234567890,"model":"qwen/qwen2.5-coder-32b","choices":[{"index":0,"delta":{"content":" How"},"finish_reason":null}]}
{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1234567890,"model":"qwen/qwen2.5-coder-32b","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}
```
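To consume this stream without an SDK, each line can be parsed individually. The sketch below assumes standard OpenAI-style SSE framing (each chunk prefixed with `data: ` and a final `[DONE]` sentinel), which is not shown explicitly above; `extract_content` is a hypothetical helper, not part of any library:

```python
import json

def extract_content(sse_line: str):
    """Return the text delta from one SSE line, or None for non-content lines."""
    if not sse_line.startswith("data: "):
        return None  # Comments, blank keep-alives, etc.
    payload = sse_line[len("data: "):].strip()
    if payload == "[DONE]":
        return None  # End-of-stream sentinel
    delta = json.loads(payload)["choices"][0]["delta"]
    return delta.get("content")  # Final chunk has an empty delta
```

Feeding each line of the response body through this function and concatenating the non-`None` results reconstructs the full completion.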
---
### Models List
**Endpoint:** `GET /models`
List available models.
#### Response
```json
{
  "object": "list",
  "data": [
    {
      "id": "qwen/qwen2.5-coder-32b",
      "object": "model",
      "created": 1234567890,
      "owned_by": "openclaw",
      "permission": [],
      "root": "qwen/qwen2.5-coder-32b",
      "parent": null,
      "context_window": 131072,
      "capabilities": {
        "streaming": true,
        "tools": true,
        "voice": true
      }
    },
    {
      "id": "qwen/qwen2.5-coder-14b",
      "object": "model",
      "created": 1234567890,
      "owned_by": "openclaw",
      "permission": [],
      "root": "qwen/qwen2.5-coder-14b",
      "parent": null,
      "context_window": 16384,
      "capabilities": {
        "streaming": true,
        "tools": true,
        "voice": false
      }
    }
  ]
}
```
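A common use of this endpoint is picking a model by capability. The helper below (a hypothetical sketch, not part of any SDK) filters the response shown above by one of its `capabilities` flags:

```python
def models_with(models_response: dict, capability: str):
    """IDs of models whose capabilities mark the given feature as true."""
    return [
        m["id"]
        for m in models_response["data"]
        if m.get("capabilities", {}).get(capability)
    ]
```

For the sample response above, `models_with(response, "voice")` would return only `["qwen/qwen2.5-coder-32b"]`, since the 14B model reports `"voice": false`.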
### Get Model
**Endpoint:** `GET /models/{model_id}`
Get details about a specific model.
```bash
curl http://localhost:3000/v1/models/qwen/qwen2.5-coder-32b
```
```json
{
  "id": "qwen/qwen2.5-coder-32b",
  "object": "model",
  "created": 1234567890,
  "owned_by": "openclaw",
  "context_window": 131072,
  "capabilities": {
    "streaming": true,
    "tools": true,
    "voice": true
  }
}
```
---
## WebSocket Streaming
### Connection
```javascript
const ws = new WebSocket('wss://api.stack2.9.openclaw.org/v1/ws/chat');

ws.onopen = () => {
  ws.send(JSON.stringify({
    type: 'start',
    model: 'qwen/qwen2.5-coder-32b',
    messages: [{role: 'user', content: 'Hello'}],
    temperature: 0.7
  }));
};

ws.onmessage = (event) => {
  const data = JSON.parse(event.data);
  console.log(data.content); // Streamed content
};

ws.onerror = (error) => {
  console.error('WebSocket error:', error);
};

ws.onclose = () => {
  console.log('Connection closed');
};
```
### WebSocket Message Types
#### Client → Server
| Type | Description |
|------|-------------|
| `start` | Start a new chat session |
| `stop` | Stop current generation |
| `ping` | Keep-alive ping |
#### Server → Client
| Type | Description |
|------|-------------|
| `content` | Streamed content chunk |
| `tool_call` | Tool invocation request |
| `tool_result` | Tool execution result |
| `done` | Generation complete |
| `error` | Error occurred |
| `pong` | Keep-alive response |
### Full WebSocket Example
```javascript
class Stack29WebSocket {
  constructor(apiKey, model = 'qwen/qwen2.5-coder-32b') {
    this.apiKey = apiKey;
    this.model = model;
    this.ws = null;
  }

  connect() {
    return new Promise((resolve, reject) => {
      this.ws = new WebSocket('wss://api.stack2.9.openclaw.org/v1/ws/chat');
      this.ws.onopen = () => resolve();
      this.ws.onerror = (e) => reject(e);
    });
  }

  async sendMessage(messages, onChunk, onComplete) {
    await this.connect();

    this.ws.onmessage = (event) => {
      const data = JSON.parse(event.data);
      if (data.type === 'content') {
        onChunk(data.content);
      } else if (data.type === 'done') {
        onComplete(data);
        this.ws.close();
      } else if (data.type === 'error') {
        console.error('Error:', data.message);
      }
    };

    this.ws.send(JSON.stringify({
      type: 'start',
      model: this.model,
      messages: messages,
      temperature: 0.7
    }));
  }
}

// Usage
const client = new Stack29WebSocket('your-api-key');
client.sendMessage(
  [{role: 'user', content: 'Write a hello world function'}],
  (chunk) => process.stdout.write(chunk),
  (final) => console.log('\n\nDone!')
);
```
---
## Request/Response Formats
### Message Format
```json
{
  "role": "user|assistant|system",
  "content": "Message content",
  "name": "optional-name",
  "tool_calls": [],
  "tool_call_id": "optional-id"
}
```
### Tool Call Format
```json
{
  "type": "function",
  "id": "call_abc123",
  "function": {
    "name": "execute_code",
    "description": "Execute code in a sandboxed environment",
    "parameters": {
      "type": "object",
      "properties": {
        "code": {
          "type": "string",
          "description": "The code to execute"
        },
        "language": {
          "type": "string",
          "description": "Programming language"
        }
      },
      "required": ["code", "language"]
    }
  }
}
```
### Tool Call Response Format
```json
{
  "tool_call_id": "call_abc123",
  "output": "Hello, World!",
  "error": null,
  "execution_time_ms": 150
}
```
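After executing a tool, the result is typically sent back as the next chat message so the model can continue. The sketch below assumes the OpenAI-style `"tool"` message role (an assumption; Stack 2.9's exact follow-up message shape is not specified above), and `tool_result_message` is a hypothetical helper:

```python
def tool_result_message(result: dict) -> dict:
    """Convert a tool-call result into a follow-up chat message.
    Uses the OpenAI-style "tool" role (assumed, not confirmed by these docs)."""
    if result.get("error") is None:
        content = result["output"]
    else:
        content = f"Error: {result['error']}"
    return {
        "role": "tool",
        "tool_call_id": result["tool_call_id"],
        "content": content,
    }
```

The returned dict can be appended to `messages` before the next `/chat/completions` call.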
### Vision Support (Image Input)
```json
{
  "role": "user",
  "content": [
    {
      "type": "text",
      "text": "What does this code do?"
    },
    {
      "type": "image_url",
      "image_url": {
        "url": "https://example.com/screenshot.png",
        "detail": "low"
      }
    }
  ]
}
```
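Building this multimodal content array by hand is error-prone, so a small constructor helps. This is a hypothetical convenience function mirroring the message shape above, not part of any SDK:

```python
def image_message(text: str, image_url: str, detail: str = "low") -> dict:
    """Build a user message mixing a text part and an image URL part."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": image_url, "detail": detail}},
        ],
    }
```

The result can be passed directly in the `messages` array of a chat completion request.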
---
## Error Handling
### Error Response Format
```json
{
  "error": {
    "message": "Invalid API key",
    "type": "invalid_request_error",
    "param": "authorization",
    "code": 401
  }
}
```
### Error Codes
| Code | Type | HTTP Status | Description |
|------|------|-------------|-------------|
| `invalid_api_key` | Authentication | 401 | API key is invalid |
| `account_disabled` | Authentication | 403 | Account disabled |
| `rate_limit_exceeded` | Rate Limit | 429 | Too many requests |
| `context_length_exceeded` | Invalid Request | 400 | Context too long |
| `invalid_request` | Invalid Request | 400 | Malformed request |
| `model_not_found` | Invalid Request | 404 | Model doesn't exist |
| `tool_error` | Tool Error | 422 | Tool execution failed |
| `internal_error` | Server Error | 500 | Server-side error |
| `service_unavailable` | Server Error | 503 | Service temporarily down |
### Handling Errors in Code
```python
import openai
from openai import APIError, RateLimitError

client = openai.OpenAI(api_key="your-api-key")

try:
    response = client.chat.completions.create(
        model="qwen/qwen2.5-coder-32b",
        messages=[{"role": "user", "content": "Hello"}]
    )
except RateLimitError:
    print("Rate limit exceeded. Please wait.")
    # Implement backoff logic
except APIError as e:
    print(f"API error: {e}")
    # Handle error appropriately
```
```javascript
try {
  const response = await client.chat.completions.create({
    model: 'qwen/qwen2.5-coder-32b',
    messages: [{role: 'user', content: 'Hello'}]
  });
} catch (error) {
  if (error.status === 429) {
    console.log('Rate limit exceeded');
  } else if (error.status === 401) {
    console.log('Invalid API key');
  } else {
    console.error('API error:', error.message);
  }
}
```
---
## SDKs and Examples
### Python SDK
```bash
pip install openai
```
```python
from openai import OpenAI

client = OpenAI(
    api_key="your-api-key",
    base_url="https://api.stack2.9.openclaw.org/v1"
)

# Non-streaming
response = client.chat.completions.create(
    model="qwen/qwen2.5-coder-32b",
    messages=[
        {"role": "system", "content": "You are a coding assistant."},
        {"role": "user", "content": "Write a Python class for a stack data structure."}
    ],
    temperature=0.7,
    max_tokens=1000
)
print(response.choices[0].message.content)

# Streaming
stream = client.chat.completions.create(
    model="qwen/qwen2.5-coder-32b",
    messages=[{"role": "user", "content": "Explain recursion"}],
    stream=True
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```
### JavaScript/Node.js SDK
```bash
npm install openai
```
```javascript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'your-api-key',
  baseURL: 'https://api.stack2.9.openclaw.org/v1'
});

// Non-streaming
const response = await client.chat.completions.create({
  model: 'qwen/qwen2.5-coder-32b',
  messages: [
    {role: 'system', content: 'You are a coding assistant.'},
    {role: 'user', content: 'Write a Python class for a stack data structure.'}
  ],
  temperature: 0.7,
  max_tokens: 1000
});
console.log(response.choices[0].message.content);

// Streaming
const stream = await client.chat.completions.create({
  model: 'qwen/qwen2.5-coder-32b',
  messages: [{role: 'user', content: 'Explain recursion'}],
  stream: true
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0].delta.content || '');
}
```
### cURL Examples
```bash
# Basic chat completion
curl -X POST https://api.stack2.9.openclaw.org/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen/qwen2.5-coder-32b",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

# Streaming completion
curl -X POST https://api.stack2.9.openclaw.org/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "qwen/qwen2.5-coder-32b", "messages": [{"role": "user", "content": "Hello"}], "stream": true}'

# With system prompt
curl -X POST https://api.stack2.9.openclaw.org/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen/qwen2.5-coder-32b",
    "messages": [
      {"role": "system", "content": "You are an expert Python programmer."},
      {"role": "user", "content": "Write a decorator that caches function results."}
    ]
  }'

# With tools
curl -X POST https://api.stack2.9.openclaw.org/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen/qwen2.5-coder-32b",
    "messages": [{"role": "user", "content": "Create a file called hello.py"}],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "write_file",
          "description": "Write content to a file",
          "parameters": {
            "type": "object",
            "properties": {
              "path": {"type": "string"},
              "content": {"type": "string"}
            }
          }
        }
      }
    ]
  }'
```
### OpenAI-Compatible Client Usage
Stack 2.9 is compatible with the OpenAI client library:
```python
# Works with LangChain
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage

chat = ChatOpenAI(
    model="qwen/qwen2.5-coder-32b",
    openai_api_base="https://api.stack2.9.openclaw.org/v1",
    openai_api_key="your-api-key"
)
response = chat([HumanMessage(content="Hello!")])
```
---
## Webhooks
Configure webhooks for asynchronous events:
```json
{
  "webhook_url": "https://your-server.com/webhook",
  "events": [
    "tool_call.started",
    "tool_call.completed",
    "tool_call.failed",
    "generation.done"
  ]
}
```
### Webhook Payload
```json
{
  "event": "tool_call.completed",
  "timestamp": "2026-04-01T12:00:00Z",
  "data": {
    "tool_call_id": "call_abc123",
    "tool_name": "execute_code",
    "execution_time_ms": 150,
    "result": "Hello, World!"
  }
}
```
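On the receiving side, a webhook endpoint typically dispatches on the `event` field. The sketch below is purely illustrative (the handler bodies and `handle_webhook` name are hypothetical) and shows routing for the payload shape above:

```python
def handle_webhook(payload: dict) -> str:
    """Route a webhook payload to a handler keyed on the event name."""
    handlers = {
        "tool_call.completed": lambda d: f"{d['tool_name']} finished in {d['execution_time_ms']} ms",
        "tool_call.failed": lambda d: f"{d['tool_name']} failed",
        "generation.done": lambda d: "generation finished",
    }
    handler = handlers.get(payload["event"])
    # Unknown events are ignored so new event types don't break the receiver
    return handler(payload["data"]) if handler else "ignored"
```

In a real deployment this function would sit behind an HTTP endpoint (e.g. a Flask or FastAPI route) and should return a 2xx status quickly, deferring heavy work to a queue.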
---
## Best Practices
### 1. Use Streaming for Better UX
Always use streaming for long-form content to provide real-time feedback to users.
### 2. Implement Proper Error Handling
Always handle rate limits and authentication errors gracefully with exponential backoff.
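One common formulation is exponential backoff with full jitter: the delay doubles per attempt, capped, with a random factor to avoid synchronized retries. A minimal sketch (the function name and defaults are illustrative):

```python
import random

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential backoff with full jitter: a random delay drawn from
    [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0, min(cap, base * 2 ** attempt))
```

A retry loop would call `time.sleep(backoff_delay(attempt))` after each `RateLimitError` and give up after a fixed number of attempts.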
### 3. Cache Responses
Cache frequent queries to reduce API calls and improve response times.
```python
from functools import lru_cache

@lru_cache(maxsize=100)
def cached_completion(prompt: str) -> str:
    # Repeated identical prompts are served from the cache instead of the API.
    # Most useful with temperature 0, where responses are deterministic.
    response = client.chat.completions.create(
        model="qwen/qwen2.5-coder-32b",
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content
```
### 4. Use Appropriate Temperature
| Task | Temperature |
|------|-------------|
| Code generation | 0.0 - 0.3 |
| Factual Q&A | 0.0 - 0.2 |
| Creative writing | 0.7 - 1.0 |
| Brainstorming | 0.8 - 1.2 |
### 5. Monitor Token Usage
Track usage to stay within rate limits:
```python
response = client.chat.completions.create(...)
print(f"Tokens used: {response.usage.total_tokens}")
```
---
## Support
- **Documentation**: [docs/index.html](docs/index.html)
- **API Status**: [status.stack2.9.openclaw.org](https://status.stack2.9.openclaw.org)
- **Issues**: [GitHub Issues](https://github.com/openclaw/stack-2.9/issues)
- **Email**: api@stack2.9.openclaw.org
---
**API Version**: 1.0
**Last Updated**: 2026-04-01
**Status**: Active