
Stack 2.9 API Documentation

Complete API reference for integrating Stack 2.9 into your applications.

Table of Contents

  1. Overview
  2. Authentication
  3. Rate Limits
  4. REST Endpoints
  5. WebSocket Streaming
  6. Request/Response Formats
  7. Error Handling
  8. SDKs and Examples
  9. Webhooks
  10. Best Practices
  11. Support

Overview

Stack 2.9 provides an OpenAI-compatible API for seamless integration with existing tools and workflows.

Base URL

Production: https://api.stack2.9.openclaw.org/v1
Local:      http://localhost:3000/v1

API Versioning

The current API version is v1. Version information is included in all responses.

{
  "api_version": "1.0",
  "deprecation_date": null
}

Authentication

API Key Authentication

Include your API key in the Authorization header:

curl -H "Authorization: Bearer YOUR_API_KEY" \
     -H "Content-Type: application/json" \
     https://api.stack2.9.openclaw.org/v1/chat/completions

Obtaining an API Key

  1. Self-hosted: Set the API_KEY environment variable (see the sketch below)
  2. Cloud: Sign up at stack2.9.openclaw.org
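
For self-hosted deployments, a minimal sketch that reads the key from the environment instead of hard-coding it:

import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["API_KEY"],  # the same value the server was started with
    base_url="http://localhost:3000/v1",
)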

Authentication Errors

Status  Error Type           Description
------  -------------------  -----------------------------
401     invalid_api_key      API key is missing or invalid
403     account_disabled     Account has been disabled
429     rate_limit_exceeded  Too many requests

Rate Limits

Tier Limits

Tier        Requests/min  Tokens/day  Concurrent WebSocket
----------  ------------  ----------  --------------------
Free        100           100,000     5
Pro         1,000         10,000,000  20
Enterprise  Custom        Custom      Custom

Rate Limit Headers

Every response includes rate limit information:

X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 1640995200
X-RateLimit-Used: 5
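
For example, a minimal sketch that reads these headers with the third-party requests package and waits out the window once the remaining budget hits zero:

import time

import requests

resp = requests.post(
    "https://api.stack2.9.openclaw.org/v1/chat/completions",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={"model": "qwen/qwen2.5-coder-32b",
          "messages": [{"role": "user", "content": "Hello"}]},
)

remaining = int(resp.headers["X-RateLimit-Remaining"])
reset_at = int(resp.headers["X-RateLimit-Reset"])  # Unix timestamp
if remaining == 0:
    time.sleep(max(0, reset_at - time.time()))  # sleep until the window resets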

Handling Rate Limits

import time

import openai

client = openai.OpenAI(api_key="your-api-key")

def complete_with_retry(messages, max_retries=5):
    # Retry on RateLimitError instead of dropping the request
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model="qwen/qwen2.5-coder-32b",
                messages=messages,
            )
        except openai.RateLimitError:
            time.sleep(60)  # wait for the rate-limit window to reset
    raise RuntimeError("Rate limit retries exhausted")

REST Endpoints

Chat Completions

Endpoint: POST /chat/completions

Generate chat completions with optional streaming.

Request Body

{
  "model": "qwen/qwen2.5-coder-32b",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful coding assistant."
    },
    {
      "role": "user",
      "content": "Write a Python function to calculate Fibonacci numbers."
    }
  ],
  "temperature": 0.7,
  "max_tokens": 1000,
  "top_p": 1.0,
  "frequency_penalty": 0.0,
  "presence_penalty": 0.0,
  "stream": false,
  "stop": null,
  "tools": [],
  "tool_choice": "auto",
  "user": "user-identifier"
}

Parameters

Parameter          Type          Required  Default  Description
-----------------  ------------  --------  -------  ---------------------------
model              string        Yes       -        Model identifier
messages           array         Yes       -        Conversation messages
temperature        number        No        0.7      Sampling temperature (0-2)
max_tokens         integer       No        1000     Maximum tokens to generate
top_p              number        No        1.0      Nucleus sampling
frequency_penalty  number        No        0.0      Frequency penalty (-2 to 2)
presence_penalty   number        No        0.0      Presence penalty (-2 to 2)
stream             boolean       No        false    Enable streaming
stop               string/array  No        null     Stop sequences
tools              array         No        []       Available tools
tool_choice        string        No        "auto"   Tool selection strategy
user               string        No        -        User identifier
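
These parameters map directly onto the Python SDK's keyword arguments; a minimal sketch passing a stop sequence (the prompt is illustrative):

response = client.chat.completions.create(
    model="qwen/qwen2.5-coder-32b",
    messages=[{"role": "user", "content": "List three sorting algorithms"}],
    temperature=0.2,
    max_tokens=100,
    stop=["\n\n"],  # stop at the first blank line
)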

Response (Non-Streaming)

{
  "id": "chatcmpl-1234567890",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "qwen/qwen2.5-coder-32b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "def fibonacci(n):\n    if n <= 1:\n        return n\n    return fibonacci(n-1) + fibonacci(n-2)"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 35,
    "total_tokens": 60
  },
  "system_fingerprint": "fp_1234567890"
}

Streaming Response Format

curl -X POST http://localhost:3000/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "qwen/qwen2.5-coder-32b", "messages": [{"role": "user", "content": "Hello"}], "stream": true}'
{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1234567890,"model":"qwen/qwen2.5-coder-32b","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}
{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1234567890,"model":"qwen/qwen2.5-coder-32b","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}
{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1234567890,"model":"qwen/qwen2.5-coder-32b","choices":[{"index":0,"delta":{"content":" How"},"finish_reason":null}]}
{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1234567890,"model":"qwen/qwen2.5-coder-32b","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

Models List

Endpoint: GET /models

List available models.

Response

{
  "object": "list",
  "data": [
    {
      "id": "qwen/qwen2.5-coder-32b",
      "object": "model",
      "created": 1234567890,
      "owned_by": "openclaw",
      "permission": [],
      "root": "qwen/qwen2.5-coder-32b",
      "parent": null,
      "context_window": 131072,
      "Capabilities": {
        "streaming": true,
        "tools": true,
        "voice": true
      }
    },
    {
      "id": "qwen/qwen2.5-coder-14b",
      "object": "model",
      "created": 1234567890,
      "owned_by": "openclaw",
      "permission": [],
      "root": "qwen/qwen2.5-coder-14b",
      "parent": null,
      "context_window": 16384,
      "capabilities": {
        "streaming": true,
        "tools": true,
        "voice": false
      }
    }
  ]
}

Get Model

Endpoint: GET /models/{model_id}

Get details about a specific model.

curl http://localhost:3000/v1/models/qwen/qwen2.5-coder-32b

{
  "id": "qwen/qwen2.5-coder-32b",
  "object": "model",
  "created": 1234567890,
  "owned_by": "openclaw",
  "context_window": 131072,
  "capabilities": {
    "streaming": true,
    "tools": true,
    "voice": true
  }
}
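
Both endpoints are also reachable through the OpenAI-compatible Python SDK (client configured as in the SDKs and Examples section below):

models = client.models.list()
for m in models.data:
    print(m.id)

model = client.models.retrieve("qwen/qwen2.5-coder-32b")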

WebSocket Streaming

Connection

const ws = new WebSocket('wss://api.stack2.9.openclaw.org/v1/ws/chat');

ws.onopen = () => {
  ws.send(JSON.stringify({
    type: 'start',
    model: 'qwen/qwen2.5-coder-32b',
    messages: [{role: 'user', content: 'Hello'}],
    temperature: 0.7
  }));
};

ws.onmessage = (event) => {
  const data = JSON.parse(event.data);
  console.log(data.content);  // Streamed content
};

ws.onerror = (error) => {
  console.error('WebSocket error:', error);
};

ws.onclose = () => {
  console.log('Connection closed');
};

WebSocket Message Types

Client → Server

Type   Description
-----  -------------------------
start  Start a new chat session
stop   Stop current generation
ping   Keep-alive ping

Server → Client

Type         Description
-----------  -----------------------
content      Streamed content chunk
tool_call    Tool invocation request
tool_result  Tool execution result
done         Generation complete
error        Error occurred
pong         Keep-alive response

Full WebSocket Example

class Stack29WebSocket {
  constructor(apiKey, model = 'qwen/qwen2.5-coder-32b') {
    this.apiKey = apiKey;
    this.model = model;
    this.ws = null;
  }

  connect() {
    return new Promise((resolve, reject) => {
      this.ws = new WebSocket('wss://api.stack2.9.openclaw.org/v1/ws/chat');
      
      this.ws.onopen = () => resolve();
      this.ws.onerror = (e) => reject(e);
    });
  }

  async sendMessage(messages, onChunk, onComplete) {
    await this.connect();
    
    this.ws.onmessage = (event) => {
      const data = JSON.parse(event.data);
      
      if (data.type === 'content') {
        onChunk(data.content);
      } else if (data.type === 'done') {
        onComplete(data);
        this.ws.close();
      } else if (data.type === 'error') {
        console.error('Error:', data.message);
      }
    };

    this.ws.send(JSON.stringify({
      type: 'start',
      model: this.model,
      messages: messages,
      temperature: 0.7
    }));
  }
}

// Usage
const client = new Stack29WebSocket('your-api-key');
client.sendMessage(
  [{role: 'user', content: 'Write a hello world function'}],
  (chunk) => process.stdout.write(chunk),
  (final) => console.log('\n\nDone!')
);
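
For Python clients, a comparable sketch using the third-party websockets package (the message schema follows the tables above; how the API key is presented over WebSocket is deployment-specific and not shown):

import asyncio
import json

import websockets

async def stream_chat(messages, model="qwen/qwen2.5-coder-32b"):
    uri = "wss://api.stack2.9.openclaw.org/v1/ws/chat"
    async with websockets.connect(uri) as ws:
        # Kick off generation with a client -> server "start" message
        await ws.send(json.dumps({
            "type": "start",
            "model": model,
            "messages": messages,
            "temperature": 0.7,
        }))
        # Consume server -> client messages until "done" or "error"
        async for raw in ws:
            data = json.loads(raw)
            if data["type"] == "content":
                print(data["content"], end="", flush=True)
            elif data["type"] == "done":
                break
            elif data["type"] == "error":
                raise RuntimeError(data.get("message"))

asyncio.run(stream_chat([{"role": "user", "content": "Hello"}]))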

Request/Response Formats

Message Format

{
  "role": "user|assistant|system",
  "content": "Message content",
  "name": "optional-name",
  "tool_calls": [],
  "tool_call_id": "optional-id"
}

Tool Call Format

{
  "type": "function",
  "id": "call_abc123",
  "function": {
    "name": "execute_code",
    "description": "Execute code in a sandboxed environment",
    "parameters": {
      "type": "object",
      "properties": {
        "code": {
          "type": "string",
          "description": "The code to execute"
        },
        "language": {
          "type": "string",
          "description": "Programming language"
        }
      },
      "required": ["code", "language"]
    }
  }
}

Tool Call Response Format

{
  "tool_call_id": "call_abc123",
  "output": "Hello, World!",
  "error": null,
  "execution_time_ms": 150
}
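
Putting the two formats together, a sketch of one tool-call round trip with the Python SDK, following the OpenAI tool-message convention (run_code is a hypothetical local executor; client is configured as in the SDKs and Examples section):

import json

tools = [{
    "type": "function",
    "function": {
        "name": "execute_code",
        "description": "Execute code in a sandboxed environment",
        "parameters": {
            "type": "object",
            "properties": {
                "code": {"type": "string"},
                "language": {"type": "string"},
            },
            "required": ["code", "language"],
        },
    },
}]

messages = [{"role": "user", "content": "Run print('Hello, World!') in Python"}]
response = client.chat.completions.create(
    model="qwen/qwen2.5-coder-32b", messages=messages, tools=tools
)

# If the model requested a tool call, execute it and return the output
call = response.choices[0].message.tool_calls[0]
args = json.loads(call.function.arguments)
output = run_code(args["code"], args["language"])  # hypothetical executor

messages.append(response.choices[0].message)
messages.append({"role": "tool", "tool_call_id": call.id, "content": output})

final = client.chat.completions.create(
    model="qwen/qwen2.5-coder-32b", messages=messages, tools=tools
)
print(final.choices[0].message.content)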

Vision Support (Image Input)

{
  "role": "user",
  "content": [
    {
      "type": "text",
      "text": "What does this code do?"
    },
    {
      "type": "image_url",
      "image_url": {
        "url": "https://example.com/screenshot.png",
        "detail": "low"
      }
    }
  ]
}
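
With the Python SDK, the same multimodal message can be sent as follows (a sketch; the image URL is illustrative):

response = client.chat.completions.create(
    model="qwen/qwen2.5-coder-32b",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What does this code do?"},
            {"type": "image_url",
             "image_url": {"url": "https://example.com/screenshot.png",
                           "detail": "low"}},  # "low" requests a lower-resolution analysis
        ],
    }],
)
print(response.choices[0].message.content)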

Error Handling

Error Response Format

{
  "error": {
    "message": "Invalid API key",
    "type": "invalid_request_error",
    "param": "authorization",
    "code": 401
  }
}

Error Codes

Code                     Type             HTTP Status  Description
-----------------------  ---------------  -----------  -------------------------
invalid_api_key          Authentication   401          API key is invalid
account_disabled         Authentication   403          Account disabled
rate_limit_exceeded      Rate Limit       429          Too many requests
context_length_exceeded  Invalid Request  400          Context too long
invalid_request          Invalid Request  400          Malformed request
model_not_found          Invalid Request  404          Model doesn't exist
tool_error               Tool Error       422          Tool execution failed
internal_error           Server Error     500          Server-side error
service_unavailable      Server Error     503          Service temporarily down

Handling Errors in Code

import openai
from openai import APIError, RateLimitError

client = openai.OpenAI(api_key="your-api-key")

try:
    response = client.chat.completions.create(
        model="qwen/qwen2.5-coder-32b",
        messages=[{"role": "user", "content": "Hello"}]
    )
except RateLimitError:
    print("Rate limit exceeded. Please wait.")
    # Implement backoff logic
except APIError as e:
    print(f"API error: {e}")
    # Handle error appropriately

The same pattern in JavaScript:

try {
  const response = await client.chat.completions.create({
    model: 'qwen/qwen2.5-coder-32b',
    messages: [{role: 'user', content: 'Hello'}]
  });
} catch (error) {
  if (error.status === 429) {
    console.log('Rate limit exceeded');
  } else if (error.status === 401) {
    console.log('Invalid API key');
  } else {
    console.error('API error:', error.message);
  }
}

SDKs and Examples

Python SDK

pip install openai

from openai import OpenAI

client = OpenAI(
    api_key="your-api-key",
    base_url="https://api.stack2.9.openclaw.org/v1"
)

# Non-streaming
response = client.chat.completions.create(
    model="qwen/qwen2.5-coder-32b",
    messages=[
        {"role": "system", "content": "You are a coding assistant."},
        {"role": "user", "content": "Write a Python class for a stack data structure."}
    ],
    temperature=0.7,
    max_tokens=1000
)

print(response.choices[0].message.content)

# Streaming
stream = client.chat.completions.create(
    model="qwen/qwen2.5-coder-32b",
    messages=[{"role": "user", "content": "Explain recursion"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

JavaScript/Node.js SDK

npm install openai

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'your-api-key',
  baseURL: 'https://api.stack2.9.openclaw.org/v1'
});

// Non-streaming
const response = await client.chat.completions.create({
  model: 'qwen/qwen2.5-coder-32b',
  messages: [
    {role: 'system', content: 'You are a coding assistant.'},
    {role: 'user', content: 'Write a Python class for a stack data structure.'}
  ],
  temperature: 0.7,
  max_tokens: 1000
});

console.log(response.choices[0].message.content);

// Streaming
const stream = await client.chat.completions.create({
  model: 'qwen/qwen2.5-coder-32b',
  messages: [{role: 'user', content: 'Explain recursion'}],
  stream: true
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0].delta.content || '');
}

cURL Examples

# Basic chat completion
curl -X POST https://api.stack2.9.openclaw.org/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen/qwen2.5-coder-32b",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

# Streaming completion
curl -X POST https://api.stack2.9.openclaw.org/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "qwen/qwen2.5-coder-32b", "messages": [{"role": "user", "content": "Hello"}], "stream": true}'

# With system prompt
curl -X POST https://api.stack2.9.openclaw.org/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen/qwen2.5-coder-32b",
    "messages": [
      {"role": "system", "content": "You are an expert Python programmer."},
      {"role": "user", "content": "Write a decorator that caches function results."}
    ]
  }'

# With tools
curl -X POST https://api.stack2.9.openclaw.org/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen/qwen2.5-coder-32b",
    "messages": [{"role": "user", "content": "Create a file called hello.py"}],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "write_file",
          "description": "Write content to a file",
          "parameters": {
            "type": "object",
            "properties": {
              "path": {"type": "string"},
              "content": {"type": "string"}
            }
          }
        }
      }
    ]
  }'

OpenAI-Compatible Client Usage

Stack 2.9 works with any OpenAI-compatible client or framework, including LangChain:

# Works with LangChain
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage

chat = ChatOpenAI(
    model="qwen/qwen2.5-coder-32b",
    openai_api_base="https://api.stack2.9.openclaw.org/v1",
    openai_api_key="your-api-key"
)

response = chat([HumanMessage(content="Hello!")])

Webhooks

Configure webhooks for asynchronous events:

{
  "webhook_url": "https://your-server.com/webhook",
  "events": [
    "tool_call.started",
    "tool_call.completed",
    "tool_call.failed",
    "generation.done"
  ]
}

Webhook Payload

{
  "event": "tool_call.completed",
  "timestamp": "2026-04-01T12:00:00Z",
  "data": {
    "tool_call_id": "call_abc123",
    "tool_name": "execute_code",
    "execution_time_ms": 150,
    "result": "Hello, World!"
  }
}
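
A minimal receiver sketch using Python's standard library (no signature-verification scheme is documented, so none is shown; the port is illustrative):

import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        if payload["event"] == "tool_call.completed":
            data = payload["data"]
            print(f"{data['tool_name']} finished in {data['execution_time_ms']} ms")
        self.send_response(200)  # acknowledge receipt
        self.end_headers()

HTTPServer(("", 8080), WebhookHandler).serve_forever()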

Best Practices

1. Use Streaming for Better UX

Always use streaming for long-form content to provide real-time feedback to users.

2. Implement Proper Error Handling

Always handle rate limits and authentication errors gracefully with exponential backoff.
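
For example, a minimal backoff helper around the Python SDK (retry count and wait times are illustrative):

import random
import time

import openai

def create_with_backoff(client, max_retries=5, **kwargs):
    # Wait 1s, 2s, 4s, ... plus jitter between attempts
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(**kwargs)
        except openai.RateLimitError:
            if attempt == max_retries - 1:
                raise
            time.sleep((2 ** attempt) + random.random())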

3. Cache Responses

Cache frequent queries to reduce API calls and improve response times.

from functools import lru_cache

@lru_cache(maxsize=100)
def cached_completion(prompt: str) -> str:
    # lru_cache keys on the prompt string, so identical
    # prompts skip the API call entirely
    response = client.chat.completions.create(
        model="qwen/qwen2.5-coder-32b",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

4. Use Appropriate Temperature

Task              Temperature
----------------  -----------
Code generation   0.0 - 0.3
Factual Q&A       0.0 - 0.2
Creative writing  0.7 - 1.0
Brainstorming     0.8 - 1.2

5. Monitor Token Usage

Track usage to stay within rate limits:

response = client.chat.completions.create(...)
print(f"Tokens used: {response.usage.total_tokens}")

Support


API Version: 1.0
Last Updated: 2026-04-01
Status: Active