# z.ai provider + MCP proxy (implemented)
This document describes the z.ai integration that is implemented on the `feat/zai-passthrough-mcp` branch: what was added, how it works internally, and how to validate it.
Related deep dives:
- [`docs/zai/provider.md`](provider.md)
- [`docs/zai/mcp.md`](mcp.md)
- [`docs/zai/vision-mcp.md`](vision-mcp.md)
- [`docs/proxy/auth.md`](../proxy/auth.md)
- [`docs/proxy/accounts.md`](../proxy/accounts.md)
## Scope (current)
- z.ai is integrated as an **optional upstream** for **Anthropic/Claude protocol only** (`/v1/messages`, `/v1/messages/count_tokens`).
- OpenAI and Gemini protocol handlers are unchanged and continue to use the existing Google-backed pool.
- z.ai MCP (Search + Reader) is exposed via local proxy endpoints (reverse proxy) and injects the z.ai API key upstream.
- Vision MCP is exposed via a **built-in MCP server** (local endpoint) and uses the stored z.ai API key to call the z.ai vision API.
## Configuration
All settings are persisted in the existing data directory (same place as Google accounts and `gui_config.json`).
### Proxy auth
- `proxy.auth_mode` (`off` | `strict` | `all_except_health` | `auto`)
- `off`: no auth required
- `strict`: auth required for all routes
- `all_except_health`: auth required for all routes except `GET /healthz`
- `auto`: if `allow_lan_access=true` -> `all_except_health`, else `off`
- `proxy.api_key`: required when auth is enabled
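The mode resolution above can be sketched as follows. This is an illustrative sketch, not the actual code in `security.rs`; the enum variants mirror the config values, but function names are assumptions:

```rust
// Sketch of the effective-auth-policy resolution described above.
// Names are illustrative; the real resolver lives in src-tauri/src/proxy/security.rs.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ProxyAuthMode {
    Off,
    Strict,
    AllExceptHealth,
    Auto,
}

/// `auto` collapses to `all_except_health` when LAN access is allowed, else `off`.
fn effective_mode(configured: ProxyAuthMode, allow_lan_access: bool) -> ProxyAuthMode {
    match configured {
        ProxyAuthMode::Auto if allow_lan_access => ProxyAuthMode::AllExceptHealth,
        ProxyAuthMode::Auto => ProxyAuthMode::Off,
        other => other,
    }
}

/// Decide whether a route requires the proxy API key (after resolving `auto`).
fn requires_auth(mode: ProxyAuthMode, path: &str) -> bool {
    match mode {
        ProxyAuthMode::Off => false,
        ProxyAuthMode::Strict => true,
        ProxyAuthMode::AllExceptHealth => path != "/healthz",
        ProxyAuthMode::Auto => unreachable!("resolve with effective_mode first"),
    }
}

fn main() {
    let mode = effective_mode(ProxyAuthMode::Auto, true);
    assert_eq!(mode, ProxyAuthMode::AllExceptHealth);
    assert!(!requires_auth(mode, "/healthz"));
    assert!(requires_auth(mode, "/v1/messages"));
    assert_eq!(effective_mode(ProxyAuthMode::Auto, false), ProxyAuthMode::Off);
    println!("auth policy sketch ok");
}
```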
Implementation:
- Backend enum: [`src-tauri/src/proxy/config.rs`](../../src-tauri/src/proxy/config.rs) (`ProxyAuthMode`)
- Effective policy resolver: [`src-tauri/src/proxy/security.rs`](../../src-tauri/src/proxy/security.rs)
- Middleware enforcement: [`src-tauri/src/proxy/middleware/auth.rs`](../../src-tauri/src/proxy/middleware/auth.rs)
### z.ai provider
Config lives under `proxy.zai` (`src-tauri/src/proxy/config.rs`):
- `enabled: bool`
- `base_url: string` (default `https://api.z.ai/api/anthropic`)
- `api_key: string`
- `dispatch_mode: off | exclusive | pooled | fallback`
- `off`: never use z.ai
- `exclusive`: all Claude protocol requests go to z.ai
- `pooled`: z.ai is treated as **one additional slot** in the shared pool (no priority, no strict guarantee)
- `fallback`: z.ai is used only when the Google pool has 0 accounts
- `models`: defaults used when the incoming Anthropic request uses `claude-*` model ids
- `opus` default `glm-4.7`
- `sonnet` default `glm-4.7`
- `haiku` default `glm-4.5-air`
- `model_mapping`: optional exact-match overrides (`{ "<incoming_model>": "<glm-model-id>" }`)
- When a key matches the incoming `model` string, it is replaced with the mapped z.ai model id before forwarding upstream.
- `mcp` toggles:
- `enabled`
- `web_search_enabled`
- `web_reader_enabled`
- `vision_enabled`
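The model-resolution rules above (exact `model_mapping` match wins, `glm-*` passes through, `claude-*` falls back to the configured defaults) can be sketched like this. The function name and the substring-based `opus`/`haiku` matching are assumptions; the doc only says the default is chosen "based on name match":

```rust
use std::collections::HashMap;

/// Sketch of the model-rewrite precedence described above (names illustrative).
fn resolve_model(
    incoming: &str,
    mapping: &HashMap<String, String>,
    opus: &str,
    sonnet: &str,
    haiku: &str,
) -> String {
    // 1. An exact model_mapping entry always wins.
    if let Some(mapped) = mapping.get(incoming) {
        return mapped.clone();
    }
    // 2. Native z.ai model ids pass through unchanged.
    if incoming.starts_with("glm-") {
        return incoming.to_string();
    }
    // 3. claude-* ids fall back to the configured tier defaults.
    if incoming.contains("opus") {
        return opus.to_string();
    }
    if incoming.contains("haiku") {
        return haiku.to_string();
    }
    sonnet.to_string()
}

fn main() {
    let mut mapping = HashMap::new();
    mapping.insert("claude-3-5-haiku".to_string(), "glm-4.5-air".to_string());
    assert_eq!(
        resolve_model("claude-3-5-haiku", &mapping, "glm-4.7", "glm-4.7", "glm-4.5-air"),
        "glm-4.5-air"
    );
    assert_eq!(
        resolve_model("glm-4.7", &mapping, "glm-4.7", "glm-4.7", "glm-4.5-air"),
        "glm-4.7"
    );
    assert_eq!(
        resolve_model("claude-opus-4", &mapping, "glm-4.7", "glm-4.7", "glm-4.5-air"),
        "glm-4.7"
    );
    println!("model mapping sketch ok");
}
```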
Runtime hot update:
- `save_config` hot-updates auth, upstream proxy, model mappings, and z.ai settings without a restart.
- `src-tauri/src/commands/mod.rs` calls `axum_server.update_security(...)` and `axum_server.update_zai(...)`.
## Request routing
### `/v1/messages` (Anthropic messages)
Handler: `src-tauri/src/proxy/handlers/claude.rs` (`handle_messages`)
Flow:
1. The handler receives `HeaderMap` + raw JSON `Value`.
2. It decides whether to use z.ai or the existing Google flow:
- If z.ai is disabled -> use Google flow.
- If `dispatch_mode=exclusive` -> use z.ai.
- If `dispatch_mode=fallback` -> use z.ai only if Google pool size is 0.
- If `dispatch_mode=pooled` -> use round-robin across `(google_accounts + 1)` slots; slot `0` is z.ai, others are Google.
3. If z.ai is selected:
   - The request `model` may be rewritten before forwarding:
     - if `proxy.zai.model_mapping` contains an exact match, that mapping wins
     - `glm-*` model ids stay unchanged
     - `claude-*` model ids become one of `proxy.zai.models.{opus,sonnet,haiku}` based on name match
   - Apart from the `model` field, the raw JSON body is forwarded to z.ai as-is; streaming responses are passed back byte-for-byte.
4. Otherwise:
- The existing Claude→Gemini transform and Google-backed execution path runs as before.
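The dispatch decision in steps 1–2 can be sketched as follows. This is a sketch under stated assumptions: the enum and function names are illustrative, and the round-robin counter is simplified to an `AtomicUsize` (the real handler also distinguishes individual Google accounts within the pool):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Upstream {
    Zai,
    GooglePool,
}

#[derive(Debug, Clone, Copy)]
enum DispatchMode {
    Off,
    Exclusive,
    Pooled,
    Fallback,
}

/// Sketch of the z.ai-vs-Google decision in handle_messages.
/// (If z.ai is disabled entirely, the handler never reaches this point.)
fn pick_upstream(mode: DispatchMode, google_accounts: usize, rr: &AtomicUsize) -> Upstream {
    match mode {
        DispatchMode::Off => Upstream::GooglePool,
        DispatchMode::Exclusive => Upstream::Zai,
        DispatchMode::Fallback if google_accounts == 0 => Upstream::Zai,
        DispatchMode::Fallback => Upstream::GooglePool,
        DispatchMode::Pooled => {
            // One extra slot for z.ai: slot 0 is z.ai, slots 1..=N are Google accounts.
            let slot = rr.fetch_add(1, Ordering::Relaxed) % (google_accounts + 1);
            if slot == 0 { Upstream::Zai } else { Upstream::GooglePool }
        }
    }
}

fn main() {
    let rr = AtomicUsize::new(0);
    let picks: Vec<Upstream> =
        (0..4).map(|_| pick_upstream(DispatchMode::Pooled, 2, &rr)).collect();
    assert_eq!(
        picks,
        vec![Upstream::Zai, Upstream::GooglePool, Upstream::GooglePool, Upstream::Zai]
    );
    assert_eq!(pick_upstream(DispatchMode::Fallback, 0, &rr), Upstream::Zai);
    assert_eq!(pick_upstream(DispatchMode::Fallback, 3, &rr), Upstream::GooglePool);
    println!("dispatch sketch ok");
}
```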
### `/v1/messages/count_tokens`
Handler: `src-tauri/src/proxy/handlers/claude.rs` (`handle_count_tokens`)
- If z.ai is enabled (mode != off), this request is forwarded to z.ai.
- Otherwise it returns the existing placeholder `{input_tokens: 0, output_tokens: 0}`.
## Upstream forwarding details (z.ai Anthropic)
Provider: `src-tauri/src/proxy/providers/zai_anthropic.rs`
Security / header handling:
- The local proxy API key must **never** be forwarded upstream.
- Only a conservative set of incoming headers is forwarded (e.g. `content-type`, `accept`, `anthropic-version`, `user-agent`).
- z.ai auth is injected:
- If the client used `x-api-key`, it is replaced with z.ai key.
- If the client used `Authorization`, it is replaced with `Bearer <zai_key>`.
- If neither is present, `x-api-key: <zai_key>` is used.
- Responses are streamed back to the client without parsing SSE.
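The header policy above can be sketched as a pure function over lowercase header names. This is illustrative only (the real provider works with axum/hyper `HeaderMap`s, not `HashMap`s), but it shows the allowlist plus the three auth-injection cases:

```rust
use std::collections::HashMap;

/// Sketch of the upstream header policy: forward only an allowlist,
/// never forward the local proxy credentials, and inject the z.ai key
/// in whichever style (x-api-key vs Authorization) the client used.
fn build_upstream_headers(
    incoming: &HashMap<String, String>, // keys assumed lowercased
    zai_key: &str,
) -> HashMap<String, String> {
    const FORWARD: [&str; 4] = ["content-type", "accept", "anthropic-version", "user-agent"];
    let mut out = HashMap::new();
    for name in FORWARD {
        if let Some(value) = incoming.get(name) {
            out.insert(name.to_string(), value.clone());
        }
    }
    // Replace client credentials with the z.ai key; the local proxy
    // API key is dropped and never reaches the upstream.
    if incoming.contains_key("x-api-key") {
        out.insert("x-api-key".to_string(), zai_key.to_string());
    } else if incoming.contains_key("authorization") {
        out.insert("authorization".to_string(), format!("Bearer {zai_key}"));
    } else {
        out.insert("x-api-key".to_string(), zai_key.to_string());
    }
    out
}

fn main() {
    let mut incoming = HashMap::new();
    incoming.insert("authorization".to_string(), "Bearer local-proxy-key".to_string());
    incoming.insert("content-type".to_string(), "application/json".to_string());
    let out = build_upstream_headers(&incoming, "zai-secret");
    assert_eq!(out.get("authorization").map(String::as_str), Some("Bearer zai-secret"));
    assert_eq!(out.get("content-type").map(String::as_str), Some("application/json"));
    assert!(!out.values().any(|v| v.contains("local-proxy-key")));
    println!("header sketch ok");
}
```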
Networking:
- Respects the global upstream proxy config (`proxy.upstream_proxy`) for outbound HTTP calls.
## MCP reverse proxy (Search + Reader)
Handlers: `src-tauri/src/proxy/handlers/mcp.rs`
Routes: `src-tauri/src/proxy/server.rs`
Local endpoints:
- `/mcp/web_search_prime/mcp` → `https://api.z.ai/api/mcp/web_search_prime/mcp`
- `/mcp/web_reader/mcp` → `https://api.z.ai/api/mcp/web_reader/mcp`
Behavior:
- Controlled by `proxy.zai.mcp.*` flags:
- If `mcp.enabled=false` -> endpoints return 404.
- If the per-server flag is false -> that endpoint returns 404.
- z.ai key is injected upstream as `Authorization: Bearer <zai_key>`.
- Response body is streamed back to the client.
Note:
- These endpoints are still subject to the proxy’s auth middleware depending on `proxy.auth_mode`.
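The gating plus upstream mapping can be sketched in a few lines (struct and function names are illustrative; the flag fields mirror `proxy.zai.mcp.*`):

```rust
/// Sketch of the 404 gating and upstream mapping for the two
/// reverse-proxied MCP endpoints. Names are illustrative.
struct McpFlags {
    enabled: bool,
    web_search_enabled: bool,
    web_reader_enabled: bool,
}

fn resolve_mcp_upstream(flags: &McpFlags, path: &str) -> Result<&'static str, u16> {
    // Global kill switch first, then the per-server flag.
    if !flags.enabled {
        return Err(404);
    }
    match path {
        "/mcp/web_search_prime/mcp" if flags.web_search_enabled => {
            Ok("https://api.z.ai/api/mcp/web_search_prime/mcp")
        }
        "/mcp/web_reader/mcp" if flags.web_reader_enabled => {
            Ok("https://api.z.ai/api/mcp/web_reader/mcp")
        }
        _ => Err(404),
    }
}

fn main() {
    let flags = McpFlags { enabled: true, web_search_enabled: true, web_reader_enabled: false };
    assert_eq!(
        resolve_mcp_upstream(&flags, "/mcp/web_search_prime/mcp"),
        Ok("https://api.z.ai/api/mcp/web_search_prime/mcp")
    );
    assert_eq!(resolve_mcp_upstream(&flags, "/mcp/web_reader/mcp"), Err(404));
    let off = McpFlags { enabled: false, web_search_enabled: true, web_reader_enabled: true };
    assert_eq!(resolve_mcp_upstream(&off, "/mcp/web_search_prime/mcp"), Err(404));
    println!("mcp gating sketch ok");
}
```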
## Vision MCP (built-in server)
Handlers:
- [`src-tauri/src/proxy/handlers/mcp.rs`](../../src-tauri/src/proxy/handlers/mcp.rs) (`handle_zai_mcp_server`)
- [`src-tauri/src/proxy/zai_vision_tools.rs`](../../src-tauri/src/proxy/zai_vision_tools.rs) (tool registry + z.ai vision API client)
Local endpoint:
- `/mcp/zai-mcp-server/mcp`
Behavior:
- Controlled by `proxy.zai.mcp.enabled` and `proxy.zai.mcp.vision_enabled`.
- If `mcp.enabled=false` -> returns 404.
- If `vision_enabled=false` -> returns 404.
- No z.ai key is required from MCP clients:
- the proxy injects the stored `proxy.zai.api_key` when calling the z.ai vision API.
- Implements a minimal Streamable HTTP MCP flow:
- `POST /mcp` supports `initialize`, `tools/list`, `tools/call`
- `GET /mcp` returns an SSE stream with keep-alive events for an initialized session
- `DELETE /mcp` terminates a session
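The session lifecycle behind this flow can be sketched as a minimal in-memory store. This is a simplification with assumed names; JSON-RPC parsing, SSE keep-alives, and concurrency are elided:

```rust
use std::collections::HashMap;

/// Sketch of the Streamable HTTP session lifecycle: `initialize` mints an
/// Mcp-Session-Id, follow-up requests must present it, DELETE removes it.
struct Sessions {
    next: u64,
    live: HashMap<String, ()>,
}

impl Sessions {
    fn new() -> Self {
        Sessions { next: 0, live: HashMap::new() }
    }

    /// POST `initialize` -> mint a session id the client echoes back
    /// in the `Mcp-Session-Id` header on later requests.
    fn initialize(&mut self) -> String {
        self.next += 1;
        let id = format!("session-{}", self.next);
        self.live.insert(id.clone(), ());
        id
    }

    /// `tools/list`, `tools/call`, and the GET SSE stream require a live session.
    fn is_initialized(&self, id: &str) -> bool {
        self.live.contains_key(id)
    }

    /// DELETE /mcp -> terminate the session.
    fn terminate(&mut self, id: &str) -> bool {
        self.live.remove(id).is_some()
    }
}

fn main() {
    let mut sessions = Sessions::new();
    let id = sessions.initialize();
    assert!(sessions.is_initialized(&id));
    assert!(sessions.terminate(&id));
    assert!(!sessions.is_initialized(&id));
    println!("session sketch ok");
}
```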
Upstream calls:
- z.ai vision endpoint: `https://api.z.ai/api/paas/v4/chat/completions`
- Uses `Authorization: Bearer <zai_key>`
- Default model: `glm-4.6v` (hardcoded for now)
Tool input and limits:
- Images: `.png`, `.jpg`, `.jpeg` up to 5 MB (local files are encoded as `data:<mime>;base64,...`).
- Videos: `.mp4`, `.mov`, `.m4v` up to 8 MB.
- Supported tools:
- `ui_to_artifact`
- `extract_text_from_screenshot`
- `diagnose_error_screenshot`
- `understand_technical_diagram`
- `analyze_data_visualization`
- `ui_diff_check`
- `analyze_image`
- `analyze_video`
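The input limits above can be sketched as a validation helper. The size caps and extension lists are as documented; the function name and error strings are illustrative:

```rust
/// Sketch of the tool-input validation: extension allowlists plus
/// per-kind size caps (5 MB images, 8 MB videos).
const MAX_IMAGE_BYTES: u64 = 5 * 1024 * 1024;
const MAX_VIDEO_BYTES: u64 = 8 * 1024 * 1024;

fn validate_media(path: &str, size_bytes: u64) -> Result<&'static str, String> {
    let ext = path.rsplit('.').next().unwrap_or("").to_ascii_lowercase();
    match ext.as_str() {
        "png" | "jpg" | "jpeg" => {
            if size_bytes <= MAX_IMAGE_BYTES {
                Ok("image") // local files would then be encoded as data:<mime>;base64,...
            } else {
                Err(format!("image exceeds 5 MB: {size_bytes} bytes"))
            }
        }
        "mp4" | "mov" | "m4v" => {
            if size_bytes <= MAX_VIDEO_BYTES {
                Ok("video")
            } else {
                Err(format!("video exceeds 8 MB: {size_bytes} bytes"))
            }
        }
        other => Err(format!("unsupported extension: .{other}")),
    }
}

fn main() {
    assert_eq!(validate_media("shot.PNG", 1024), Ok("image"));
    assert_eq!(validate_media("clip.mov", 1024), Ok("video"));
    assert!(validate_media("clip.mp4", 9 * 1024 * 1024).is_err());
    assert!(validate_media("notes.txt", 10).is_err());
    println!("validation sketch ok");
}
```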
## UI
Page: `src/pages/ApiProxy.tsx`
Added controls:
- Authorization toggle + mode selector (`off/strict/all_except_health/auto`)
- z.ai block:
- enable toggle
- base_url
- dispatch mode
- api key input (stored locally)
- model mapping UI:
- fetch available model ids from the z.ai upstream (`GET <base_url>/v1/models`)
- configure default `opus/sonnet/haiku` mapping
- configure optional exact-match overrides
- MCP toggles + display of local MCP endpoints
Translations:
- `src/locales/en.json`
- `src/locales/zh.json`
## Validation checklist
Build:
- Frontend: `npm run build`
- Backend: `cd src-tauri && cargo build`
Manual (example):
1) Enable proxy auth (strict or all-except-health) and note `proxy.api_key`.
2) Enable z.ai and set:
- `dispatch_mode=exclusive`
- `api_key=<your_z.ai.key>`
3) Start proxy and call:
- `GET http://127.0.0.1:<port>/healthz` (should work without auth in all-except-health; always works in off)
- `POST http://127.0.0.1:<port>/v1/messages` with `Authorization: Bearer <proxy.api_key>` and a normal Anthropic request body.
4) Enable MCP Search and call local `/mcp/web_search_prime/mcp` via an MCP client (the proxy injects z.ai auth upstream).
5) Enable Vision MCP and verify the tool list:
- `POST http://127.0.0.1:<port>/mcp/zai-mcp-server/mcp` with a JSON-RPC `initialize`
- then `POST ...` with `tools/list` using the returned `Mcp-Session-Id` header.
## Known limitations / follow-ups
- Vision MCP currently implements the core methods needed for tool calls but is not yet a full feature-complete MCP server (prompts/resources, resumability, streaming tool output).
- z.ai usage/budget (monitor endpoints) is not implemented yet.
- Claude model list endpoint remains a static stub (`/v1/models/claude`) and is not yet provider-aware.