z.ai provider + MCP proxy (implemented)
This document describes the z.ai integration that is implemented on the feat/zai-passthrough-mcp branch: what was added, how it works internally, and how to validate it.
Related deep dives:
- `docs/zai/provider.md`
- `docs/zai/mcp.md`
- `docs/zai/vision-mcp.md`
- `docs/proxy/auth.md`
- `docs/proxy/accounts.md`
Scope (current)
- z.ai is integrated as an optional upstream for the Anthropic/Claude protocol only (`/v1/messages`, `/v1/messages/count_tokens`).
- OpenAI and Gemini protocol handlers are unchanged and continue to use the existing Google-backed pool.
- z.ai MCP (Search + Reader) is exposed via local proxy endpoints (reverse proxy) and injects the z.ai API key upstream.
- Vision MCP is exposed via a built-in MCP server (local endpoint) and uses the stored z.ai API key to call the z.ai vision API.
Configuration
All settings are persisted in the existing data directory (the same place as Google accounts and `gui_config.json`).
Proxy auth
- `proxy.auth_mode` (`off` | `strict` | `all_except_health` | `auto`):
  - `off`: no auth required
  - `strict`: auth required for all routes
  - `all_except_health`: auth required for all routes except `GET /healthz`
  - `auto`: if `allow_lan_access=true` -> `all_except_health`, else `off`
- `proxy.api_key`: required when auth is enabled
Implementation:
- Backend enum: `src-tauri/src/proxy/config.rs` (`ProxyAuthMode`)
- Effective policy resolver: `src-tauri/src/proxy/security.rs`
- Middleware enforcement: `src-tauri/src/proxy/middleware/auth.rs`
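To make the mode table concrete, here is a minimal sketch of how the effective policy could be resolved and enforced. All names (`ProxyAuthMode`, `EffectivePolicy`, `resolve_effective_policy`, `requires_auth`) are illustrative, not the actual items in `security.rs`:

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum ProxyAuthMode { Off, Strict, AllExceptHealth, Auto }

#[derive(Clone, Copy, Debug, PartialEq)]
enum EffectivePolicy { Off, Strict, AllExceptHealth }

fn resolve_effective_policy(mode: ProxyAuthMode, allow_lan_access: bool) -> EffectivePolicy {
    match mode {
        ProxyAuthMode::Off => EffectivePolicy::Off,
        ProxyAuthMode::Strict => EffectivePolicy::Strict,
        ProxyAuthMode::AllExceptHealth => EffectivePolicy::AllExceptHealth,
        // `auto` keys off whether the proxy is exposed beyond localhost.
        ProxyAuthMode::Auto if allow_lan_access => EffectivePolicy::AllExceptHealth,
        ProxyAuthMode::Auto => EffectivePolicy::Off,
    }
}

/// Returns true when the request must carry the proxy API key.
fn requires_auth(policy: EffectivePolicy, method: &str, path: &str) -> bool {
    match policy {
        EffectivePolicy::Off => false,
        EffectivePolicy::Strict => true,
        EffectivePolicy::AllExceptHealth => !(method == "GET" && path == "/healthz"),
    }
}

fn main() {
    let p = resolve_effective_policy(ProxyAuthMode::Auto, true);
    assert_eq!(p, EffectivePolicy::AllExceptHealth);
    assert!(!requires_auth(p, "GET", "/healthz"));
    assert!(requires_auth(p, "POST", "/v1/messages"));
    println!("ok");
}
```

The middleware itself would run this check before the route handlers and compare the presented key against `proxy.api_key`.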
z.ai provider
Config lives under `proxy.zai` (`src-tauri/src/proxy/config.rs`):
- `enabled: bool`
- `base_url: string` (default `https://api.z.ai/api/anthropic`)
- `api_key: string`
- `dispatch_mode: off | exclusive | pooled | fallback`
  - `off`: never use z.ai
  - `exclusive`: all Claude protocol requests go to z.ai
  - `pooled`: z.ai is treated as one additional slot in the shared pool (no priority, no strict guarantee)
  - `fallback`: z.ai is used only when the Google pool has 0 accounts
- `models`: defaults used when the incoming Anthropic request uses `claude-*` model ids
  - `opus` default: `glm-4.7`
  - `sonnet` default: `glm-4.7`
  - `haiku` default: `glm-4.5-air`
- `model_mapping`: optional exact-match overrides (`{ "<incoming_model>": "<glm-model-id>" }`)
  - When a key matches the incoming `model` string, it is replaced with the mapped z.ai model id before forwarding upstream.
- `mcp` toggles: `enabled`, `web_search_enabled`, `web_reader_enabled`, `vision_enabled`
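Assembled from the fields above, a `proxy.zai` block might look like the following sketch. The exact nesting and on-disk layout are assumptions; the values (and the override key) are examples only:

```json
{
  "proxy": {
    "zai": {
      "enabled": true,
      "base_url": "https://api.z.ai/api/anthropic",
      "api_key": "<your-zai-key>",
      "dispatch_mode": "exclusive",
      "models": { "opus": "glm-4.7", "sonnet": "glm-4.7", "haiku": "glm-4.5-air" },
      "model_mapping": { "claude-3-5-haiku-20241022": "glm-4.5-air" },
      "mcp": {
        "enabled": true,
        "web_search_enabled": true,
        "web_reader_enabled": false,
        "vision_enabled": true
      }
    }
  }
}
```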
Runtime hot update:
- `save_config` hot-updates `auth`, `upstream_proxy`, model mappings, and the z.ai settings without restart.
- `src-tauri/src/commands/mod.rs` calls `axum_server.update_security(...)` and `axum_server.update_zai(...)`.
Request routing
/v1/messages (Anthropic messages)
Handler: `src-tauri/src/proxy/handlers/claude.rs` (`handle_messages`)
Flow:
- The handler receives `HeaderMap` + the raw JSON `Value`.
- It decides whether to use z.ai or the existing Google flow:
  - If z.ai is disabled -> use the Google flow.
  - If `dispatch_mode=exclusive` -> use z.ai.
  - If `dispatch_mode=fallback` -> use z.ai only if the Google pool size is 0.
  - If `dispatch_mode=pooled` -> round-robin across `(google_accounts + 1)` slots; slot `0` is z.ai, the others are Google.
- If z.ai is selected:
  - The raw JSON is forwarded to z.ai as-is (streaming is supported by byte passthrough).
  - The request `model` may be rewritten:
    - if `proxy.zai.model_mapping` contains an exact match, that mapping wins
    - `glm-*` stays unchanged
    - `claude-*` becomes one of `proxy.zai.models.{opus,sonnet,haiku}` based on name match
- Otherwise, the existing Claude→Gemini transform and Google-backed execution path runs as before.
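The branching above can be sketched as two pure functions. The names (`DispatchMode`, `choose_backend`, `rewrite_model`) are hypothetical, and the fallback to the `sonnet` default for a `claude-*` id matching neither `opus` nor `haiku` is an assumption; the real logic lives in `handle_messages`:

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, PartialEq, Debug)]
enum DispatchMode { Off, Exclusive, Pooled, Fallback }

#[derive(PartialEq, Debug)]
enum Backend { Zai, Google }

/// `rr` is a round-robin counter shared across requests (slot 0 = z.ai).
fn choose_backend(mode: DispatchMode, google_accounts: usize, rr: usize) -> Backend {
    match mode {
        DispatchMode::Off => Backend::Google,
        DispatchMode::Exclusive => Backend::Zai,
        DispatchMode::Fallback => {
            if google_accounts == 0 { Backend::Zai } else { Backend::Google }
        }
        DispatchMode::Pooled => {
            if rr % (google_accounts + 1) == 0 { Backend::Zai } else { Backend::Google }
        }
    }
}

fn rewrite_model(model: &str, mapping: &HashMap<String, String>,
                 opus: &str, sonnet: &str, haiku: &str) -> String {
    // 1. Exact-match overrides win.
    if let Some(mapped) = mapping.get(model) {
        return mapped.clone();
    }
    // 2. glm-* ids pass through unchanged.
    if model.starts_with("glm-") {
        return model.to_string();
    }
    // 3. claude-* ids fall back to the configured defaults by name match
    //    (defaulting to `sonnet` otherwise is an assumption of this sketch).
    if model.starts_with("claude-") {
        if model.contains("opus") { return opus.to_string(); }
        if model.contains("haiku") { return haiku.to_string(); }
        return sonnet.to_string();
    }
    model.to_string()
}

fn main() {
    assert_eq!(choose_backend(DispatchMode::Fallback, 0, 0), Backend::Zai);
    let mapping = HashMap::new();
    assert_eq!(rewrite_model("claude-3-opus", &mapping, "glm-4.7", "glm-4.7", "glm-4.5-air"),
               "glm-4.7");
    println!("ok");
}
```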
/v1/messages/count_tokens
Handler: `src-tauri/src/proxy/handlers/claude.rs` (`handle_count_tokens`)
- If z.ai is enabled (mode != `off`), this request is forwarded to z.ai.
- Otherwise it returns the existing placeholder `{input_tokens: 0, output_tokens: 0}`.
Upstream forwarding details (z.ai Anthropic)
Provider: `src-tauri/src/proxy/providers/zai_anthropic.rs`
Security / header handling:
- The local proxy API key must never be forwarded upstream.
- Only a conservative set of incoming headers is forwarded (e.g. `content-type`, `accept`, `anthropic-version`, `user-agent`).
- z.ai auth is injected:
  - If the client used `x-api-key`, it is replaced with the z.ai key.
  - If the client used `Authorization`, it is replaced with `Bearer <zai_key>`.
  - If neither is present, `x-api-key: <zai_key>` is used.
- Responses are streamed back to the client without parsing SSE.
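These rules can be sketched with a plain map standing in for axum's `HeaderMap` (header names assumed lowercase, as HTTP/2-style stacks normalize them). The function name and allow-list constant are illustrative, not the actual items in `zai_anthropic.rs`:

```rust
use std::collections::HashMap;

const FORWARDED: &[&str] = &["content-type", "accept", "anthropic-version", "user-agent"];

fn build_upstream_headers(incoming: &HashMap<String, String>, zai_key: &str)
    -> HashMap<String, String>
{
    let mut out = HashMap::new();
    // Forward only the conservative allow-list; everything else, including
    // the local proxy API key, is dropped.
    for name in FORWARDED {
        if let Some(v) = incoming.get(*name) {
            out.insert(name.to_string(), v.clone());
        }
    }
    // Inject z.ai auth in the same style the client used.
    if incoming.contains_key("x-api-key") {
        out.insert("x-api-key".into(), zai_key.into());
    } else if incoming.contains_key("authorization") {
        out.insert("authorization".into(), format!("Bearer {zai_key}"));
    } else {
        out.insert("x-api-key".into(), zai_key.into());
    }
    out
}

fn main() {
    let mut incoming = HashMap::new();
    incoming.insert("authorization".to_string(), "Bearer proxy-key".to_string());
    incoming.insert("content-type".to_string(), "application/json".to_string());
    let up = build_upstream_headers(&incoming, "zai-secret");
    assert_eq!(up.get("authorization").unwrap(), "Bearer zai-secret");
    assert!(!up.values().any(|v| v.contains("proxy-key")));
    println!("ok");
}
```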
Networking:
- Respects the global upstream proxy config (`proxy.upstream_proxy`) for outbound HTTP calls.
MCP reverse proxy (Search + Reader)
Handlers: `src-tauri/src/proxy/handlers/mcp.rs`
Routes: `src-tauri/src/proxy/server.rs`
Local endpoints:
- `/mcp/web_search_prime/mcp` → `https://api.z.ai/api/mcp/web_search_prime/mcp`
- `/mcp/web_reader/mcp` → `https://api.z.ai/api/mcp/web_reader/mcp`
Behavior:
- Controlled by `proxy.zai.mcp.*` flags:
  - If `mcp.enabled=false` -> endpoints return 404.
  - If a per-server flag is false -> returns 404 for that endpoint.
- The z.ai key is injected upstream as `Authorization: Bearer <zai_key>`.
- The response body is streamed back to the client.
Note:
- These endpoints are still subject to the proxy's auth middleware, depending on `proxy.auth_mode`.
Vision MCP (built-in server)
Handlers:
- `src-tauri/src/proxy/handlers/mcp.rs` (`handle_zai_mcp_server`)
- `src-tauri/src/proxy/zai_vision_tools.rs` (tool registry + z.ai vision API client)
Local endpoint:
`/mcp/zai-mcp-server/mcp`
Behavior:
- Controlled by `proxy.zai.mcp.enabled` and `proxy.zai.mcp.vision_enabled`:
  - If `mcp.enabled=false` -> returns 404.
  - If `vision_enabled=false` -> returns 404.
- No z.ai key is required from MCP clients: the proxy injects the stored `proxy.zai.api_key` when calling the z.ai vision API.
- Implements a minimal Streamable HTTP MCP flow:
  - `POST /mcp` supports `initialize`, `tools/list`, and `tools/call`
  - `GET /mcp` returns an SSE stream with keep-alive events for an initialized session
  - `DELETE /mcp` terminates a session
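For reference, a minimal JSON-RPC `initialize` body for this flow might look like the sketch below. The `protocolVersion` and `clientInfo` values are examples, not what this server requires:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "initialize",
  "params": {
    "protocolVersion": "2025-03-26",
    "capabilities": {},
    "clientInfo": { "name": "example-client", "version": "0.1.0" }
  }
}
```

Subsequent `tools/list` and `tools/call` requests carry the session id the server returned for this `initialize`.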
Upstream calls:
- z.ai vision endpoint: `https://api.z.ai/api/paas/v4/chat/completions`
- Uses `Authorization: Bearer <zai_key>`
- Default model: `glm-4.6v` (hardcoded for now)
Tool input and limits:
- Images: `.png`, `.jpg`, `.jpeg` up to 5 MB (local files are encoded as `data:<mime>;base64,...`).
- Videos: `.mp4`, `.mov`, `.m4v` up to 8 MB.
- Supported tools: `ui_to_artifact`, `extract_text_from_screenshot`, `diagnose_error_screenshot`, `understand_technical_diagram`, `analyze_data_visualization`, `ui_diff_check`, `analyze_image`, `analyze_video`
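A sketch of the extension/size gate implied by these limits (the function name and error strings are illustrative, not the actual code in `zai_vision_tools.rs`):

```rust
/// Classifies a media path as "image" or "video" and enforces the size caps
/// described above (5 MB for images, 8 MB for videos).
fn validate_media(path: &str, size_bytes: u64) -> Result<&'static str, String> {
    let ext = path.rsplit('.').next().unwrap_or("").to_ascii_lowercase();
    match ext.as_str() {
        "png" | "jpg" | "jpeg" => {
            if size_bytes <= 5 * 1024 * 1024 { Ok("image") }
            else { Err("image exceeds 5 MB limit".into()) }
        }
        "mp4" | "mov" | "m4v" => {
            if size_bytes <= 8 * 1024 * 1024 { Ok("video") }
            else { Err("video exceeds 8 MB limit".into()) }
        }
        _ => Err(format!("unsupported extension: .{ext}")),
    }
}

fn main() {
    assert_eq!(validate_media("shot.png", 1024), Ok("image"));
    assert!(validate_media("clip.mp4", 9 * 1024 * 1024).is_err());
    println!("ok");
}
```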
UI
Page: `src/pages/ApiProxy.tsx`
Added controls:
- Authorization toggle + mode selector (`off`/`strict`/`all_except_health`/`auto`)
- z.ai block:
  - enable toggle
  - `base_url`
  - dispatch mode
  - API key input (stored locally)
  - model mapping UI:
    - fetch available model ids from the z.ai upstream (`GET <base_url>/v1/models`)
    - configure the default `opus`/`sonnet`/`haiku` mapping
    - configure optional exact-match overrides
- MCP toggles + display of local MCP endpoints
Translations:
- `src/locales/en.json`
- `src/locales/zh.json`
Validation checklist
Build:
- Frontend: `npm run build`
- Backend: `cd src-tauri && cargo build`
Manual (example):
- Enable proxy auth (`strict` or `all_except_health`) and note `proxy.api_key`.
- Enable z.ai and set:
  - `dispatch_mode=exclusive`
  - `api_key=<your_z.ai_key>`
- Start the proxy and call:
  - `GET http://127.0.0.1:<port>/healthz` (should work without auth in `all_except_health`; always works in `off`)
  - `POST http://127.0.0.1:<port>/v1/messages` with `Authorization: Bearer <proxy.api_key>` and a normal Anthropic request body.
- Enable MCP Search and call the local `/mcp/web_search_prime/mcp` via an MCP client (the proxy injects z.ai auth upstream).
- Enable Vision MCP and verify the tool list:
  - `POST http://127.0.0.1:<port>/mcp/zai-mcp-server/mcp` with a JSON-RPC `initialize`
  - then `POST ...` with `tools/list` using the returned `Mcp-Session-Id` header.
Known limitations / follow-ups
- Vision MCP currently implements only the core methods needed for tool calls; it is not yet a feature-complete MCP server (no prompts/resources, resumability, or streaming tool output).
- z.ai usage/budget (monitor endpoints) is not implemented yet.
- The Claude model list endpoint (`/v1/models/claude`) remains a static stub and is not yet provider-aware.