# z.ai provider + MCP proxy (implemented)

This document describes the z.ai integration that is implemented on the `feat/zai-passthrough-mcp` branch: what was added, how it works internally, and how to validate it.

Related deep dives:
- [`docs/zai/provider.md`](provider.md)
- [`docs/zai/mcp.md`](mcp.md)
- [`docs/zai/vision-mcp.md`](vision-mcp.md)
- [`docs/proxy/auth.md`](../proxy/auth.md)
- [`docs/proxy/accounts.md`](../proxy/accounts.md)

## Scope (current)
- z.ai is integrated as an **optional upstream** for **Anthropic/Claude protocol only** (`/v1/messages`, `/v1/messages/count_tokens`).
- OpenAI and Gemini protocol handlers are unchanged and continue to use the existing Google-backed pool.
- z.ai MCP (Search + Reader) is exposed via local proxy endpoints (reverse proxy) and injects the z.ai API key upstream.
- Vision MCP is exposed via a **built-in MCP server** (local endpoint) and uses the stored z.ai API key to call the z.ai vision API.

## Configuration
All settings are persisted in the existing data directory (same place as Google accounts and `gui_config.json`).

### Proxy auth
- `proxy.auth_mode` (`off` | `strict` | `all_except_health` | `auto`)
  - `off`: no auth required
  - `strict`: auth required for all routes
  - `all_except_health`: auth required for all routes except `GET /healthz`
  - `auto`: if `allow_lan_access=true` -> `all_except_health`, else `off`
- `proxy.api_key`: required when auth is enabled
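
The mode semantics above can be sketched as a small resolver. This is only an illustration under stated assumptions: the enum name `ProxyAuthMode` comes from this document, but the variant names and the `effective_mode` / `requires_auth` helpers are hypothetical and may not match the code in `security.rs`.

```rust
/// Configured mode, mirroring the four documented `proxy.auth_mode` values.
/// Variant names are assumed for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProxyAuthMode {
    Off,
    Strict,
    AllExceptHealth,
    Auto,
}

/// Resolve `auto` into a concrete mode; every other mode passes through unchanged.
pub fn effective_mode(mode: ProxyAuthMode, allow_lan_access: bool) -> ProxyAuthMode {
    match mode {
        ProxyAuthMode::Auto if allow_lan_access => ProxyAuthMode::AllExceptHealth,
        ProxyAuthMode::Auto => ProxyAuthMode::Off,
        other => other,
    }
}

/// Decide whether a request must present `proxy.api_key`.
pub fn requires_auth(
    mode: ProxyAuthMode,
    allow_lan_access: bool,
    method: &str,
    path: &str,
) -> bool {
    match effective_mode(mode, allow_lan_access) {
        ProxyAuthMode::Off => false,
        ProxyAuthMode::Strict => true,
        // Only `GET /healthz` is exempt in this mode.
        ProxyAuthMode::AllExceptHealth => !(method == "GET" && path == "/healthz"),
        // `effective_mode` never returns `Auto`.
        ProxyAuthMode::Auto => unreachable!(),
    }
}
```

With this shape, `requires_auth(ProxyAuthMode::Auto, true, "GET", "/healthz")` is `false`, while any other route under the same settings requires the key.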

Implementation:
- Backend enum: [`src-tauri/src/proxy/config.rs`](../../src-tauri/src/proxy/config.rs) (`ProxyAuthMode`)
- Effective policy resolver: [`src-tauri/src/proxy/security.rs`](../../src-tauri/src/proxy/security.rs)
- Middleware enforcement: [`src-tauri/src/proxy/middleware/auth.rs`](../../src-tauri/src/proxy/middleware/auth.rs)

### z.ai provider
Config lives under `proxy.zai` (`src-tauri/src/proxy/config.rs`):
- `enabled: bool`
- `base_url: string` (default `https://api.z.ai/api/anthropic`)
- `api_key: string`
- `dispatch_mode: off | exclusive | pooled | fallback`
  - `off`: never use z.ai
  - `exclusive`: all Claude protocol requests go to z.ai
  - `pooled`: z.ai is treated as **one additional slot** in the shared pool (no priority, no strict guarantee)
  - `fallback`: z.ai is used only when the Google pool has 0 accounts
- `models`: defaults used when the incoming Anthropic request uses `claude-*` model ids
  - `opus` default `glm-4.7`
  - `sonnet` default `glm-4.7`
  - `haiku` default `glm-4.5-air`
- `model_mapping`: optional exact-match overrides (`{ "<incoming_model>": "<glm-model-id>" }`)
  - When a key matches the incoming `model` string, it is replaced with the mapped z.ai model id before forwarding upstream.
- `mcp` toggles:
  - `enabled`
  - `web_search_enabled`
  - `web_reader_enabled`
  - `vision_enabled`
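
A minimal sketch of how `models` and `model_mapping` could combine into a single lookup. The precedence used here (exact match, then `glm-*` passthrough, then `claude-*` family defaults) is the one described under request routing in this document. The type and function names are hypothetical, and two behaviours are assumptions rather than documented facts: a `claude-*` id that names no family falls back to the `sonnet` default, and any other id is passed through unchanged.

```rust
use std::collections::HashMap;

/// Default GLM targets for the three Claude families (values from the defaults listed above).
pub struct ZaiModelDefaults {
    pub opus: String,
    pub sonnet: String,
    pub haiku: String,
}

impl Default for ZaiModelDefaults {
    fn default() -> Self {
        Self {
            opus: "glm-4.7".to_string(),
            sonnet: "glm-4.7".to_string(),
            haiku: "glm-4.5-air".to_string(),
        }
    }
}

/// Resolve the model id to send upstream to z.ai.
pub fn resolve_zai_model(
    incoming: &str,
    model_mapping: &HashMap<String, String>,
    defaults: &ZaiModelDefaults,
) -> String {
    // 1. An exact-match override always wins.
    if let Some(mapped) = model_mapping.get(incoming) {
        return mapped.clone();
    }
    // 2. Native GLM ids are forwarded unchanged.
    if incoming.starts_with("glm-") {
        return incoming.to_string();
    }
    // 3. Claude ids are mapped by family name.
    if incoming.starts_with("claude-") {
        if incoming.contains("opus") {
            return defaults.opus.clone();
        }
        if incoming.contains("haiku") {
            return defaults.haiku.clone();
        }
        // Assumption: ids that name no known family use the sonnet default.
        return defaults.sonnet.clone();
    }
    // Assumption: anything else is passed through as-is.
    incoming.to_string()
}
```

Putting the exact-match check first means a single override can redirect one specific `claude-*` id without changing the family defaults.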

Runtime hot update:
- `save_config` hot-updates `auth`, `upstream_proxy`, the model mappings, and the `z.ai` settings without a restart.
  - `src-tauri/src/commands/mod.rs` calls `axum_server.update_security(...)` and `axum_server.update_zai(...)`.

## Request routing

### `/v1/messages` (Anthropic messages)
Handler: `src-tauri/src/proxy/handlers/claude.rs` (`handle_messages`)

Flow:
1. The handler receives `HeaderMap` + raw JSON `Value`.
2. It decides whether to use z.ai or the existing Google flow:
   - If z.ai is disabled -> use Google flow.
   - If `dispatch_mode=exclusive` -> use z.ai.
   - If `dispatch_mode=fallback` -> use z.ai only if Google pool size is 0.
   - If `dispatch_mode=pooled` -> use round-robin across `(google_accounts + 1)` slots; slot `0` is z.ai, others are Google.
3. If z.ai is selected:
   - The raw JSON is forwarded to z.ai as-is (streaming is supported by byte passthrough).
   - The request `model` may be rewritten:
     - if `proxy.zai.model_mapping` contains an exact match, that mapping wins
     - `glm-*` stays unchanged
     - `claude-*` becomes one of `proxy.zai.models.{opus,sonnet,haiku}` based on name match
4. Otherwise:
   - The existing Claude→Gemini transform and Google-backed execution path runs as before.
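
The dispatch decision in step 2 can be sketched as follows. This is an illustration, not the handler's actual code: `DispatchMode`, `Upstream`, and `choose_upstream` are hypothetical names, and the shared `AtomicUsize` counter is an assumed way to implement the round-robin.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Mirrors the documented `dispatch_mode` values (variant names assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DispatchMode {
    Off,
    Exclusive,
    Pooled,
    Fallback,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Upstream {
    Zai,
    Google,
}

/// Pick the upstream for one `/v1/messages` request.
pub fn choose_upstream(
    zai_enabled: bool,
    mode: DispatchMode,
    google_accounts: usize,
    counter: &AtomicUsize,
) -> Upstream {
    if !zai_enabled {
        return Upstream::Google;
    }
    match mode {
        DispatchMode::Off => Upstream::Google,
        DispatchMode::Exclusive => Upstream::Zai,
        DispatchMode::Fallback => {
            if google_accounts == 0 {
                Upstream::Zai
            } else {
                Upstream::Google
            }
        }
        DispatchMode::Pooled => {
            // Round-robin over (google_accounts + 1) slots; slot 0 is z.ai.
            let slots = google_accounts + 1;
            let slot = counter.fetch_add(1, Ordering::Relaxed) % slots;
            if slot == 0 {
                Upstream::Zai
            } else {
                Upstream::Google
            }
        }
    }
}
```

With two Google accounts in `pooled` mode, successive calls in this sketch yield z.ai, Google, Google, z.ai, and so on, so z.ai receives roughly one request in three.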

### `/v1/messages/count_tokens`
Handler: `src-tauri/src/proxy/handlers/claude.rs` (`handle_count_tokens`)
- If z.ai is enabled (mode != off), this request is forwarded to z.ai.
- Otherwise it returns the existing placeholder `{input_tokens: 0, output_tokens: 0}`.

## Upstream forwarding details (z.ai Anthropic)
Provider: `src-tauri/src/proxy/providers/zai_anthropic.rs`

Security / header handling:
- The local proxy API key must **never** be forwarded upstream.
- Only a conservative set of incoming headers is forwarded (e.g. `content-type`, `accept`, `anthropic-version`, `user-agent`).
- z.ai auth is injected:
  - If the client used `x-api-key`, it is replaced with the z.ai key.
  - If the client used `Authorization`, it is replaced with `Bearer <zai_key>`.
  - If neither is present, `x-api-key: <zai_key>` is used.
- Responses are streamed back to the client without parsing SSE.
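
A minimal sketch of the header rules above, using plain `(name, value)` pairs instead of a real `HeaderMap`. The function name is hypothetical, the allowlist contains only the four headers named as examples (the real set may be larger), and emitting both auth headers when the client sent both is an assumption.

```rust
/// Headers that may be forwarded upstream (assumed to be exactly the documented examples).
const FORWARDED_HEADERS: [&str; 4] = ["content-type", "accept", "anthropic-version", "user-agent"];

/// Build the header list sent to z.ai from the client's headers.
pub fn build_upstream_headers(
    incoming: &[(String, String)],
    zai_key: &str,
) -> Vec<(String, String)> {
    let has = |name: &str| incoming.iter().any(|(k, _)| k.eq_ignore_ascii_case(name));
    let used_x_api_key = has("x-api-key");
    let used_authorization = has("authorization");

    // Keep only allowlisted headers; client credentials are dropped here
    // because the auth headers are not on the allowlist.
    let mut out: Vec<(String, String)> = incoming
        .iter()
        .filter(|(k, _)| FORWARDED_HEADERS.iter().any(|a| k.eq_ignore_ascii_case(a)))
        .map(|(k, v)| (k.to_ascii_lowercase(), v.clone()))
        .collect();

    // Inject the z.ai credential in the same style the client used.
    if used_x_api_key {
        out.push(("x-api-key".to_string(), zai_key.to_string()));
    }
    if used_authorization {
        out.push(("authorization".to_string(), format!("Bearer {}", zai_key)));
    }
    if !used_x_api_key && !used_authorization {
        out.push(("x-api-key".to_string(), zai_key.to_string()));
    }
    out
}
```

Because the output is built from an allowlist rather than by deleting known-sensitive headers, a local proxy key cannot leak upstream in this sketch even if it arrives in an unexpected header.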

Networking:
- Respects the global upstream proxy config (`proxy.upstream_proxy`) for outbound HTTP calls.

## MCP reverse proxy (Search + Reader)
Handlers: `src-tauri/src/proxy/handlers/mcp.rs`
Routes: `src-tauri/src/proxy/server.rs`

Local endpoints:
- `/mcp/web_search_prime/mcp` -> `https://api.z.ai/api/mcp/web_search_prime/mcp`
- `/mcp/web_reader/mcp` -> `https://api.z.ai/api/mcp/web_reader/mcp`

Behavior:
- Controlled by `proxy.zai.mcp.*` flags:
  - If `mcp.enabled=false` -> endpoints return 404.
  - If per-server flag is false -> returns 404 for that endpoint.
- The z.ai key is injected upstream as `Authorization: Bearer <zai_key>`.
- The response body is streamed back to the client.
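
The flag gating for these two endpoints can be sketched as a lookup that returns the upstream URL, or `None` where the handler would answer 404. `McpFlags` and `resolve_mcp_upstream` are hypothetical names; only the paths, URLs, and flag names come from this document.

```rust
/// Subset of the `proxy.zai.mcp` toggles relevant to the reverse-proxied servers.
#[derive(Debug, Clone, Copy, Default)]
pub struct McpFlags {
    pub enabled: bool,
    pub web_search_enabled: bool,
    pub web_reader_enabled: bool,
}

/// Map a local MCP path to its z.ai upstream URL.
/// `None` means the endpoint is disabled or unknown (the handler would return 404).
pub fn resolve_mcp_upstream(path: &str, flags: McpFlags) -> Option<&'static str> {
    // The master switch disables every MCP endpoint.
    if !flags.enabled {
        return None;
    }
    match path {
        "/mcp/web_search_prime/mcp" if flags.web_search_enabled => {
            Some("https://api.z.ai/api/mcp/web_search_prime/mcp")
        }
        "/mcp/web_reader/mcp" if flags.web_reader_enabled => {
            Some("https://api.z.ai/api/mcp/web_reader/mcp")
        }
        _ => None,
    }
}
```

Returning the same "not found" result for a disabled endpoint and an unknown path keeps a disabled server indistinguishable from one that does not exist.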

Note:
- These endpoints are still subject to the proxy’s auth middleware depending on `proxy.auth_mode`.

## Vision MCP (built-in server)
Handlers:
- [`src-tauri/src/proxy/handlers/mcp.rs`](../../src-tauri/src/proxy/handlers/mcp.rs) (`handle_zai_mcp_server`)
- [`src-tauri/src/proxy/zai_vision_tools.rs`](../../src-tauri/src/proxy/zai_vision_tools.rs) (tool registry + z.ai vision API client)

Local endpoint:
- `/mcp/zai-mcp-server/mcp`

Behavior:
- Controlled by `proxy.zai.mcp.enabled` and `proxy.zai.mcp.vision_enabled`.
  - If `mcp.enabled=false` -> returns 404.
  - If `vision_enabled=false` -> returns 404.
- No z.ai key is required from MCP clients:
  - the proxy injects the stored `proxy.zai.api_key` when calling the z.ai vision API.
- Implements a minimal Streamable HTTP MCP flow:
  - `POST /mcp` supports `initialize`, `tools/list`, `tools/call`
  - `GET /mcp` returns an SSE stream with keep-alive events for an initialized session
  - `DELETE /mcp` terminates a session

Upstream calls:
- z.ai vision endpoint: `https://api.z.ai/api/paas/v4/chat/completions`
- Uses `Authorization: Bearer <zai_key>`
- Default model: `glm-4.6v` (hardcoded for now)

Tool input and limits:
- Images: `.png`, `.jpg`, `.jpeg` up to 5 MB (local files are encoded as `data:<mime>;base64,...`).
- Videos: `.mp4`, `.mov`, `.m4v` up to 8 MB.
- Supported tools:
  - `ui_to_artifact`
  - `extract_text_from_screenshot`
  - `diagnose_error_screenshot`
  - `understand_technical_diagram`
  - `analyze_data_visualization`
  - `ui_diff_check`
  - `analyze_image`
  - `analyze_video`
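
A sketch of the input checks implied by the limits above. The names are hypothetical, and two details are assumptions: "MB" is read as 1024 * 1024 bytes, and extensions are matched case-insensitively. The real tool code in `zai_vision_tools.rs` may differ on both points.

```rust
use std::path::Path;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MediaKind {
    Image,
    Video,
}

// Assumption: "MB" means 1024 * 1024 bytes.
const MAX_IMAGE_BYTES: u64 = 5 * 1024 * 1024;
const MAX_VIDEO_BYTES: u64 = 8 * 1024 * 1024;

/// Classify a local file by extension and enforce the documented size limit.
pub fn classify_media(path: &str, size_bytes: u64) -> Result<MediaKind, String> {
    let ext = Path::new(path)
        .extension()
        .and_then(|e| e.to_str())
        .map(|e| e.to_ascii_lowercase())
        .ok_or_else(|| format!("no file extension: {}", path))?;

    let (kind, limit) = match ext.as_str() {
        "png" | "jpg" | "jpeg" => (MediaKind::Image, MAX_IMAGE_BYTES),
        "mp4" | "mov" | "m4v" => (MediaKind::Video, MAX_VIDEO_BYTES),
        other => return Err(format!("unsupported extension: .{}", other)),
    };

    if size_bytes > limit {
        return Err(format!("file is {} bytes, limit is {} bytes", size_bytes, limit));
    }
    Ok(kind)
}
```

Checking the size before reading and base64-encoding the file avoids building a large `data:` URL for input that would be rejected anyway.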

## UI
Page: `src/pages/ApiProxy.tsx`

Added controls:
- Authorization toggle + mode selector (`off/strict/all_except_health/auto`)
- z.ai block:
  - enable toggle
  - base_url
  - dispatch mode
  - api key input (stored locally)
  - model mapping UI:
    - fetch available model ids from the z.ai upstream (`GET <base_url>/v1/models`)
    - configure default `opus/sonnet/haiku` mapping
    - configure optional exact-match overrides
  - MCP toggles + display of local MCP endpoints

Translations:
- `src/locales/en.json`
- `src/locales/zh.json`

## Validation checklist
Build:
- Frontend: `npm run build`
- Backend: `cd src-tauri && cargo build`

Manual (example):
1) Enable proxy auth (strict or all-except-health) and note `proxy.api_key`.
2) Enable z.ai and set:
   - `dispatch_mode=exclusive`
   - `api_key=<your_z.ai.key>`
3) Start proxy and call:
   - `GET http://127.0.0.1:<port>/healthz` (should work without auth in all-except-health; always works in off)
   - `POST http://127.0.0.1:<port>/v1/messages` with `Authorization: Bearer <proxy.api_key>` and a normal Anthropic request body.
4) Enable MCP Search and call local `/mcp/web_search_prime/mcp` via an MCP client (the proxy injects z.ai auth upstream).
5) Enable Vision MCP and verify the tool list:
   - `POST http://127.0.0.1:<port>/mcp/zai-mcp-server/mcp` with a JSON-RPC `initialize`
   - then `POST ...` with `tools/list` using the returned `Mcp-Session-Id` header.

## Known limitations / follow-ups
- Vision MCP currently implements the core methods needed for tool calls but is not yet a full feature-complete MCP server (prompts/resources, resumability, streaming tool output).
- z.ai usage/budget (monitor endpoints) is not implemented yet.
- Claude model list endpoint remains a static stub (`/v1/models/claude`) and is not yet provider-aware.