cacodex committed on
Commit
b0f7d3f
·
verified ·
1 Parent(s): 9c1527e

Upload 10 files

Files changed (7)
  1. .env.example +7 -10
  2. README.md +92 -135
  3. app/main.py +507 -933
  4. requirements.txt +2 -5
  5. static/index.html +78 -42
  6. static/public.js +191 -99
  7. static/style.css +345 -551
.env.example CHANGED
@@ -1,11 +1,8 @@
- PASSWORD=change-me
- SESSION_SECRET=change-me-too
- PASS_APIKEY=change-me-api-key
- NVIDIA_API_BASE=https://integrate.api.nvidia.com/v1
- NVIDIA_NIM_API_KEY=
- HEALTHCHECK_INTERVAL_MINUTES=60
- HEALTHCHECK_PROMPT=请只回复 OK。
- PUBLIC_HISTORY_HOURS=48
- MAX_UPSTREAM_CONNECTIONS=256
- MAX_KEEPALIVE_CONNECTIONS=64
+ NVIDIA_API_BASE=https://integrate.api.nvidia.com/v1
+ MODEL_LIST=z-ai/glm5,z-ai/glm4.7,minimaxai/minimax-m2.5,minimaxai/minimax-m2.7,moonshotai/kimi-k2.5,deepseek-ai/deepseek-v3.2,google/gemma-4-31b-it,qwen/qwen3.5-397b-a17b
+ MODEL_SYNC_INTERVAL_MINUTES=30
+ PUBLIC_HISTORY_BUCKETS=6
+ REQUEST_TIMEOUT_SECONDS=90
+ MAX_UPSTREAM_CONNECTIONS=512
+ MAX_KEEPALIVE_CONNECTIONS=128
  DATABASE_PATH=./data.sqlite3
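The new values are plain environment variables. As a minimal sketch (variable names taken from `.env.example`; the exact parsing in `app/main.py` may differ), an app can read them like this:

```python
import os

# Comma-separated model list -> Python list (empty when the variable is unset).
MODEL_LIST = [m.strip() for m in os.getenv("MODEL_LIST", "").split(",") if m.strip()]

# Numeric settings, falling back to the documented defaults.
MODEL_SYNC_INTERVAL_MINUTES = int(os.getenv("MODEL_SYNC_INTERVAL_MINUTES", "30"))
REQUEST_TIMEOUT_SECONDS = float(os.getenv("REQUEST_TIMEOUT_SECONDS", "90"))
MAX_UPSTREAM_CONNECTIONS = int(os.getenv("MAX_UPSTREAM_CONNECTIONS", "512"))
```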
README.md CHANGED
@@ -1,136 +1,93 @@
- ---
- title: NVIDIA NIM Responses Gateway
- sdk: docker
- app_port: 7860
- pinned: false
- ---
-
- # NVIDIA NIM Responses Gateway
-
- This is a FastAPI-based compatibility layer that converts the official NVIDIA endpoint:
-
- `https://integrate.api.nvidia.com/v1/chat/completions`
-
- into an OpenAI-style `/v1/responses` endpoint, with a public health dashboard and a Chinese-language admin console.
-
- ## Supported features
-
- - `POST /v1/responses`
- - `GET /v1/models`
- - `GET /v1/responses/{response_id}`
- - tool calling / function calling conversion
- - `function_call_output` feed-back conversion
- - `previous_response_id` conversation continuation
- - `PASS_APIKEY` auth protecting `/v1/responses`
- - round-robin dispatch across multiple NVIDIA NIM Keys
- - shared HTTP connection pool for high-concurrency forwarding
- - model management
- - NVIDIA NIM Key management
- - one-click testing of all models from the admin console
- - hourly health checks with a public status page
- - Docker deployment to a Hugging Face Space
-
- ## Preset models
-
- The following models are seeded automatically on first startup:
-
- - `z-ai/glm5`
- - `minimaxai/minimax-m2.5`
- - `moonshotai/kimi-k2.5`
- - `deepseek-ai/deepseek-v3.2`
- - `google/gemma-4-31b-it`
- - `qwen/qwen3.5-397b-a17b`
-
- You can also add, delete, and test models from the admin console.
-
- ## Pages and endpoints
-
- Public pages:
-
- - `GET /` model health dashboard
- - `GET /api/health/public` public health data
-
- Compatibility endpoints:
-
- - `POST /v1/responses`
- - `GET /v1/models`
- - `GET /v1/responses/{response_id}`
-
- Admin pages:
-
- - `GET /admin`
- - `POST /admin/api/login`
- - `GET /admin/api/overview`
- - `GET/POST/DELETE /admin/api/models...`
- - `GET/POST/DELETE /admin/api/keys...`
- - `GET /admin/api/healthchecks`
- - `POST /admin/api/healthchecks/run`
- - `GET/PUT /admin/api/settings`
-
- ## Environment variables
-
- - `PASSWORD`: admin login password, required
- - `SESSION_SECRET`: session signing secret, optional; falls back to `PASSWORD`
- - `PASS_APIKEY`: auth key for external calls to `/v1/responses`; supports `Authorization: Bearer ...` and `X-API-Key`
- - `NVIDIA_API_BASE`: defaults to `https://integrate.api.nvidia.com/v1`
- - `NVIDIA_NIM_API_KEY`: optional; imported as the default Key on first startup
- - `HEALTHCHECK_INTERVAL_MINUTES`: defaults to `60`
- - `HEALTHCHECK_PROMPT`: defaults to `请只回复 OK`
- - `PUBLIC_HISTORY_HOURS`: defaults to `48`
- - `MAX_UPSTREAM_CONNECTIONS`: defaults to `256`
- - `MAX_KEEPALIVE_CONNECTIONS`: defaults to `64`
- - `DATABASE_PATH`: defaults to `./data.sqlite3`
-
- See `.env.example` for a sample configuration.
-
- ## Running locally
-
- Install the runtime dependencies:
-
- ```bash
- pip install -r requirements.txt
- ```
-
- For local integration testing and smoke tests:
-
- ```bash
- pip install -r requirements-dev.txt
- python scripts/local_smoke_test.py
- ```
-
- Start the service:
-
- ```bash
- uvicorn app.main:app --host 0.0.0.0 --port 7860
- ```
-
- ## Deploying to a Hugging Face Space
-
- This repository is already prepared for deployment as a Docker Space.
-
- 1. Create a new Hugging Face Space with the `Docker` SDK
- 2. Upload the contents of the `hf_space` directory as the Space root
- 3. Configure at least `PASSWORD`, `PASS_APIKEY`, and one NVIDIA NIM Key in the Space Secrets
- 4. Open `/admin`, confirm the Key works, and run one health check
-
- ## Local verification status
-
- I have verified the following paths with a local smoke test:
-
- - the Chinese home page and admin pages render correctly
- - HTML response headers include `charset=utf-8`
- - `/v1/responses` authentication works
- - `/v1/responses` text reply conversion works
- - tool call / function call conversion works
- - `function_call_output` feed-back into the upstream message format works
- - `previous_response_id` context stitching works
- - round-robin dispatch across multiple NIM Keys works
- - concurrent request forwarding works
- - admin login, manual health checks, and the public health page stay in sync
-
- ## References
-
- - OpenAI Responses API: https://platform.openai.com/docs/guides/responses-vs-chat-completions
- - OpenAI Function Calling: https://platform.openai.com/docs/guides/function-calling
- - NVIDIA Build: https://build.nvidia.com/
+ ---
+ title: NVIDIA NIM Responses Gateway
+ sdk: docker
+ app_port: 7860
+ pinned: false
+ ---
+
+ # NVIDIA NIM Responses Gateway
+
+ This is a publicly usable NVIDIA NIM to OpenAI `/v1/responses` compatibility gateway.
+
+ It does not store any user's NIM API Key locally. Callers supply their own NIM Key in a request header; the gateway only handles protocol conversion, performance optimization, aggregate statistics, and display of the official model catalog.
+
+ ## Main features
+
+ - converts NVIDIA's official `POST /v1/chat/completions` into an OpenAI-style `POST /v1/responses`
+ - supports tool calling / function calling
+ - supports `function_call_output` feed-back
+ - supports `previous_response_id` conversation continuation
+ - `/v1/responses` and `/v1/responses/{response_id}` authenticate and forward upstream with the user's own NIM Key
+ - `GET /v1/models` returns the synced result of NVIDIA's official `/v1/models`, keeping the OpenAI-style structure
+ - the first front-end page shows total call count, average health, and each model's 10-minute success rate
+ - the second front-end page shows the official model catalog, grouped by provider
+ - the UI uses horizontal two-page switching with smooth animations and a modern card design
+ - a shared HTTP connection pool, SQLite WAL, and threaded async persistence improve forwarding performance under high concurrency
+
+ ## How to call the gateway
+
+ For `POST /v1/responses`, pass your own NVIDIA NIM Key in either of the following ways:
+
+ - `Authorization: Bearer <your NIM Key>`
+ - `X-API-Key: <your NIM Key>`
+
+ The gateway never persists the raw Key to the database; it is only held in memory for the current request, and a hash of the Key is used to isolate the response chain.
+
+ ## Official model catalog sync
+
+ The project periodically pulls the model list from the official endpoint:
+
+ `https://integrate.api.nvidia.com/v1/models`
+
+ The synced catalog is used for both:
+
+ - `GET /v1/models`
+ - the "official model library" display on the second front-end page
+
+ ## Public pages
+
+ - `GET /`: home page, two-page display
+ - `GET /api/dashboard`: health and statistics data
+ - `GET /api/catalog`: official model catalog with provider grouping
+
+ ## Compatibility endpoints
+
+ - `POST /v1/responses`
+ - `GET /v1/responses/{response_id}`
+ - `GET /v1/models`
+
+ ## Environment variables
+
+ - `NVIDIA_API_BASE`: defaults to `https://integrate.api.nvidia.com/v1`
+ - `MODEL_LIST`: comma-separated list of models monitored on the home page
+ - `MODEL_SYNC_INTERVAL_MINUTES`: official model catalog sync interval, defaults to `30`
+ - `PUBLIC_HISTORY_BUCKETS`: how many recent 10-minute buckets the home page shows, defaults to `6`
+ - `REQUEST_TIMEOUT_SECONDS`: upstream request timeout, defaults to `90`
+ - `MAX_UPSTREAM_CONNECTIONS`: maximum connections in the shared pool, defaults to `512`
+ - `MAX_KEEPALIVE_CONNECTIONS`: maximum keep-alive connections in the shared pool, defaults to `128`
+ - `DATABASE_PATH`: defaults to `./data.sqlite3`
+
+ ## Local verification
+
+ I have completed two layers of local integration testing:
+
+ 1. Mock testing:
+    - [scripts/local_smoke_test.py](scripts/local_smoke_test.py) verified protocol conversion, official model sync, user Key authentication, `previous_response_id`, tool calls, and the front-end data endpoints.
+
+ 2. Live upstream testing:
+    - [scripts/live_e2e_validation.py](scripts/live_e2e_validation.py) used the test NIM Key you provided to call the official NVIDIA model catalog and a real model.
+    - Measured result: `live_gateway_ok`, with `z-ai/glm5` successfully returning `OK`.
+
+ ## Deploying to a Hugging Face Space
+
+ 1. Create a new Hugging Face Space with the `Docker` SDK
+ 2. Upload the contents of the `hf_space` directory as the Space root
+ 3. Configure `MODEL_LIST` and other environment variables as needed
+ 4. Once started, it is ready for public use
+
+ ## References
+
+ - OpenAI Responses API: https://platform.openai.com/docs/guides/responses-vs-chat-completions
+ - OpenAI Function Calling: https://platform.openai.com/docs/guides/function-calling
+ - NVIDIA Build: https://build.nvidia.com/
  - NVIDIA NIM API documentation: https://docs.api.nvidia.com/
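As a usage sketch of the README's header-based auth, a request to the gateway can be built like this (the gateway URL assumes a local run on port 7860; the key and prompt are placeholders):

```python
import json
import urllib.request

GATEWAY = "http://localhost:7860"  # assumed local deployment
NIM_KEY = "nvapi-..."              # placeholder for your own NVIDIA NIM Key

# OpenAI Responses-style request body forwarded to the upstream chat completion.
payload = {
    "model": "z-ai/glm5",
    "input": [{"role": "user", "content": "请只回复 OK。"}],
}

req = urllib.request.Request(
    f"{GATEWAY}/v1/responses",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        # X-API-Key would work equally well; the gateway accepts both headers.
        "Authorization": f"Bearer {NIM_KEY}",
        "Content-Type": "application/json",
    },
    method="POST",
)
# resp = urllib.request.urlopen(req)  # uncomment against a running gateway
```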
app/main.py CHANGED
@@ -1,6 +1,7 @@
- from __future__ import annotations

  import asyncio
  import json
  import os
  import sqlite3
@@ -12,11 +13,10 @@ from pathlib import Path
  from typing import Any

  import httpx
- from apscheduler.schedulers.asyncio import AsyncIOScheduler
- from fastapi import Depends, FastAPI, Header, HTTPException, Request, Response, status
- from fastapi.responses import HTMLResponse, JSONResponse, StreamingResponse
  from fastapi.staticfiles import StaticFiles
- from itsdangerous import BadSignature, SignatureExpired, URLSafeTimedSerializer


  BASE_DIR = Path(__file__).resolve().parent.parent
@@ -26,31 +26,20 @@ RAW_NVIDIA_API_BASE = os.getenv("NVIDIA_API_BASE", os.getenv("NIM_BASE_URL", "ht
  NVIDIA_API_BASE = RAW_NVIDIA_API_BASE if RAW_NVIDIA_API_BASE.endswith("/v1") else f"{RAW_NVIDIA_API_BASE}/v1"
  CHAT_COMPLETIONS_URL = f"{NVIDIA_API_BASE}/chat/completions"
  MODELS_URL = f"{NVIDIA_API_BASE}/models"
- ADMIN_PASSWORD = os.getenv("PASSWORD")
- SESSION_SECRET = os.getenv("SESSION_SECRET") or ADMIN_PASSWORD or "nim-responses-dev-secret"
- COOKIE_NAME = os.getenv("COOKIE_NAME", "nim_admin_session")
- PASS_API_KEY = os.getenv("PASS_APIKEY") or os.getenv("GATEWAY_API_KEY")
- DEFAULT_ENV_KEY = os.getenv("NVIDIA_NIM_API_KEY") or os.getenv("NVIDIA_API_KEY")
  REQUEST_TIMEOUT_SECONDS = float(os.getenv("REQUEST_TIMEOUT_SECONDS", "90"))
- DEFAULT_HEALTH_INTERVAL_MINUTES = int(os.getenv("HEALTHCHECK_INTERVAL_MINUTES", "60"))
- DEFAULT_HEALTH_PROMPT = os.getenv("HEALTHCHECK_PROMPT", "请只回复 OK。")
- PUBLIC_HISTORY_HOURS = int(os.getenv("PUBLIC_HISTORY_HOURS", "48"))
- MAX_UPSTREAM_CONNECTIONS = int(os.getenv("MAX_UPSTREAM_CONNECTIONS", "256"))
- MAX_KEEPALIVE_CONNECTIONS = int(os.getenv("MAX_KEEPALIVE_CONNECTIONS", "64"))
-
- DEFAULT_MODELS = [
-     ("z-ai/glm5", "GLM-5", "Reasoning and general assistant model from Z.ai", 10, 1),
-     ("minimaxai/minimax-m2.5", "MiniMax M2.5", "Long-context assistant model from MiniMax", 20, 1),
-     ("moonshotai/kimi-k2.5", "Kimi K2.5", "Kimi family model tuned for tool use and code", 30, 1),
-     ("deepseek-ai/deepseek-v3.2", "DeepSeek V3.2", "DeepSeek production general-purpose model", 40, 1),
-     ("google/gemma-4-31b-it", "Gemma 4 31B IT", "Instruction-tuned Gemma model", 50, 0),
-     ("qwen/qwen3.5-397b-a17b", "Qwen 3.5 397B A17B", "Large-scale Qwen model with broad capabilities", 60, 0),
- ]
-
- scheduler = AsyncIOScheduler(timezone="UTC")
  http_client: httpx.AsyncClient | None = None
- api_key_selection_lock: asyncio.Lock | None = None
- api_key_rr_index = 0


  def utcnow() -> datetime:
@@ -61,48 +50,38 @@ def utcnow_iso() -> str:
      return utcnow().isoformat()


- def parse_datetime(value: str | None) -> datetime | None:
-     if not value:
-         return None
-     try:
-         return datetime.fromisoformat(value)
-     except ValueError:
-         return None


- def bool_value(value: Any) -> bool:
-     if isinstance(value, bool):
-         return value
-     if isinstance(value, (int, float)):
-         return bool(value)
-     if value is None:
-         return False
-     return str(value).strip().lower() in {"1", "true", "yes", "on", "enabled"}


- def json_dumps(value: Any) -> str:
-     return json.dumps(value, ensure_ascii=False)


- async def get_http_client() -> httpx.AsyncClient:
-     global http_client
-     if http_client is None or http_client.is_closed:
-         limits = httpx.Limits(
-             max_connections=MAX_UPSTREAM_CONNECTIONS,
-             max_keepalive_connections=MAX_KEEPALIVE_CONNECTIONS,
-         )
-         http_client = httpx.AsyncClient(timeout=REQUEST_TIMEOUT_SECONDS, limits=limits)
-     return http_client


- async def get_api_key_selection_lock() -> asyncio.Lock:
-     global api_key_selection_lock
-     if api_key_selection_lock is None:
-         api_key_selection_lock = asyncio.Lock()
-     return api_key_selection_lock


  def get_db_connection() -> sqlite3.Connection:
      conn = sqlite3.connect(DB_PATH, check_same_thread=False, timeout=30.0)
      conn.row_factory = sqlite3.Row
      conn.execute("PRAGMA journal_mode=WAL")
@@ -113,289 +92,198 @@ def get_db_connection() -> sqlite3.Connection:


  def init_db() -> None:
-     DB_PATH.parent.mkdir(parents=True, exist_ok=True)
      conn = get_db_connection()
      try:
          conn.executescript(
              """
-             CREATE TABLE IF NOT EXISTS proxy_models (
-                 id INTEGER PRIMARY KEY AUTOINCREMENT,
-                 model_id TEXT UNIQUE NOT NULL,
-                 display_name TEXT NOT NULL,
-                 provider TEXT NOT NULL DEFAULT 'nvidia-nim',
-                 description TEXT,
-                 enabled INTEGER NOT NULL DEFAULT 1,
-                 featured INTEGER NOT NULL DEFAULT 0,
-                 sort_order INTEGER NOT NULL DEFAULT 0,
-                 request_count INTEGER NOT NULL DEFAULT 0,
-                 success_count INTEGER NOT NULL DEFAULT 0,
-                 failure_count INTEGER NOT NULL DEFAULT 0,
-                 healthcheck_count INTEGER NOT NULL DEFAULT 0,
-                 healthcheck_success_count INTEGER NOT NULL DEFAULT 0,
-                 last_used_at TEXT,
-                 last_healthcheck_at TEXT,
-                 last_health_status INTEGER,
-                 last_latency_ms REAL,
-                 created_at TEXT NOT NULL,
-                 updated_at TEXT NOT NULL
-             );
-
-             CREATE TABLE IF NOT EXISTS api_keys (
-                 id INTEGER PRIMARY KEY AUTOINCREMENT,
-                 name TEXT UNIQUE NOT NULL,
-                 api_key TEXT NOT NULL,
-                 enabled INTEGER NOT NULL DEFAULT 1,
-                 request_count INTEGER NOT NULL DEFAULT 0,
-                 success_count INTEGER NOT NULL DEFAULT 0,
-                 failure_count INTEGER NOT NULL DEFAULT 0,
-                 healthcheck_count INTEGER NOT NULL DEFAULT 0,
-                 healthcheck_success_count INTEGER NOT NULL DEFAULT 0,
-                 last_used_at TEXT,
-                 last_tested_at TEXT,
-                 last_latency_ms REAL,
-                 created_at TEXT NOT NULL,
-                 updated_at TEXT NOT NULL
-             );
-
              CREATE TABLE IF NOT EXISTS response_records (
-                 id INTEGER PRIMARY KEY AUTOINCREMENT,
-                 response_id TEXT UNIQUE NOT NULL,
                  parent_response_id TEXT,
-                 model_id INTEGER,
-                 api_key_id INTEGER,
                  request_json TEXT NOT NULL,
                  input_items_json TEXT NOT NULL,
                  output_json TEXT NOT NULL,
                  output_items_json TEXT NOT NULL,
                  status TEXT NOT NULL,
                  created_at TEXT NOT NULL
              );

-             CREATE TABLE IF NOT EXISTS health_check_records (
-                 id INTEGER PRIMARY KEY AUTOINCREMENT,
-                 model_id INTEGER NOT NULL,
-                 api_key_id INTEGER,
-                 ok INTEGER NOT NULL,
-                 status_code INTEGER,
-                 latency_ms REAL,
-                 error_message TEXT,
-                 response_excerpt TEXT,
-                 checked_at TEXT NOT NULL
              );

-             CREATE TABLE IF NOT EXISTS settings (
-                 key TEXT PRIMARY KEY,
-                 value TEXT NOT NULL
              );
              """
          )
-
-         now = utcnow_iso()
-         for model_id, display_name, description, sort_order, featured in DEFAULT_MODELS:
-             conn.execute(
-                 """
-                 INSERT OR IGNORE INTO proxy_models (
-                     model_id, display_name, provider, description, enabled, featured, sort_order, created_at, updated_at
-                 ) VALUES (?, ?, 'nvidia-nim', ?, 1, ?, ?, ?, ?)
-                 """,
-                 (model_id, display_name, description, featured, sort_order, now, now),
-             )
-
-         defaults = {
-             "healthcheck_enabled": "true",
-             "healthcheck_interval_minutes": str(DEFAULT_HEALTH_INTERVAL_MINUTES),
-             "healthcheck_prompt": DEFAULT_HEALTH_PROMPT,
-             "public_history_hours": str(PUBLIC_HISTORY_HOURS),
-         }
-         for key, value in defaults.items():
-             conn.execute("INSERT OR IGNORE INTO settings (key, value) VALUES (?, ?)", (key, value))
-
-         if DEFAULT_ENV_KEY:
-             conn.execute(
-                 """
-                 INSERT OR IGNORE INTO api_keys (name, api_key, enabled, created_at, updated_at)
-                 VALUES ('env-default', ?, 1, ?, ?)
-                 """,
-                 (DEFAULT_ENV_KEY, now, now),
-             )
-
          conn.commit()
      finally:
          conn.close()


- def get_setting(conn: sqlite3.Connection, key: str, default: str) -> str:
-     row = conn.execute("SELECT value FROM settings WHERE key = ?", (key,)).fetchone()
-     return row["value"] if row else default


- def set_setting(conn: sqlite3.Connection, key: str, value: str) -> None:
-     conn.execute(
-         """
-         INSERT INTO settings (key, value) VALUES (?, ?)
-         ON CONFLICT(key) DO UPDATE SET value = excluded.value
-         """,
-         (key, value),
-     )


- def get_settings_payload(conn: sqlite3.Connection) -> dict[str, Any]:
-     return {
-         "healthcheck_enabled": bool_value(get_setting(conn, "healthcheck_enabled", "true")),
-         "healthcheck_interval_minutes": int(get_setting(conn, "healthcheck_interval_minutes", str(DEFAULT_HEALTH_INTERVAL_MINUTES))),
-         "healthcheck_prompt": get_setting(conn, "healthcheck_prompt", DEFAULT_HEALTH_PROMPT),
-         "public_history_hours": int(get_setting(conn, "public_history_hours", str(PUBLIC_HISTORY_HOURS))),
-     }


- def mask_secret(secret: str) -> str:
-     if len(secret) <= 8:
-         return f"{secret[:2]}***"
-     return f"{secret[:4]}...{secret[-4:]}"


- def create_admin_token() -> str:
-     serializer = URLSafeTimedSerializer(SESSION_SECRET, salt="nim-admin-auth")
-     return serializer.dumps({"role": "admin"})


- def verify_admin_token(token: str) -> bool:
-     serializer = URLSafeTimedSerializer(SESSION_SECRET, salt="nim-admin-auth")
      try:
-         payload = serializer.loads(token, max_age=60 * 60 * 24 * 7)
-     except (BadSignature, SignatureExpired):
-         return False
-     return payload.get("role") == "admin"


- def require_admin(request: Request, authorization: str | None = Header(default=None)) -> bool:
-     token: str | None = None
-     if authorization and authorization.startswith("Bearer "):
-         token = authorization.removeprefix("Bearer ").strip()
-     if not token:
-         token = request.cookies.get(COOKIE_NAME)
-     if not token or not verify_admin_token(token):
-         raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED, detail="需要管理员登录。")
-     return True


- def require_proxy_token_if_configured(authorization: str | None = Header(default=None), x_api_key: str | None = Header(default=None)) -> bool:
-     if not PASS_API_KEY:
-         return True
      token: str | None = None
      if authorization and authorization.startswith("Bearer "):
          token = authorization.removeprefix("Bearer ").strip()
      elif x_api_key:
          token = x_api_key.strip()
      if not token:
-         raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED, detail="缺少 API 鉴权信息。")
-     if token != PASS_API_KEY:
-         raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED, detail="API 鉴权失败。")
-     return True


- def fetch_model_by_identifier(conn: sqlite3.Connection, identifier: str | int, enabled_only: bool = False) -> sqlite3.Row | None:
-     clause = "AND enabled = 1" if enabled_only else ""
-     if isinstance(identifier, int) or (isinstance(identifier, str) and identifier.isdigit()):
-         row = conn.execute(f"SELECT * FROM proxy_models WHERE id = ? {clause}", (int(identifier),)).fetchone()
-         if row:
-             return row
-     return conn.execute(f"SELECT * FROM proxy_models WHERE model_id = ? {clause}", (str(identifier),)).fetchone()


- def fetch_key_by_identifier(conn: sqlite3.Connection, identifier: str | int, enabled_only: bool = False) -> sqlite3.Row | None:
-     clause = "AND enabled = 1" if enabled_only else ""
-     if isinstance(identifier, int) or (isinstance(identifier, str) and str(identifier).isdigit()):
-         row = conn.execute(f"SELECT * FROM api_keys WHERE id = ? {clause}", (int(identifier),)).fetchone()
-         if row:
-             return row
-     return conn.execute(f"SELECT * FROM api_keys WHERE name = ? {clause}", (str(identifier),)).fetchone()


- async def select_api_key(conn: sqlite3.Connection, explicit_id: int | None = None) -> sqlite3.Row:
-     if explicit_id is not None:
-         row = fetch_key_by_identifier(conn, explicit_id, enabled_only=True)
-         if row:
-             return row
-
-     key_rows = conn.execute(
-         """
-         SELECT * FROM api_keys
-         WHERE enabled = 1
-         ORDER BY id ASC
-         """
-     ).fetchall()
-     if not key_rows:
-         raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="没有可用的 NVIDIA NIM Key。")
-
-     global api_key_rr_index
-     lock = await get_api_key_selection_lock()
-     async with lock:
-         selected = key_rows[api_key_rr_index % len(key_rows)]
-         api_key_rr_index = (api_key_rr_index + 1) % len(key_rows)
-         return selected


- def row_to_model_item(row: sqlite3.Row) -> dict[str, Any]:
-     status_name = "unknown"
-     if row["last_health_status"] is not None:
-         status_name = "healthy" if bool(row["last_health_status"]) else "down"
-     return {
-         "id": row["id"],
-         "model_id": row["model_id"],
-         "name": row["model_id"],
-         "display_name": row["display_name"],
-         "endpoint": "/v1/responses",
-         "provider": row["provider"],
-         "description": row["description"],
-         "enabled": bool(row["enabled"]),
-         "featured": bool(row["featured"]),
-         "sort_order": row["sort_order"],
-         "status": status_name,
-         "request_count": row["request_count"],
-         "success_count": row["success_count"],
-         "failure_count": row["failure_count"],
-         "healthcheck_count": row["healthcheck_count"],
-         "healthcheck_success_count": row["healthcheck_success_count"],
-         "last_used_at": row["last_used_at"],
-         "last_healthcheck_at": row["last_healthcheck_at"],
-         "last_health_status": None if row["last_health_status"] is None else bool(row["last_health_status"]),
-         "last_latency_ms": row["last_latency_ms"],
-         "created_at": row["created_at"],
-         "updated_at": row["updated_at"],
-     }


- def row_to_key_item(row: sqlite3.Row) -> dict[str, Any]:
-     total_checks = row["healthcheck_count"] or 0
-     ok_checks = row["healthcheck_success_count"] or 0
-     success_ratio = (ok_checks / total_checks) if total_checks else None
-     status_name = "healthy" if success_ratio and success_ratio >= 0.8 else "unknown"
-     return {
-         "id": row["id"],
-         "name": row["name"],
-         "label": row["name"],
-         "masked_key": mask_secret(row["api_key"]),
-         "enabled": bool(row["enabled"]),
-         "status": status_name,
-         "request_count": row["request_count"],
-         "success_count": row["success_count"],
-         "failure_count": row["failure_count"],
-         "healthcheck_count": row["healthcheck_count"],
-         "healthcheck_success_count": row["healthcheck_success_count"],
-         "last_used_at": row["last_used_at"],
-         "last_tested": row["last_tested_at"],
-         "last_tested_at": row["last_tested_at"],
-         "last_latency_ms": row["last_latency_ms"],
-         "created_at": row["created_at"],
-         "updated_at": row["updated_at"],
-     }


- def make_error(status_code: int, message: str, error_type: str = "invalid_request_error") -> JSONResponse:
-     return JSONResponse(
-         status_code=status_code,
-         content={"error": {"message": message, "type": error_type, "code": status_code}},
-     )

  def normalize_content(content: Any, role: str) -> list[dict[str, Any]]:
      if content is None:
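The removed `select_api_key` cycled through enabled keys under an `asyncio.Lock` so concurrent requests could not race on the shared index. A self-contained sketch of that lock-guarded round-robin pattern (plain strings stand in for the SQLite key rows):

```python
import asyncio

class RoundRobinSelector:
    """Minimal sketch of the lock-guarded round-robin selection
    that the removed select_api_key() implemented."""

    def __init__(self, keys: list[str]) -> None:
        self._keys = keys
        self._index = 0
        self._lock = asyncio.Lock()

    async def pick(self) -> str:
        async with self._lock:  # serialize index updates across tasks
            key = self._keys[self._index % len(self._keys)]
            self._index = (self._index + 1) % len(self._keys)
            return key

async def demo() -> list[str]:
    selector = RoundRobinSelector(["key-a", "key-b", "key-c"])
    return [await selector.pick() for _ in range(4)]

picks = asyncio.run(demo())
# picks == ["key-a", "key-b", "key-c", "key-a"]
```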
@@ -440,7 +328,6 @@ def normalize_input_items(value: Any) -> list[dict[str, Any]]:
          if not isinstance(item, dict):
              items.append({"type": "message", "role": "user", "content": [{"type": "input_text", "text": str(item)}]})
              continue
-
          item_type = item.get("type")
          if item_type == "message" or item.get("role"):
              role = item.get("role", "user")
@@ -456,14 +343,12 @@ def normalize_input_items(value: Any) -> list[dict[str, Any]]:
          arguments = item.get("arguments", "{}")
          if not isinstance(arguments, str):
              arguments = json_dumps(arguments)
-         items.append(
-             {
-                 "type": "function_call",
-                 "call_id": item.get("call_id") or f"call_{uuid.uuid4().hex[:12]}",
-                 "name": item.get("name"),
-                 "arguments": arguments,
-             }
-         )
          continue
      if item_type in {"input_text", "output_text", "text"}:
          items.append({"type": "message", "role": "user", "content": [{"type": "input_text", "text": item.get("text", "")}]})
@@ -492,25 +377,6 @@ def extract_text_from_content(content: Any) -> str:
      return str(content)


- def load_previous_conversation_items(conn: sqlite3.Connection, previous_response_id: str | None) -> list[dict[str, Any]]:
-     if not previous_response_id:
-         return []
-     records: list[sqlite3.Row] = []
-     current = previous_response_id
-     while current:
-         row = conn.execute("SELECT * FROM response_records WHERE response_id = ?", (current,)).fetchone()
-         if not row:
-             raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail=f"previous_response_id '{current}' was not found.")
-         records.append(row)
-         current = row["parent_response_id"]
-
-     items: list[dict[str, Any]] = []
-     for row in reversed(records):
-         items.extend(json.loads(row["input_items_json"]))
-         items.extend(json.loads(row["output_items_json"]))
-     return items


  def items_to_chat_messages(items: list[dict[str, Any]]) -> list[dict[str, Any]]:
      messages: list[dict[str, Any]] = []
      pending_tool_calls: list[dict[str, Any]] = []
@@ -656,7 +522,11 @@ def extract_text_and_tool_calls(message: dict[str, Any]) -> tuple[str, list[dict
          arguments = part.get("arguments") or "{}"
          if not isinstance(arguments, str):
              arguments = json_dumps(arguments)
-         tool_calls.append({"id": part.get("id") or part.get("call_id") or f"call_{uuid.uuid4().hex[:12]}", "name": part.get("name"), "arguments": arguments})

      for tool_call in message.get("tool_calls") or []:
          if not isinstance(tool_call, dict):
@@ -665,7 +535,13 @@ def extract_text_and_tool_calls(message: dict[str, Any]) -> tuple[str, list[dict
          arguments = function_data.get("arguments") or tool_call.get("arguments") or "{}"
          if not isinstance(arguments, str):
              arguments = json_dumps(arguments)
-         tool_calls.append({"id": tool_call.get("id") or f"call_{uuid.uuid4().hex[:12]}", "name": function_data.get("name") or tool_call.get("name"), "arguments": arguments})

      deduped: list[dict[str, Any]] = []
      seen_ids: set[str] = set()
@@ -676,7 +552,6 @@ def extract_text_and_tool_calls(message: dict[str, Any]) -> tuple[str, list[dict
          deduped.append(tool_call)
      return "\n".join(filter(None, text_chunks)).strip(), deduped

-
  def build_choice_alias(output_items: list[dict[str, Any]], finish_reason: str | None) -> list[dict[str, Any]]:
      content_parts: list[dict[str, Any]] = []
      for item in output_items:
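The dedup step in `extract_text_and_tool_calls` keeps only the first tool call seen for each id, since content parts and `tool_calls` can report the same call twice. A minimal standalone version of that pattern (the ids are hypothetical):

```python
tool_calls = [
    {"id": "call_1", "name": "get_weather"},
    {"id": "call_2", "name": "get_time"},
    {"id": "call_1", "name": "get_weather"},  # duplicate id from merged sources
]

# Keep the first occurrence of each id, preserving order.
deduped: list[dict] = []
seen_ids: set[str] = set()
for tool_call in tool_calls:
    if tool_call["id"] in seen_ids:
        continue
    seen_ids.add(tool_call["id"])
    deduped.append(tool_call)
# deduped keeps call_1 and call_2 once each
```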
@@ -699,9 +574,22 @@ def chat_completion_to_response(body: dict[str, Any], upstream_json: dict[str, A
      response_id = upstream_json.get("id") or f"resp_{uuid.uuid4().hex}"
      output_items: list[dict[str, Any]] = []
      if assistant_text:
-         output_items.append({"id": f"msg_{uuid.uuid4().hex[:24]}", "type": "message", "status": "completed", "role": "assistant", "content": [{"type": "output_text", "text": assistant_text, "annotations": []}]})
      for tool_call in tool_calls:
-         output_items.append({"id": f"fc_{uuid.uuid4().hex[:24]}", "type": "function_call", "status": "completed", "call_id": tool_call["id"], "name": tool_call.get("name"), "arguments": tool_call.get("arguments", "{}")})
      usage = upstream_json.get("usage") or {}
      return {
          "id": response_id,
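The usage mapping in this function renames Chat Completions token counters to Responses-style names; as a standalone sketch with example numbers:

```python
# Chat Completions usage block, as returned by the upstream API.
usage = {"prompt_tokens": 12, "completion_tokens": 5, "total_tokens": 17}

# Responses-style usage block the gateway emits instead.
responses_usage = {
    "input_tokens": usage.get("prompt_tokens"),
    "output_tokens": usage.get("completion_tokens"),
    "total_tokens": usage.get("total_tokens"),
}
# responses_usage == {"input_tokens": 12, "output_tokens": 5, "total_tokens": 17}
```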
@@ -715,680 +603,366 @@ def chat_completion_to_response(body: dict[str, Any], upstream_json: dict[str, A
715
  "previous_response_id": previous_response_id,
716
  "store": True,
717
  "text": body.get("text") or {"format": {"type": "text"}},
718
- "usage": {"input_tokens": usage.get("prompt_tokens"), "output_tokens": usage.get("completion_tokens"), "total_tokens": usage.get("total_tokens")},
 
 
 
 
719
  "choices": build_choice_alias(output_items, finish_reason),
720
- "upstream": {"id": upstream_json.get("id"), "object": upstream_json.get("object", "chat.completion"), "finish_reason": finish_reason or "stop"},
 
 
 
 
721
  }
722
 
723
- def store_response_record(conn: sqlite3.Connection, response_payload: dict[str, Any], request_body: dict[str, Any], input_items: list[dict[str, Any]], model_row: sqlite3.Row, api_key_row: sqlite3.Row) -> None:
724
- conn.execute(
725
- """
726
- INSERT OR REPLACE INTO response_records (
727
- response_id, parent_response_id, model_id, api_key_id, request_json,
728
- input_items_json, output_json, output_items_json, status, created_at
729
- ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
730
- """,
731
- (
732
- response_payload["id"],
733
- request_body.get("previous_response_id"),
734
- model_row["id"],
735
- api_key_row["id"],
736
- json_dumps(request_body),
737
- json_dumps(input_items),
738
- json_dumps(response_payload),
739
- json_dumps(response_payload.get("output") or []),
740
- response_payload.get("status", "completed"),
741
- utcnow_iso(),
742
- ),
743
- )
744
-
745
 
746
- def update_usage_stats(conn: sqlite3.Connection, model_row: sqlite3.Row, api_key_row: sqlite3.Row, *, ok: bool, latency_ms: float | None, is_healthcheck: bool) -> None:
747
- now = utcnow_iso()
748
- if is_healthcheck:
 
 
749
  conn.execute(
750
  """
751
- UPDATE proxy_models
752
- SET healthcheck_count = healthcheck_count + 1,
753
- healthcheck_success_count = healthcheck_success_count + ?,
754
- last_healthcheck_at = ?,
755
- last_health_status = ?,
756
- last_latency_ms = ?,
757
- updated_at = ?
758
- WHERE id = ?
759
  """,
760
- (1 if ok else 0, now, 1 if ok else 0, latency_ms, now, model_row["id"]),
 
 
 
 
 
 
 
 
 
 
 
 
 
 
761
  )
762
  conn.execute(
763
  """
764
- UPDATE api_keys
765
- SET healthcheck_count = healthcheck_count + 1,
766
- healthcheck_success_count = healthcheck_success_count + ?,
767
- last_tested_at = ?,
768
- last_latency_ms = ?,
 
 
 
 
 
 
 
 
 
 
  updated_at = ?
- WHERE id = ?
  """,
- (1 if ok else 0, now, latency_ms, now, api_key_row["id"]),
  )
- return
- conn.execute(
- """
- UPDATE proxy_models
- SET request_count = request_count + 1,
- success_count = success_count + ?,
- failure_count = failure_count + ?,
- last_used_at = ?,
- last_latency_ms = ?,
- updated_at = ?
- WHERE id = ?
- """,
- (1 if ok else 0, 0 if ok else 1, now, latency_ms, now, model_row["id"]),
- )
- conn.execute(
- """
- UPDATE api_keys
- SET request_count = request_count + 1,
- success_count = success_count + ?,
- failure_count = failure_count + ?,
- last_used_at = ?,
- last_latency_ms = ?,
- updated_at = ?
- WHERE id = ?
- """,
- (1 if ok else 0, 0 if ok else 1, now, latency_ms, now, api_key_row["id"]),
- )
-
-
- def insert_health_record(conn: sqlite3.Connection, model_row: sqlite3.Row, api_key_row: sqlite3.Row, *, ok: bool, status_code: int | None, latency_ms: float | None, error_message: str | None, response_excerpt: str | None) -> None:
- conn.execute(
- """
- INSERT INTO health_check_records (
- model_id, api_key_id, ok, status_code, latency_ms, error_message, response_excerpt, checked_at
- ) VALUES (?, ?, ?, ?, ?, ?, ?, ?)
- """,
- (model_row["id"], api_key_row["id"], 1 if ok else 0, status_code, latency_ms, error_message, response_excerpt, utcnow_iso()),
- )
-
-
- async def post_nvidia_chat_completion(api_key: str, payload: dict[str, Any]) -> tuple[dict[str, Any], float]:
- started = time.perf_counter()
- client = await get_http_client()
- response = await client.post(
- CHAT_COMPLETIONS_URL,
- headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"},
- json=payload,
- )
- latency_ms = round((time.perf_counter() - started) * 1000, 2)
- if response.status_code >= 400:
- try:
- error_payload = response.json()
- detail = error_payload.get("error", {}).get("message") or json_dumps(error_payload)
- except Exception:
- detail = response.text
- raise HTTPException(status_code=response.status_code, detail=f"NVIDIA NIM 请求失败:{detail}")
- return response.json(), latency_ms
-
-
- async def perform_healthcheck(conn: sqlite3.Connection, model_row: sqlite3.Row, api_key_row: sqlite3.Row, prompt: str) -> dict[str, Any]:
- payload = {"model": model_row["model_id"], "messages": [{"role": "user", "content": prompt}], "max_tokens": 32, "temperature": 0}
- try:
- upstream_json, latency_ms = await post_nvidia_chat_completion(api_key_row["api_key"], payload)
- message, _finish_reason = extract_upstream_message(upstream_json)
- assistant_text, _tool_calls = extract_text_and_tool_calls(message)
- ok = True
- detail = assistant_text or "模型响应正常。"
- status_code = 200
- error_message = None
- response_excerpt = detail[:200]
- except HTTPException as exc:
- ok = False
- latency_ms = None
- detail = exc.detail
- status_code = exc.status_code
- error_message = exc.detail
- response_excerpt = None
- update_usage_stats(conn, model_row, api_key_row, ok=ok, latency_ms=latency_ms, is_healthcheck=True)
- insert_health_record(conn, model_row, api_key_row, ok=ok, status_code=status_code, latency_ms=latency_ms, error_message=error_message, response_excerpt=response_excerpt)
- conn.commit()
- return {"model": model_row["model_id"], "display_name": model_row["display_name"], "api_key": api_key_row["name"], "status": "healthy" if ok else "down", "ok": ok, "latency": latency_ms, "status_code": status_code, "detail": detail, "checked_at": utcnow_iso()}
-
-
- async def run_healthchecks(model_identifier: str | int | None = None, api_key_identifier: str | int | None = None, prompt: str | None = None) -> list[dict[str, Any]]:
- conn = get_db_connection()
- try:
- settings_payload = get_settings_payload(conn)
- effective_prompt = prompt or settings_payload["healthcheck_prompt"]
- if api_key_identifier is not None:
- api_key_row = fetch_key_by_identifier(conn, api_key_identifier, enabled_only=True)
- if not api_key_row:
- raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="未找到 API Key。")
- key_rows = [api_key_row]
- else:
- key_rows = conn.execute("SELECT * FROM api_keys WHERE enabled = 1 ORDER BY id ASC").fetchall()
- if not key_rows:
- raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="尚未配置可用的 NVIDIA NIM Key。")
- if model_identifier is not None:
- model_row = fetch_model_by_identifier(conn, model_identifier, enabled_only=True)
- if not model_row:
- raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="未找到模型。")
- model_rows = [model_row]
- else:
- model_rows = conn.execute("SELECT * FROM proxy_models WHERE enabled = 1 ORDER BY sort_order ASC, model_id ASC").fetchall()
- results: list[dict[str, Any]] = []
- for index, model_row in enumerate(model_rows):
- api_key_row = key_rows[index % len(key_rows)]
- results.append(await perform_healthcheck(conn, model_row, api_key_row, effective_prompt))
- return results
- finally:
- conn.close()
-
-
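The removed `run_healthchecks` above fans each enabled model out across the enabled keys with `key_rows[index % len(key_rows)]`. A standalone sketch of that pairing (a hypothetical helper named `pair_models_with_keys`, not part of the codebase):

```python
# Sketch of the round-robin pairing used by run_healthchecks above:
# model N is checked with key N modulo the key count, so every enabled
# key shares the healthcheck load evenly.
def pair_models_with_keys(models: list[str], keys: list[str]) -> list[tuple[str, str]]:
    if not keys:
        raise ValueError("at least one enabled key is required")
    return [(model, keys[index % len(keys)]) for index, model in enumerate(models)]

pairs = pair_models_with_keys(["glm5", "kimi-k2.5", "minimax-m2.5"], ["key-a", "key-b"])
# [("glm5", "key-a"), ("kimi-k2.5", "key-b"), ("minimax-m2.5", "key-a")]
```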
- def build_public_health_payload(hours: int | None = None) -> dict[str, Any]:
- conn = get_db_connection()
- try:
- settings_payload = get_settings_payload(conn)
- effective_hours = hours or settings_payload["public_history_hours"]
- since = utcnow() - timedelta(hours=effective_hours)
- models = conn.execute("SELECT * FROM proxy_models WHERE enabled = 1 ORDER BY sort_order ASC, model_id ASC").fetchall()
- result_models: list[dict[str, Any]] = []
- last_updated: str | None = None
- for model in models:
- rows = conn.execute("SELECT * FROM health_check_records WHERE model_id = ? AND checked_at >= ? ORDER BY checked_at ASC", (model["id"], since.isoformat())).fetchall()
- hourly = []
- ok_count = 0
- for row in rows:
- status_name = "healthy" if row["ok"] else "down"
- hourly.append({"time": row["checked_at"], "status": status_name, "latency": row["latency_ms"]})
- ok_count += 1 if row["ok"] else 0
- last_updated = row["checked_at"]
- total = len(rows)
- success_rate = round((ok_count / total) * 100, 1) if total else 0.0
- model_status = "unknown" if model["last_health_status"] is None else ("healthy" if model["last_health_status"] else "down")
- result_models.append({"id": model["id"], "model_id": model["model_id"], "name": model["display_name"], "display_name": model["display_name"], "endpoint": "/v1/responses", "status": model_status, "beat": f"{success_rate}%", "hourly": hourly, "last_health_status": None if model["last_health_status"] is None else bool(model["last_health_status"]), "last_healthcheck_at": model["last_healthcheck_at"], "success_rate": success_rate, "points": [{"hour": entry["time"], "label": parse_datetime(entry["time"]).strftime("%H:%M") if parse_datetime(entry["time"]) else entry["time"], "ok": entry["status"] == "healthy", "latency_ms": entry["latency"]} for entry in hourly]})
- return {"generated_at": utcnow_iso(), "last_updated": last_updated, "hours": effective_hours, "models": result_models}
- finally:
- conn.close()
-
-
- def schedule_healthchecks() -> None:
- conn = get_db_connection()
- try:
- settings_payload = get_settings_payload(conn)
- finally:
- conn.close()
- interval = max(5, int(settings_payload["healthcheck_interval_minutes"]))
- enabled = bool(settings_payload["healthcheck_enabled"])
- if scheduler.get_job("nim-hourly-healthcheck"):
- scheduler.remove_job("nim-hourly-healthcheck")
- if enabled:
- scheduler.add_job(run_healthchecks, "interval", minutes=interval, id="nim-hourly-healthcheck", replace_existing=True, next_run_time=utcnow() + timedelta(seconds=10))
-
-
- init_db()
-
-
- @asynccontextmanager
- async def lifespan(_app: FastAPI):
- global http_client, api_key_selection_lock, api_key_rr_index
- init_db()
- api_key_selection_lock = asyncio.Lock()
- api_key_rr_index = 0
- http_client = await get_http_client()
- if not scheduler.running:
- scheduler.start()
- schedule_healthchecks()
- try:
- yield
- finally:
- if scheduler.running:
- scheduler.shutdown(wait=False)
- if http_client is not None and not http_client.is_closed:
- await http_client.aclose()
- http_client = None
- api_key_selection_lock = None
-
-
- app = FastAPI(title="NIM 响应网关", lifespan=lifespan)
- app.mount("/static", StaticFiles(directory=str(STATIC_DIR)), name="static")
-
-
- def render_html(filename: str) -> HTMLResponse:
- content = (STATIC_DIR / filename).read_text(encoding="utf-8")
- return HTMLResponse(content=content, media_type="text/html; charset=utf-8")
-
-
- @app.get("/")
- async def public_dashboard() -> HTMLResponse:
- return render_html("index.html")
-
-
- @app.get("/admin")
- async def admin_dashboard() -> HTMLResponse:
- return render_html("admin.html")
-
-
- @app.get("/api/health/public")
- async def public_health(hours: int | None = None) -> dict[str, Any]:
- return build_public_health_payload(hours)
-
- @app.get("/v1/models")
- async def list_models() -> dict[str, Any]:
- conn = get_db_connection()
- try:
- rows = conn.execute("SELECT * FROM proxy_models WHERE enabled = 1 ORDER BY sort_order ASC, model_id ASC").fetchall()
- data = [{"id": row["model_id"], "object": "model", "created": 0, "owned_by": "nvidia-nim", "display_name": row["display_name"], "status": ("unknown" if row["last_health_status"] is None else ("healthy" if row["last_health_status"] else "down"))} for row in rows]
- return {"object": "list", "data": data, "models": data}
- finally:
- conn.close()
-
-
- @app.get("/v1/responses/{response_id}")
- async def get_response(response_id: str, _: bool = Depends(require_proxy_token_if_configured)):
- conn = get_db_connection()
- try:
- row = conn.execute("SELECT output_json FROM response_records WHERE response_id = ?", (response_id,)).fetchone()
- if not row:
- raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Response not found.")
- return json.loads(row["output_json"])
- finally:
- conn.close()
-
-
- @app.post("/v1/responses")
- async def create_response(request: Request, _: bool = Depends(require_proxy_token_if_configured)):
- body = await request.json()
- if not isinstance(body, dict):
- return make_error(status.HTTP_400_BAD_REQUEST, "请求体必须是 JSON 对象。")
- if not body.get("model"):
- return make_error(status.HTTP_400_BAD_REQUEST, "缺少 model 字段。")
- if body.get("input") is None:
- return make_error(status.HTTP_400_BAD_REQUEST, "缺少 input 字段。")
-
- conn = get_db_connection()
- try:
- model_row = fetch_model_by_identifier(conn, body["model"], enabled_only=True)
- if not model_row:
- return make_error(status.HTTP_404_NOT_FOUND, f"模型 {body['model']} 不存在或未启用。")
- api_key_row = await select_api_key(conn)
- previous_items = load_previous_conversation_items(conn, body.get("previous_response_id"))
- input_items = normalize_input_items(body.get("input"))
- merged_items = previous_items + input_items
- chat_payload = build_chat_payload(body, merged_items)
- try:
- upstream_json, latency_ms = await post_nvidia_chat_completion(api_key_row["api_key"], chat_payload)
- except HTTPException as exc:
- update_usage_stats(conn, model_row, api_key_row, ok=False, latency_ms=None, is_healthcheck=False)
- conn.commit()
- raise exc
- response_payload = chat_completion_to_response(body, upstream_json, body.get("previous_response_id"))
- update_usage_stats(conn, model_row, api_key_row, ok=True, latency_ms=latency_ms, is_healthcheck=False)
- store_response_record(conn, response_payload, body, input_items, model_row, api_key_row)
  conn.commit()
-
- if body.get("stream"):
- async def event_stream() -> Any:
- yield f"event: response.created\ndata: {json_dumps({'type': 'response.created', 'response': {'id': response_payload['id'], 'model': response_payload['model'], 'status': 'in_progress'}})}\n\n"
- for index, item in enumerate(response_payload.get("output") or []):
- yield f"event: response.output_item.added\ndata: {json_dumps({'type': 'response.output_item.added', 'output_index': index, 'item': item})}\n\n"
- if item.get("type") == "message":
- text_value = extract_text_from_content(item.get("content"))
- if text_value:
- yield f"event: response.output_text.delta\ndata: {json_dumps({'type': 'response.output_text.delta', 'output_index': index, 'delta': text_value})}\n\n"
- yield f"event: response.output_text.done\ndata: {json_dumps({'type': 'response.output_text.done', 'output_index': index, 'text': text_value})}\n\n"
- if item.get("type") == "function_call":
- yield f"event: response.function_call_arguments.done\ndata: {json_dumps({'type': 'response.function_call_arguments.done', 'output_index': index, 'arguments': item.get('arguments', '{}'), 'call_id': item.get('call_id')})}\n\n"
- yield f"event: response.output_item.done\ndata: {json_dumps({'type': 'response.output_item.done', 'output_index': index, 'item': item})}\n\n"
- yield f"event: response.completed\ndata: {json_dumps({'type': 'response.completed', 'response': response_payload})}\n\n"
- return StreamingResponse(event_stream(), media_type="text/event-stream")
- return response_payload
- finally:
- conn.close()
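The streaming branch above replays the finished response as `event: response.*` SSE frames. A minimal client-side sketch (a hypothetical `parse_sse` helper, not part of the gateway) that splits such a stream into (event, data) pairs:

```python
import json

# Minimal parser for the "event: ...\ndata: {...}\n\n" frames the
# streaming branch above emits. Frames are separated by blank lines.
def parse_sse(raw: str) -> list[tuple[str, dict]]:
    events = []
    for block in raw.split("\n\n"):
        event_name, data = None, None
        for line in block.splitlines():
            if line.startswith("event: "):
                event_name = line[len("event: "):]
            elif line.startswith("data: "):
                data = json.loads(line[len("data: "):])
        if event_name and data is not None:
            events.append((event_name, data))
    return events
```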
-
- @app.post("/admin/api/login")
- async def admin_login(request: Request, response: Response):
- if not ADMIN_PASSWORD:
- raise HTTPException(status_code=status.HTTP_503_SERVICE_UNAVAILABLE, detail="尚未配置 PASSWORD 环境变量。")
- body = await request.json()
- password = body.get("password") if isinstance(body, dict) else None
- if password != ADMIN_PASSWORD:
- raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED, detail="密码错误。")
- token = create_admin_token()
- response.set_cookie(COOKIE_NAME, token, httponly=True, samesite="lax", secure=False, max_age=60 * 60 * 24 * 7)
- return {"token": token, "access_token": token, "token_type": "bearer"}
-
-
- @app.post("/admin/api/logout")
- async def admin_logout(response: Response, _: bool = Depends(require_admin)):
- response.delete_cookie(COOKIE_NAME)
- return {"message": "已退出登录。"}
-
-
- @app.get("/admin/api/session")
- async def admin_session(_: bool = Depends(require_admin)):
- return {"ok": True}
-
-
- @app.get("/admin/api/overview")
- async def admin_overview(_: bool = Depends(require_admin)):
- conn = get_db_connection()
- try:
- total_models = conn.execute("SELECT COUNT(*) AS count FROM proxy_models").fetchone()["count"]
- enabled_models = conn.execute("SELECT COUNT(*) AS count FROM proxy_models WHERE enabled = 1").fetchone()["count"]
- total_keys = conn.execute("SELECT COUNT(*) AS count FROM api_keys").fetchone()["count"]
- enabled_keys = conn.execute("SELECT COUNT(*) AS count FROM api_keys WHERE enabled = 1").fetchone()["count"]
- usage = conn.execute("SELECT COALESCE(SUM(request_count), 0) AS total_requests, COALESCE(SUM(success_count), 0) AS total_success, COALESCE(SUM(failure_count), 0) AS total_failures FROM proxy_models").fetchone()
- recent_rows = conn.execute("SELECT h.checked_at, h.ok, h.latency_ms, m.model_id FROM health_check_records h JOIN proxy_models m ON m.id = h.model_id ORDER BY h.checked_at DESC LIMIT 8").fetchall()
- return {
- "metrics": [
- {"label": "Enabled Models", "value": enabled_models},
- {"label": "Enabled Keys", "value": enabled_keys},
- {"label": "Proxy Requests", "value": usage["total_requests"]},
- {"label": "Failures", "value": usage["total_failures"]},
- ],
- "recent_checks": [{"time": row["checked_at"], "model": row["model_id"], "status": "healthy" if row["ok"] else "down", "latency": row["latency_ms"]} for row in recent_rows],
- "totals": {
- "total_models": total_models,
- "enabled_models": enabled_models,
- "total_keys": total_keys,
- "enabled_keys": enabled_keys,
- "total_requests": usage["total_requests"],
- "total_success": usage["total_success"],
- "total_failures": usage["total_failures"],
- },
- }
- finally:
- conn.close()
-
-
- @app.get("/admin/api/models")
- async def admin_models(_: bool = Depends(require_admin)):
- conn = get_db_connection()
- try:
- rows = conn.execute("SELECT * FROM proxy_models ORDER BY sort_order ASC, model_id ASC").fetchall()
- return {"items": [row_to_model_item(row) for row in rows]}
- finally:
- conn.close()
-
-
- @app.get("/admin/api/models/usage")
- async def admin_models_usage(_: bool = Depends(require_admin)):
- conn = get_db_connection()
- try:
- rows = conn.execute("SELECT * FROM proxy_models ORDER BY request_count DESC, model_id ASC").fetchall()
- return {"items": [row_to_model_item(row) for row in rows]}
  finally:
  conn.close()


- @app.post("/admin/api/models")
- async def admin_add_model(request: Request, _: bool = Depends(require_admin)):
- body = await request.json()
- model_id = (body.get("model_id") or body.get("name") or "").strip()
- display_name = (body.get("display_name") or model_id).strip()
- if not model_id:
- raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="model_id is required.")
  conn = get_db_connection()
  try:
  now = utcnow_iso()
  conn.execute(
  """
- INSERT INTO proxy_models (model_id, display_name, provider, description, enabled, featured, sort_order, created_at, updated_at)
- VALUES (?, ?, 'nvidia-nim', ?, ?, ?, ?, ?, ?)
- ON CONFLICT(model_id) DO UPDATE SET
- display_name = excluded.display_name,
- description = excluded.description,
- enabled = excluded.enabled,
- featured = excluded.featured,
- sort_order = excluded.sort_order,
- updated_at = excluded.updated_at
  """,
- (model_id, display_name, body.get("description"), 1 if body.get("enabled", True) else 0, 1 if body.get("featured", False) else 0, int(body.get("sort_order", 0)), now, now),
  )
  conn.commit()
- row = fetch_model_by_identifier(conn, model_id)
- return {"item": row_to_model_item(row)}
  finally:
  conn.close()


- def delete_model_internal(model_identifier: str) -> dict[str, Any]:
  conn = get_db_connection()
  try:
- row = fetch_model_by_identifier(conn, model_identifier)
- if not row:
- raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="未找到模型。")
- conn.execute("DELETE FROM proxy_models WHERE id = ?", (row["id"],))
- conn.commit()
- return {"message": "模型已删除。"}
  finally:
  conn.close()


- @app.delete("/admin/api/models/{model_identifier}")
- async def admin_delete_model(model_identifier: str, _: bool = Depends(require_admin)):
- return delete_model_internal(model_identifier)
-
-
- @app.post("/admin/api/models/remove")
- async def admin_remove_model_alias(request: Request, _: bool = Depends(require_admin)):
- body = await request.json()
- value = body.get("value") if isinstance(body, dict) else None
- if not value:
- raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="value is required.")
- return delete_model_internal(str(value))
-
-
- async def test_model_internal(model_identifier: str, payload: dict[str, Any] | None = None) -> dict[str, Any]:
  conn = get_db_connection()
  try:
- row = fetch_model_by_identifier(conn, model_identifier, enabled_only=True)
  if not row:
- raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="未找到模型。")
- api_key_row = await select_api_key(conn, payload.get("api_key_id") if payload else None)
- return await perform_healthcheck(conn, row, api_key_row, (payload or {}).get("prompt") or DEFAULT_HEALTH_PROMPT)
- finally:
- conn.close()
-
-
- @app.post("/admin/api/models/test")
- async def admin_test_model_alias(request: Request, _: bool = Depends(require_admin)):
- body = await request.json()
- identifier = body.get("value") or body.get("model_id")
- if not identifier:
- raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="value is required.")
- return await test_model_internal(str(identifier), body)
-
-
- @app.post("/admin/api/models/{model_identifier}/test")
- async def admin_test_model(model_identifier: str, request: Request, _: bool = Depends(require_admin)):
- body = await request.json() if request.method == "POST" else {}
- return await test_model_internal(model_identifier, body)
-
- @app.get("/admin/api/keys")
- async def admin_keys(_: bool = Depends(require_admin)):
- conn = get_db_connection()
- try:
- rows = conn.execute("SELECT * FROM api_keys ORDER BY id ASC").fetchall()
- return {"items": [row_to_key_item(row) for row in rows]}
- finally:
- conn.close()
-
-
- @app.get("/admin/api/keys/usage")
- async def admin_keys_usage(_: bool = Depends(require_admin)):
- conn = get_db_connection()
- try:
- rows = conn.execute("SELECT * FROM api_keys ORDER BY request_count DESC, id ASC").fetchall()
- return {"items": [row_to_key_item(row) for row in rows]}
  finally:
  conn.close()


- @app.post("/admin/api/keys")
- async def admin_add_key(request: Request, _: bool = Depends(require_admin)):
- body = await request.json()
- name = (body.get("name") or body.get("label") or "").strip()
- api_key = (body.get("api_key") or body.get("key") or "").strip()
- if not name or not api_key:
- raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="Both name and api_key are required.")
  conn = get_db_connection()
  try:
- now = utcnow_iso()
- conn.execute(
- """
- INSERT INTO api_keys (name, api_key, enabled, created_at, updated_at)
- VALUES (?, ?, ?, ?, ?)
- ON CONFLICT(name) DO UPDATE SET api_key = excluded.api_key, enabled = excluded.enabled, updated_at = excluded.updated_at
- """,
- (name, api_key, 1 if body.get("enabled", True) else 0, now, now),
- )
- conn.commit()
- row = fetch_key_by_identifier(conn, name)
- return {"item": row_to_key_item(row)}
  finally:
  conn.close()


- def delete_key_internal(key_identifier: str) -> dict[str, Any]:
- conn = get_db_connection()
- try:
- row = fetch_key_by_identifier(conn, key_identifier)
- if not row:
- raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="未找到 API Key。")
- conn.execute("DELETE FROM api_keys WHERE id = ?", (row["id"],))
- conn.commit()
- return {"message": "API Key 已删除。"}
- finally:
- conn.close()


- @app.delete("/admin/api/keys/{key_identifier}")
- async def admin_delete_key(key_identifier: str, _: bool = Depends(require_admin)):
- return delete_key_internal(key_identifier)


- @app.post("/admin/api/keys/remove")
- async def admin_remove_key_alias(request: Request, _: bool = Depends(require_admin)):
- body = await request.json()
- value = body.get("value") if isinstance(body, dict) else None
- if not value:
- raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="value is required.")
- return delete_key_internal(str(value))


- async def test_key_internal(key_identifier: str, payload: dict[str, Any] | None = None) -> dict[str, Any]:
- conn = get_db_connection()
  try:
- key_row = fetch_key_by_identifier(conn, key_identifier, enabled_only=True)
- if not key_row:
- raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="未找到 API Key。")
- model_identifier = (payload or {}).get("model_id") or DEFAULT_MODELS[0][0]
- model_row = fetch_model_by_identifier(conn, model_identifier, enabled_only=True)
- if not model_row:
- raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="未找到模型。")
- return await perform_healthcheck(conn, model_row, key_row, (payload or {}).get("prompt") or DEFAULT_HEALTH_PROMPT)
- finally:
- conn.close()
-
-
- async def test_all_keys_internal(payload: dict[str, Any] | None = None) -> list[dict[str, Any]]:
- conn = get_db_connection()
  try:
- key_rows = conn.execute("SELECT * FROM api_keys WHERE enabled = 1 ORDER BY id ASC").fetchall()
- if not key_rows:
- raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="尚未配置可用的 API Key。")
- model_identifier = (payload or {}).get("model_id") or DEFAULT_MODELS[0][0]
- model_row = fetch_model_by_identifier(conn, model_identifier, enabled_only=True)
- if not model_row:
- raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="未找到模型。")
- prompt = (payload or {}).get("prompt") or DEFAULT_HEALTH_PROMPT
- results: list[dict[str, Any]] = []
- for key_row in key_rows:
- results.append(await perform_healthcheck(conn, model_row, key_row, prompt))
- return results
  finally:
- conn.close()


- @app.post("/admin/api/keys/test")
- async def admin_test_key_alias(request: Request, _: bool = Depends(require_admin)):
- body = await request.json()
- identifier = body.get("value") or body.get("name") or body.get("label")
- if not identifier:
- raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="value is required.")
- return await test_key_internal(str(identifier), body)


- @app.post("/admin/api/keys/test-all")
- async def admin_test_all_keys(request: Request, _: bool = Depends(require_admin)):
- body = await request.json() if request.method == "POST" else {}
- results = await test_all_keys_internal(body)
- return {"items": results, "results": results}


- @app.post("/admin/api/keys/{key_identifier}/test")
- async def admin_test_key(key_identifier: str, request: Request, _: bool = Depends(require_admin)):
- body = await request.json() if request.method == "POST" else {}
- return await test_key_internal(key_identifier, body)


- @app.get("/admin/api/healthchecks")
- async def admin_healthchecks(hours: int = 48, _: bool = Depends(require_admin)):
- conn = get_db_connection()
- try:
- since = utcnow() - timedelta(hours=hours)
- rows = conn.execute(
- """
- SELECT h.*, m.model_id, m.display_name, k.name AS key_name
- FROM health_check_records h
- JOIN proxy_models m ON m.id = h.model_id
- LEFT JOIN api_keys k ON k.id = h.api_key_id
- WHERE h.checked_at >= ?
- ORDER BY h.checked_at DESC
- LIMIT 200
- """,
- (since.isoformat(),),
- ).fetchall()
- items = [{"id": row["id"], "model": row["display_name"], "model_id": row["model_id"], "api_key": row["key_name"], "status": "healthy" if row["ok"] else "down", "detail": row["response_excerpt"] or row["error_message"] or "暂无详情。", "latency": row["latency_ms"], "status_code": row["status_code"], "checked_at": row["checked_at"]} for row in rows]
- return {"items": items}
- finally:
- conn.close()


- @app.post("/admin/api/healthchecks/run")
- async def admin_run_healthchecks(request: Request, _: bool = Depends(require_admin)):
- body = await request.json() if request.method == "POST" else {}
- results = await run_healthchecks(model_identifier=body.get("model_id") or body.get("model"), api_key_identifier=body.get("api_key_id") or body.get("key_id"), prompt=body.get("prompt"))
- return {"items": results, "results": results}


- @app.get("/admin/api/settings")
- async def admin_settings(_: bool = Depends(require_admin)):
- conn = get_db_connection()
- try:
- return get_settings_payload(conn)
- finally:
- conn.close()


- @app.put("/admin/api/settings")
- async def admin_update_settings(request: Request, _: bool = Depends(require_admin)):
  body = await request.json()
- conn = get_db_connection()
- try:
- set_setting(conn, "healthcheck_enabled", "true" if body.get("healthcheck_enabled", True) else "false")
- set_setting(conn, "healthcheck_interval_minutes", str(max(5, int(body.get("healthcheck_interval_minutes", DEFAULT_HEALTH_INTERVAL_MINUTES)))))
- set_setting(conn, "healthcheck_prompt", body.get("healthcheck_prompt") or DEFAULT_HEALTH_PROMPT)
- if body.get("public_history_hours"):
- set_setting(conn, "public_history_hours", str(max(1, int(body.get("public_history_hours")))))
- conn.commit()
- finally:
- conn.close()
- schedule_healthchecks()
- conn = get_db_connection()
  try:
- return get_settings_payload(conn)
- finally:
- conn.close()

+ from __future__ import annotations

  import asyncio
+ import hashlib
  import json
  import os
  import sqlite3

  from typing import Any

  import httpx
+ from fastapi import Depends, FastAPI, Header, HTTPException, Request, status
+ from fastapi.middleware.gzip import GZipMiddleware
+ from fastapi.responses import HTMLResponse, StreamingResponse
  from fastapi.staticfiles import StaticFiles


  BASE_DIR = Path(__file__).resolve().parent.parent

  NVIDIA_API_BASE = RAW_NVIDIA_API_BASE if RAW_NVIDIA_API_BASE.endswith("/v1") else f"{RAW_NVIDIA_API_BASE}/v1"
  CHAT_COMPLETIONS_URL = f"{NVIDIA_API_BASE}/chat/completions"
  MODELS_URL = f"{NVIDIA_API_BASE}/models"
  REQUEST_TIMEOUT_SECONDS = float(os.getenv("REQUEST_TIMEOUT_SECONDS", "90"))
+ MAX_UPSTREAM_CONNECTIONS = int(os.getenv("MAX_UPSTREAM_CONNECTIONS", "512"))
+ MAX_KEEPALIVE_CONNECTIONS = int(os.getenv("MAX_KEEPALIVE_CONNECTIONS", "128"))
+ MODEL_SYNC_INTERVAL_MINUTES = int(os.getenv("MODEL_SYNC_INTERVAL_MINUTES", "30"))
+ PUBLIC_HISTORY_BUCKETS = int(os.getenv("PUBLIC_HISTORY_BUCKETS", "6"))
+ BUCKET_MINUTES = 10
+ DEFAULT_MONITORED_MODELS = "z-ai/glm5,z-ai/glm4.7,minimaxai/minimax-m2.5,minimaxai/minimax-m2.7,moonshotai/kimi-k2.5,deepseek-ai/deepseek-v3.2,google/gemma-4-31b-it,qwen/qwen3.5-397b-a17b"
+ MODEL_LIST = [item.strip() for item in os.getenv("MODEL_LIST", DEFAULT_MONITORED_MODELS).split(",") if item.strip()]
+
  http_client: httpx.AsyncClient | None = None
+ model_cache: list[dict[str, Any]] = []
+ model_cache_synced_at: str | None = None
+ model_cache_lock: asyncio.Lock | None = None
+ model_sync_task: asyncio.Task[None] | None = None


  def utcnow() -> datetime:

  return utcnow().isoformat()


+ def json_dumps(value: Any) -> str:
+ return json.dumps(value, ensure_ascii=False)


+ def hash_api_key(api_key: str) -> str:
+ return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


+ def normalize_provider(model_id: str, owned_by: str | None = None) -> str:
+ if owned_by:
+ return owned_by
+ if "/" in model_id:
+ return model_id.split("/", 1)[0]
+ return "unknown"
 

+ def bucket_start(dt: datetime | None = None) -> datetime:
+ dt = dt or utcnow()
+ minute = dt.minute - (dt.minute % BUCKET_MINUTES)
+ return dt.replace(minute=minute, second=0, microsecond=0)


+ def bucket_label(value: str) -> str:
+ try:
+ dt = datetime.fromisoformat(value)
+ except ValueError:
+ return value
+ return dt.strftime("%H:%M")
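The two helpers above snap timestamps onto 10-minute buckets, which become the primary-key component of the new `metric_buckets` table. A self-contained sketch of the rounding, with `BUCKET_MINUTES` fixed at 10 as in the hunk:

```python
from datetime import datetime, timezone

BUCKET_MINUTES = 10

# Same rounding as bucket_start above: drop the minute remainder plus
# seconds and microseconds, so every request in a 10-minute window
# shares one bucket key.
def bucket_start(dt: datetime) -> datetime:
    minute = dt.minute - (dt.minute % BUCKET_MINUTES)
    return dt.replace(minute=minute, second=0, microsecond=0)

stamp = datetime(2024, 5, 1, 12, 37, 42, tzinfo=timezone.utc)
key = bucket_start(stamp).isoformat()  # "2024-05-01T12:30:00+00:00"
```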
 
82
 
83
  def get_db_connection() -> sqlite3.Connection:
84
+ DB_PATH.parent.mkdir(parents=True, exist_ok=True)
85
  conn = sqlite3.connect(DB_PATH, check_same_thread=False, timeout=30.0)
86
  conn.row_factory = sqlite3.Row
87
  conn.execute("PRAGMA journal_mode=WAL")
 
92
 
93
 
94
  def init_db() -> None:
      conn = get_db_connection()
      try:
          conn.executescript(
              """
              CREATE TABLE IF NOT EXISTS response_records (
+                 response_id TEXT PRIMARY KEY,
+                 api_key_hash TEXT NOT NULL,
                  parent_response_id TEXT,
+                 model_id TEXT NOT NULL,
                  request_json TEXT NOT NULL,
                  input_items_json TEXT NOT NULL,
                  output_json TEXT NOT NULL,
                  output_items_json TEXT NOT NULL,
                  status TEXT NOT NULL,
+                 success INTEGER NOT NULL,
+                 latency_ms REAL,
+                 error_message TEXT,
                  created_at TEXT NOT NULL
              );

+             CREATE INDEX IF NOT EXISTS idx_response_api_hash ON response_records(api_key_hash);
+             CREATE INDEX IF NOT EXISTS idx_response_parent ON response_records(parent_response_id);
+             CREATE INDEX IF NOT EXISTS idx_response_model_created ON response_records(model_id, created_at);
+
+             CREATE TABLE IF NOT EXISTS metric_buckets (
+                 bucket_start TEXT NOT NULL,
+                 model_id TEXT NOT NULL,
+                 total_count INTEGER NOT NULL DEFAULT 0,
+                 success_count INTEGER NOT NULL DEFAULT 0,
+                 total_latency_ms REAL NOT NULL DEFAULT 0,
+                 PRIMARY KEY(bucket_start, model_id)
              );

+             CREATE TABLE IF NOT EXISTS gateway_totals (
+                 id INTEGER PRIMARY KEY CHECK(id = 1),
+                 total_requests INTEGER NOT NULL DEFAULT 0,
+                 total_success INTEGER NOT NULL DEFAULT 0,
+                 total_latency_ms REAL NOT NULL DEFAULT 0,
+                 updated_at TEXT NOT NULL
+             );
+
+             CREATE TABLE IF NOT EXISTS official_models_cache (
+                 id TEXT PRIMARY KEY,
+                 object TEXT NOT NULL,
+                 created INTEGER,
+                 owned_by TEXT,
+                 synced_at TEXT NOT NULL
              );
              """
          )
+         conn.execute(
+             """
+             INSERT OR IGNORE INTO gateway_totals (id, total_requests, total_success, total_latency_ms, updated_at)
+             VALUES (1, 0, 0, 0, ?)
+             """,
+             (utcnow_iso(),),
+         )
          conn.commit()
      finally:
          conn.close()


+ async def run_db(fn, *args, **kwargs):
+     return await asyncio.to_thread(fn, *args, **kwargs)

+ async def get_http_client() -> httpx.AsyncClient:
+     global http_client
+     if http_client is None or http_client.is_closed:
+         limits = httpx.Limits(
+             max_connections=MAX_UPSTREAM_CONNECTIONS,
+             max_keepalive_connections=MAX_KEEPALIVE_CONNECTIONS,
+         )
+         http_client = httpx.AsyncClient(timeout=REQUEST_TIMEOUT_SECONDS, limits=limits)
+     return http_client


+ async def get_model_cache_lock() -> asyncio.Lock:
+     global model_cache_lock
+     if model_cache_lock is None:
+         model_cache_lock = asyncio.Lock()
+     return model_cache_lock

+ def load_cached_models_from_db() -> tuple[list[dict[str, Any]], str | None]:
+     conn = get_db_connection()
+     try:
+         rows = conn.execute(
+             "SELECT id, object, created, owned_by, synced_at FROM official_models_cache ORDER BY id ASC"
+         ).fetchall()
+         if not rows:
+             return [], None
+         synced_at = rows[0]["synced_at"]
+         models = [
+             {
+                 "id": row["id"],
+                 "object": row["object"],
+                 "created": row["created"],
+                 "owned_by": row["owned_by"],
+             }
+             for row in rows
+         ]
+         return models, synced_at
+     finally:
+         conn.close()


+ def save_models_to_db(models: list[dict[str, Any]], synced_at: str) -> None:
+     unique_models: dict[str, dict[str, Any]] = {}
+     for model in models:
+         model_id = model.get("id")
+         if model_id:
+             unique_models[model_id] = model
+
+     conn = get_db_connection()
      try:
+         conn.execute("DELETE FROM official_models_cache")
+         conn.executemany(
+             """
+             INSERT INTO official_models_cache (id, object, created, owned_by, synced_at)
+             VALUES (?, ?, ?, ?, ?)
+             """,
+             [
+                 (
+                     model_id,
+                     model.get("object", "model"),
+                     model.get("created"),
+                     model.get("owned_by") or normalize_provider(model_id),
+                     synced_at,
+                 )
+                 for model_id, model in sorted(unique_models.items(), key=lambda item: item[0])
+             ],
+         )
+         conn.commit()
+     finally:
+         conn.close()

+ async def refresh_official_models(force: bool = False) -> list[dict[str, Any]]:
+     global model_cache, model_cache_synced_at
+     if model_cache and not force:
+         return model_cache
+     lock = await get_model_cache_lock()
+     async with lock:
+         if model_cache and not force:
+             return model_cache
+         client = await get_http_client()
+         response = await client.get(MODELS_URL, headers={"Accept": "application/json"})
+         response.raise_for_status()
+         payload = response.json()
+         models = payload.get("data") or payload.get("models") or []
+         normalized = [
+             {
+                 "id": item.get("id"),
+                 "object": item.get("object", "model"),
+                 "created": item.get("created"),
+                 "owned_by": item.get("owned_by") or normalize_provider(item.get("id", "")),
+             }
+             for item in models
+             if isinstance(item, dict) and item.get("id")
+         ]
+         synced_at = utcnow_iso()
+         await run_db(save_models_to_db, normalized, synced_at)
+         model_cache = normalized
+         model_cache_synced_at = synced_at
+         return normalized
+
+
+ async def model_sync_loop() -> None:
+     while True:
+         try:
+             await refresh_official_models(force=True)
+         except Exception:
+             pass
+         await asyncio.sleep(max(300, MODEL_SYNC_INTERVAL_MINUTES * 60))


+ def extract_user_api_key(
+     authorization: str | None = Header(default=None),
+     x_api_key: str | None = Header(default=None),
+     x_nvidia_api_key: str | None = Header(default=None),
+ ) -> str:
      token: str | None = None
      if authorization and authorization.startswith("Bearer "):
          token = authorization.removeprefix("Bearer ").strip()
      elif x_api_key:
          token = x_api_key.strip()
+     elif x_nvidia_api_key:
+         token = x_nvidia_api_key.strip()
      if not token:
+         raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED, detail="请通过 Authorization Bearer 或 X-API-Key 提供你的 NIM Key。")
+     return token

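The header precedence in `extract_user_api_key` above (Authorization Bearer first, then `X-API-Key`, then the new `X-NVIDIA-API-Key`) can be sketched as a plain function, outside FastAPI's dependency machinery:

```python
def pick_token(authorization=None, x_api_key=None, x_nvidia_api_key=None):
    # Mirrors the fallback order of the extract_user_api_key dependency:
    # a Bearer token wins, then X-API-Key, then X-NVIDIA-API-Key.
    if authorization and authorization.startswith("Bearer "):
        return authorization.removeprefix("Bearer ").strip()
    if x_api_key:
        return x_api_key.strip()
    if x_nvidia_api_key:
        return x_nvidia_api_key.strip()
    return None  # the real dependency raises a 401 here

print(pick_token(authorization="Bearer nvapi-abc"))  # nvapi-abc
print(pick_token(x_api_key=" nvapi-def "))           # nvapi-def
```

The gateway never persists this token; it is hashed (`hash_api_key`) for record ownership and forwarded upstream for the actual call.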
  def normalize_content(content: Any, role: str) -> list[dict[str, Any]]:
      if content is None:

          if not isinstance(item, dict):
              items.append({"type": "message", "role": "user", "content": [{"type": "input_text", "text": str(item)}]})
              continue
          item_type = item.get("type")
          if item_type == "message" or item.get("role"):
              role = item.get("role", "user")

              arguments = item.get("arguments", "{}")
              if not isinstance(arguments, str):
                  arguments = json_dumps(arguments)
+             items.append({
+                 "type": "function_call",
+                 "call_id": item.get("call_id") or f"call_{uuid.uuid4().hex[:12]}",
+                 "name": item.get("name"),
+                 "arguments": arguments,
+             })
              continue
          if item_type in {"input_text", "output_text", "text"}:
              items.append({"type": "message", "role": "user", "content": [{"type": "input_text", "text": item.get("text", "")}]})

      return str(content)


  def items_to_chat_messages(items: list[dict[str, Any]]) -> list[dict[str, Any]]:
      messages: list[dict[str, Any]] = []
      pending_tool_calls: list[dict[str, Any]] = []

              arguments = part.get("arguments") or "{}"
              if not isinstance(arguments, str):
                  arguments = json_dumps(arguments)
+             tool_calls.append({
+                 "id": part.get("id") or part.get("call_id") or f"call_{uuid.uuid4().hex[:12]}",
+                 "name": part.get("name"),
+                 "arguments": arguments,
+             })

      for tool_call in message.get("tool_calls") or []:
          if not isinstance(tool_call, dict):

          arguments = function_data.get("arguments") or tool_call.get("arguments") or "{}"
          if not isinstance(arguments, str):
              arguments = json_dumps(arguments)
+         tool_calls.append(
+             {
+                 "id": tool_call.get("id") or f"call_{uuid.uuid4().hex[:12]}",
+                 "name": function_data.get("name") or tool_call.get("name"),
+                 "arguments": arguments,
+             }
+         )

      deduped: list[dict[str, Any]] = []
      seen_ids: set[str] = set()

              deduped.append(tool_call)
      return "\n".join(filter(None, text_chunks)).strip(), deduped


  def build_choice_alias(output_items: list[dict[str, Any]], finish_reason: str | None) -> list[dict[str, Any]]:
      content_parts: list[dict[str, Any]] = []
      for item in output_items:

      response_id = upstream_json.get("id") or f"resp_{uuid.uuid4().hex}"
      output_items: list[dict[str, Any]] = []
      if assistant_text:
+         output_items.append({
+             "id": f"msg_{uuid.uuid4().hex[:24]}",
+             "type": "message",
+             "status": "completed",
+             "role": "assistant",
+             "content": [{"type": "output_text", "text": assistant_text, "annotations": []}],
+         })
      for tool_call in tool_calls:
+         output_items.append({
+             "id": f"fc_{uuid.uuid4().hex[:24]}",
+             "type": "function_call",
+             "status": "completed",
+             "call_id": tool_call["id"],
+             "name": tool_call.get("name"),
+             "arguments": tool_call.get("arguments", "{}"),
+         })
      usage = upstream_json.get("usage") or {}
      return {
          "id": response_id,

          "previous_response_id": previous_response_id,
          "store": True,
          "text": body.get("text") or {"format": {"type": "text"}},
+         "usage": {
+             "input_tokens": usage.get("prompt_tokens"),
+             "output_tokens": usage.get("completion_tokens"),
+             "total_tokens": usage.get("total_tokens"),
+         },
          "choices": build_choice_alias(output_items, finish_reason),
+         "upstream": {
+             "id": upstream_json.get("id"),
+             "object": upstream_json.get("object", "chat.completion"),
+             "finish_reason": finish_reason or "stop",
+         },
      }

+ def store_success_record(api_key_hash: str, model_id: str, request_body: dict[str, Any], input_items: list[dict[str, Any]], response_payload: dict[str, Any], latency_ms: float) -> None:
+     conn = get_db_connection()
+     try:
+         now = utcnow_iso()
+         bucket = bucket_start().isoformat()
          conn.execute(
              """
+             INSERT OR REPLACE INTO response_records (
+                 response_id, api_key_hash, parent_response_id, model_id, request_json,
+                 input_items_json, output_json, output_items_json, status, success,
+                 latency_ms, error_message, created_at
+             ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
              """,
+             (
+                 response_payload["id"],
+                 api_key_hash,
+                 request_body.get("previous_response_id"),
+                 model_id,
+                 json_dumps(request_body),
+                 json_dumps(input_items),
+                 json_dumps(response_payload),
+                 json_dumps(response_payload.get("output") or []),
+                 response_payload.get("status", "completed"),
+                 1,
+                 latency_ms,
+                 None,
+                 now,
+             ),
          )
          conn.execute(
              """
+             INSERT INTO metric_buckets (bucket_start, model_id, total_count, success_count, total_latency_ms)
+             VALUES (?, ?, 1, 1, ?)
+             ON CONFLICT(bucket_start, model_id) DO UPDATE SET
+                 total_count = total_count + 1,
+                 success_count = success_count + 1,
+                 total_latency_ms = total_latency_ms + excluded.total_latency_ms
+             """,
+             (bucket, model_id, latency_ms),
+         )
+         conn.execute(
+             """
+             UPDATE gateway_totals
+             SET total_requests = total_requests + 1,
+                 total_success = total_success + 1,
+                 total_latency_ms = total_latency_ms + ?,
                  updated_at = ?
+             WHERE id = 1
              """,
+             (latency_ms, now),
          )
          conn.commit()
      finally:
          conn.close()

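The `ON CONFLICT ... DO UPDATE` statement used for `metric_buckets` above turns the insert into an atomic upsert: the first call in a bucket creates the row, later calls increment the counters. A minimal in-memory demonstration of the same statement (table and values taken from the diff; requires SQLite 3.24+ for upsert support):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE metric_buckets (
        bucket_start TEXT NOT NULL,
        model_id TEXT NOT NULL,
        total_count INTEGER NOT NULL DEFAULT 0,
        success_count INTEGER NOT NULL DEFAULT 0,
        total_latency_ms REAL NOT NULL DEFAULT 0,
        PRIMARY KEY(bucket_start, model_id)
    )
    """
)
upsert = """
    INSERT INTO metric_buckets (bucket_start, model_id, total_count, success_count, total_latency_ms)
    VALUES (?, ?, 1, 1, ?)
    ON CONFLICT(bucket_start, model_id) DO UPDATE SET
        total_count = total_count + 1,
        success_count = success_count + 1,
        total_latency_ms = total_latency_ms + excluded.total_latency_ms
"""
# Two successful calls land in the same 10-minute bucket for the same model.
conn.execute(upsert, ("2024-05-01T12:30:00", "z-ai/glm5", 120.0))
conn.execute(upsert, ("2024-05-01T12:30:00", "z-ai/glm5", 80.0))
row = conn.execute(
    "SELECT total_count, success_count, total_latency_ms FROM metric_buckets"
).fetchone()
print(row)  # (2, 2, 200.0)
```

`excluded.total_latency_ms` refers to the value the rejected INSERT would have written, which is how the running latency sum accumulates without a separate SELECT.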
+ def store_failure_metric(model_id: str, error_message: str) -> None:
      conn = get_db_connection()
      try:
          now = utcnow_iso()
+         bucket = bucket_start().isoformat()
+         conn.execute(
+             """
+             INSERT INTO metric_buckets (bucket_start, model_id, total_count, success_count, total_latency_ms)
+             VALUES (?, ?, 1, 0, 0)
+             ON CONFLICT(bucket_start, model_id) DO UPDATE SET
+                 total_count = total_count + 1
+             """,
+             (bucket, model_id),
+         )
          conn.execute(
              """
+             UPDATE gateway_totals
+             SET total_requests = total_requests + 1,
+                 updated_at = ?
+             WHERE id = 1
              """,
+             (now,),
          )
          conn.commit()
      finally:
          conn.close()

+ def load_previous_conversation_items(api_key_hash: str, previous_response_id: str | None) -> list[dict[str, Any]]:
+     if not previous_response_id:
+         return []
      conn = get_db_connection()
      try:
+         items: list[dict[str, Any]] = []
+         current = previous_response_id
+         chain: list[sqlite3.Row] = []
+         while current:
+             row = conn.execute(
+                 "SELECT * FROM response_records WHERE response_id = ? AND api_key_hash = ?",
+                 (current, api_key_hash),
+             ).fetchone()
+             if not row:
+                 raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail=f"previous_response_id '{current}' 不存在,或不属于当前 Key。")
+             chain.append(row)
+             current = row["parent_response_id"]
+         for row in reversed(chain):
+             items.extend(json.loads(row["input_items_json"]))
+             items.extend(json.loads(row["output_items_json"]))
+         return items
      finally:
          conn.close()

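`load_previous_conversation_items` above walks the `parent_response_id` chain newest-to-oldest, then replays it oldest-to-first so the upstream model sees the conversation in order. The same walk over toy in-memory records (hypothetical `resp_1`/`resp_2` IDs, strings standing in for the JSON item lists):

```python
# Toy records keyed by response_id; each stores its parent plus the stored
# input/output items, mirroring the response_records rows walked above.
records = {
    "resp_1": {"parent": None, "input": ["q1"], "output": ["a1"]},
    "resp_2": {"parent": "resp_1", "input": ["q2"], "output": ["a2"]},
}

def load_chain(previous_response_id):
    chain = []
    current = previous_response_id
    while current:                      # follow parents back to the root
        row = records[current]
        chain.append(row)
        current = row["parent"]
    items = []
    for row in reversed(chain):         # replay oldest turn first
        items.extend(row["input"])
        items.extend(row["output"])
    return items

print(load_chain("resp_2"))  # ['q1', 'a1', 'q2', 'a2']
```

Scoping the lookup by `api_key_hash` in the real query is what prevents one caller from continuing another caller's conversation.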
+ def load_response_record(api_key_hash: str, response_id: str) -> dict[str, Any]:
      conn = get_db_connection()
      try:
+         row = conn.execute(
+             "SELECT output_json FROM response_records WHERE response_id = ? AND api_key_hash = ?",
+             (response_id, api_key_hash),
+         ).fetchone()
          if not row:
+             raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="未找到对应响应,或当前 Key 无权访问。")
+         return json.loads(row["output_json"])
      finally:
          conn.close()

 
 
 
 
 
 
744
  conn = get_db_connection()
745
  try:
746
+ totals_row = conn.execute("SELECT * FROM gateway_totals WHERE id = 1").fetchone()
747
+ total_requests = totals_row["total_requests"] if totals_row else 0
748
+ now_bucket = bucket_start()
749
+ bucket_points = [(now_bucket - timedelta(minutes=BUCKET_MINUTES * offset)).isoformat() for offset in reversed(range(PUBLIC_HISTORY_BUCKETS))]
750
+ placeholders = ",".join("?" for _ in MODEL_LIST) if MODEL_LIST else "''"
751
+ totals_by_model = {
752
+ row["model_id"]: row["total_count"]
753
+ for row in conn.execute(
754
+ f"SELECT model_id, COALESCE(SUM(total_count), 0) AS total_count FROM metric_buckets WHERE model_id IN ({placeholders}) GROUP BY model_id",
755
+ MODEL_LIST,
756
+ ).fetchall()
757
+ } if MODEL_LIST else {}
758
+ since = bucket_points[0] if bucket_points else utcnow_iso()
759
+ recent_rows = conn.execute(
760
+ f"SELECT bucket_start, model_id, total_count, success_count FROM metric_buckets WHERE model_id IN ({placeholders}) AND bucket_start >= ? ORDER BY bucket_start ASC",
761
+ [*MODEL_LIST, since],
762
+ ).fetchall() if MODEL_LIST else []
763
+ row_map: dict[str, dict[str, sqlite3.Row]] = {}
764
+ for row in recent_rows:
765
+ row_map.setdefault(row["model_id"], {})[row["bucket_start"]] = row
766
+ models: list[dict[str, Any]] = []
767
+ latest_rates: list[float] = []
768
+ for model_id in MODEL_LIST:
769
+ points: list[dict[str, Any]] = []
770
+ latest_rate: float | None = None
771
+ for bucket_value in bucket_points:
772
+ row = row_map.get(model_id, {}).get(bucket_value)
773
+ total_count = row["total_count"] if row else 0
774
+ success_count = row["success_count"] if row else 0
775
+ success_rate = round((success_count / total_count) * 100, 1) if total_count else None
776
+ points.append(
777
+ {
778
+ "bucket_start": bucket_value,
779
+ "label": bucket_label(bucket_value),
780
+ "total_count": total_count,
781
+ "success_count": success_count,
782
+ "success_rate": success_rate,
783
+ }
784
+ )
785
+ if total_count:
786
+ latest_rate = success_rate
787
+ if latest_rate is not None:
788
+ latest_rates.append(latest_rate)
789
+ average_rate = None
790
+ non_empty = [point["success_rate"] for point in points if point["success_rate"] is not None]
791
+ if non_empty:
792
+ average_rate = round(sum(non_empty) / len(non_empty), 1)
793
+ models.append(
794
+ {
795
+ "model_id": model_id,
796
+ "provider": normalize_provider(model_id),
797
+ "total_calls": totals_by_model.get(model_id, 0),
798
+ "latest_success_rate": latest_rate,
799
+ "average_success_rate": average_rate,
800
+ "points": points,
801
+ }
802
+ )
803
+ average_health = round(sum(latest_rates) / len(latest_rates), 1) if latest_rates else None
804
+ return {
805
+ "generated_at": utcnow_iso(),
806
+ "bucket_minutes": BUCKET_MINUTES,
807
+ "total_requests": total_requests,
808
+ "average_health": average_health,
809
+ "models": models,
810
+ }
811
  finally:
812
  conn.close()
813
 
814
 
815
+ def build_catalog_payload() -> dict[str, Any]:
+     grouped: dict[str, list[dict[str, Any]]] = {}
+     for model in sorted(model_cache, key=lambda item: item.get("id", "")):
+         provider = normalize_provider(model.get("id", ""), model.get("owned_by"))
+         grouped.setdefault(provider, []).append(model)
+     providers = [
+         {
+             "provider": provider,
+             "count": len(items),
+             "models": items,
+         }
+         for provider, items in sorted(grouped.items(), key=lambda entry: entry[0].lower())
+     ]
+     return {
+         "generated_at": utcnow_iso(),
+         "synced_at": model_cache_synced_at,
+         "total_models": len(model_cache),
+         "providers": providers,
+     }

+ async def post_nvidia_chat_completion(api_key: str, payload: dict[str, Any]) -> tuple[dict[str, Any], float]:
+     client = await get_http_client()
+     started = time.perf_counter()
+     response = await client.post(
+         CHAT_COMPLETIONS_URL,
+         headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json", "Accept": "application/json"},
+         json=payload,
+     )
+     latency_ms = round((time.perf_counter() - started) * 1000, 2)
+     if response.status_code >= 400:
+         try:
+             error_payload = response.json()
+             detail = error_payload.get("error", {}).get("message") or json_dumps(error_payload)
+         except Exception:
+             detail = response.text
+         raise HTTPException(status_code=response.status_code, detail=f"NVIDIA NIM 请求失败:{detail}")
+     return response.json(), latency_ms


+ def render_html(filename: str) -> HTMLResponse:
+     content = (STATIC_DIR / filename).read_text(encoding="utf-8")
+     return HTMLResponse(content=content, media_type="text/html; charset=utf-8")

+ @asynccontextmanager
+ async def lifespan(_app: FastAPI):
+     global model_cache, model_cache_synced_at, model_sync_task, http_client, model_cache_lock
+     init_db()
+     cached_models, cached_synced_at = await run_db(load_cached_models_from_db)
+     model_cache = cached_models
+     model_cache_synced_at = cached_synced_at
+     model_cache_lock = asyncio.Lock()
+     http_client = await get_http_client()
      try:
+         await refresh_official_models(force=not bool(model_cache))
+     except Exception:
+         pass
+     model_sync_task = asyncio.create_task(model_sync_loop())
      try:
+         yield
      finally:
+         if model_sync_task is not None:
+             model_sync_task.cancel()
+             with contextlib.suppress(asyncio.CancelledError):
+                 await model_sync_task
+         if http_client is not None and not http_client.is_closed:
+             await http_client.aclose()
+         http_client = None
+         model_sync_task = None
+         model_cache_lock = None

+ app = FastAPI(title="NIM Responses Gateway", lifespan=lifespan)
+ app.add_middleware(GZipMiddleware, minimum_size=1000)
+ app.mount("/static", StaticFiles(directory=str(STATIC_DIR)), name="static")


+ @app.get("/", response_class=HTMLResponse)
+ async def homepage() -> HTMLResponse:
+     return render_html("index.html")


+ @app.get("/api/dashboard")
+ async def dashboard_api() -> dict[str, Any]:
+     return await run_db(load_dashboard_data)


+ @app.get("/api/catalog")
+ async def catalog_api() -> dict[str, Any]:
+     if not model_cache:
+         try:
+             await refresh_official_models(force=True)
+         except Exception:
+             pass
+     return build_catalog_payload()


+ @app.get("/v1/models")
+ async def list_models() -> dict[str, Any]:
+     if not model_cache:
+         await refresh_official_models(force=True)
+     return {"object": "list", "data": model_cache}


+ @app.get("/v1/responses/{response_id}")
+ async def get_response(response_id: str, api_key: str = Depends(extract_user_api_key)) -> dict[str, Any]:
+     return await run_db(load_response_record, hash_api_key(api_key), response_id)


+ @app.post("/v1/responses")
+ async def create_response(request: Request, api_key: str = Depends(extract_user_api_key)):
      body = await request.json()
+     if not isinstance(body, dict):
+         raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="请求体必须是 JSON 对象。")
+     if not body.get("model"):
+         raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="缺少 model 字段。")
+     if body.get("input") is None:
+         raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="缺少 input 字段。")
+
+     api_key_hash = hash_api_key(api_key)
+     input_items = normalize_input_items(body.get("input"))
+     previous_items = await run_db(load_previous_conversation_items, api_key_hash, body.get("previous_response_id"))
+     merged_items = previous_items + input_items
+     chat_payload = build_chat_payload(body, merged_items)
+
      try:
+         upstream_json, latency_ms = await post_nvidia_chat_completion(api_key, chat_payload)
+     except HTTPException as exc:
+         await run_db(store_failure_metric, body.get("model"), exc.detail)
+         raise exc
+
+     response_payload = chat_completion_to_response(body, upstream_json, body.get("previous_response_id"))
+     await run_db(store_success_record, api_key_hash, body.get("model"), body, input_items, response_payload, latency_ms)
+
+     if body.get("stream"):
+         async def event_stream() -> Any:
+             yield f"event: response.created\ndata: {json_dumps({'type': 'response.created', 'response': {'id': response_payload['id'], 'model': response_payload['model'], 'status': 'in_progress'}})}\n\n"
+             for index, item in enumerate(response_payload.get("output") or []):
+                 yield f"event: response.output_item.added\ndata: {json_dumps({'type': 'response.output_item.added', 'output_index': index, 'item': item})}\n\n"
+                 if item.get("type") == "message":
+                     text_value = extract_text_from_content(item.get("content"))
+                     if text_value:
+                         yield f"event: response.output_text.delta\ndata: {json_dumps({'type': 'response.output_text.delta', 'output_index': index, 'delta': text_value})}\n\n"
+                         yield f"event: response.output_text.done\ndata: {json_dumps({'type': 'response.output_text.done', 'output_index': index, 'text': text_value})}\n\n"
+                 if item.get("type") == "function_call":
+                     yield f"event: response.function_call_arguments.done\ndata: {json_dumps({'type': 'response.function_call_arguments.done', 'output_index': index, 'arguments': item.get('arguments', '{}'), 'call_id': item.get('call_id')})}\n\n"
+                 yield f"event: response.output_item.done\ndata: {json_dumps({'type': 'response.output_item.done', 'output_index': index, 'item': item})}\n\n"
+             yield f"event: response.completed\ndata: {json_dumps({'type': 'response.completed', 'response': response_payload})}\n\n"
+         return StreamingResponse(event_stream(), media_type="text/event-stream")
+
+     return response_payload
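The streaming branch above wraps each event in the Server-Sent Events wire format: an `event:` line, a `data:` line with JSON, and a blank-line terminator. The framing can be isolated as a tiny helper (a sketch; the diff builds the frames inline with f-strings rather than through a function like this):

```python
import json

def sse_frame(event: str, payload: dict) -> str:
    # Same wire format the event_stream generator emits:
    # "event: <name>\ndata: <json>\n\n" terminates one SSE event.
    return f"event: {event}\ndata: {json.dumps(payload, ensure_ascii=False)}\n\n"

frame = sse_frame("response.completed", {"type": "response.completed"})
print(frame)  # event: response.completed
              # data: {"type": "response.completed"}
```

Note that this gateway buffers the full upstream completion first and then replays it as SSE events, so clients get the streaming protocol shape without token-by-token latency benefits.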
 
requirements.txt CHANGED
@@ -1,6 +1,3 @@
- fastapi>=0.116.0,<1.0.0
+ fastapi>=0.116.0,<1.0.0
  uvicorn[standard]>=0.35.0,<1.0.0
- httpx>=0.28.1,<1.0.0
+ httpx>=0.28.1,<1.0.0
- apscheduler>=3.10.4,<4.0.0
- python-multipart>=0.0.20,<1.0.0
- itsdangerous>=2.2.0,<3.0.0

static/index.html CHANGED
@@ -4,7 +4,7 @@
    <meta charset="UTF-8" />
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1" />
-   <title>NVIDIA NIM 模型健康看板</title>
+   <title>NIM 模型健康与模型库</title>
    <link rel="preconnect" href="https://fonts.googleapis.com" />
    <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin />
    <link
@@ -13,52 +13,88 @@
    />
    <link rel="stylesheet" href="/static/style.css" />
  </head>
- <body class="public-body">
+ <body class="showcase-body">
    <div class="ambient ambient-left"></div>
    <div class="ambient ambient-right"></div>
+   <div class="ambient ambient-bottom"></div>
- <main class="public-shell">
-   <section class="hero-panel">
-     <div class="hero-copy">
-       <span class="hero-badge">NVIDIA NIM 网关</span>
-       <h1>模型健康度看板</h1>
-       <p>
-         公开页面只展示健康状态。系统按小时定时调用 NVIDIA NIM,
-         记录模型能否正常响应、最近一次时延,以及过去几个小时的稳定性走势。
-       </p>
-     </div>
-     <div class="hero-side">
-       <div class="hero-kicker">监控视图</div>
-       <div class="hero-value">小时级可用性</div>
-       <p>由后台巡检任务驱动,支持管理员扩展模型列表和更换巡检 Key。</p>
-     </div>
-   </section>
-
-   <section class="summary-panel">
-     <div class="section-heading">
-       <div>
-         <span class="section-tag">公开状态页</span>
-         <h2>最近 12 次巡检趋势</h2>
-       </div>
-       <div class="refresh-meta">
-         <span>最近刷新</span>
-         <strong id="last-updated">--</strong>
-       </div>
-     </div>
-     <div class="summary-strip" id="summary-chips"></div>
-   </section>
-
-   <section class="board-panel">
-     <div class="board-head">
-       <div>
-         <span class="section-tag">模型矩阵</span>
-         <h2>健康状态总览</h2>
-       </div>
-       <p class="board-note">绿色表示正常,橙色表示波动,红色表示异常,灰色表示尚未巡检。</p>
-     </div>
-     <div class="model-grid" id="model-grid"></div>
-     <p class="status-text error-text" id="error-text"></p>
-   </section>
+
+   <header class="topbar">
+     <div class="brand-mark">
+       <span class="brand-badge">NIM Responses Gateway</span>
+       <strong>模型健康与官方模型库</strong>
+     </div>
+     <nav class="page-nav">
+       <button class="page-nav-btn active" data-page="0">健康看板</button>
+       <button class="page-nav-btn" data-page="1">官方模型库</button>
+     </nav>
+   </header>
+
+   <main class="showcase-shell">
+     <div class="page-track" id="page-track">
+       <section class="page-section active" data-page="0">
+         <section class="hero-stage">
+           <div class="hero-copy">
+             <span class="hero-badge">每 10 分钟统计</span>
+             <h1>实时观察模型调用质量</h1>
+             <p>
+               本网关不保存任何用户 NIM Key。每次调用都由用户自带 NIM 密钥直连上游,
+               页面仅展示聚合后的调用次数、成功率和官方模型目录。
+             </p>
+             <div class="hero-actions">
+               <button class="primary-btn" data-jump="1">查看官方模型库</button>
+               <button class="ghost-btn" id="refresh-dashboard" type="button">刷新数据</button>
+             </div>
+           </div>
+           <div class="hero-metrics" id="overview-cards"></div>
+         </section>
+
+         <section class="panel-shell">
+           <div class="section-head">
+             <div>
+               <span class="section-tag">健康看板</span>
+               <h2>监控模型 10 分钟成功率</h2>
+             </div>
+             <div class="section-meta">
+               <span>最近更新</span>
+               <strong id="dashboard-updated">--</strong>
+             </div>
+           </div>
+           <div class="health-grid" id="health-grid"></div>
+           <p class="panel-hint" id="dashboard-empty"></p>
+         </section>
+       </section>
+
+       <section class="page-section" data-page="1">
+         <section class="hero-stage catalog-stage">
+           <div class="hero-copy narrow">
+             <span class="hero-badge">官方目录同步</span>
+             <h1>来自 NVIDIA 官方的模型列表</h1>
+             <p>
+               网关会定时从官方 `https://integrate.api.nvidia.com/v1/models` 拉取模型目录,
+               并按照模型 ID 对应的提供商进行分类展示。
+             </p>
+           </div>
+           <div class="catalog-summary" id="catalog-summary"></div>
+         </section>
+
+         <section class="panel-shell">
+           <div class="section-head">
+             <div>
+               <span class="section-tag">模型目录</span>
+               <h2>按提供商分类</h2>
+             </div>
+             <div class="section-meta">
+               <span>同步时间</span>
+               <strong id="catalog-updated">--</strong>
+             </div>
+           </div>
+           <div class="provider-grid" id="provider-grid"></div>
+           <p class="panel-hint" id="catalog-empty"></p>
+         </section>
+       </section>
+     </div>
  </main>
+
  <script src="/static/public.js" charset="utf-8" defer></script>
</body>
</html>
static/public.js CHANGED
@@ -1,20 +1,22 @@
- const summaryChips = document.getElementById("summary-chips");
- const modelGrid = document.getElementById("model-grid");
- const lastUpdated = document.getElementById("last-updated");
- const errorText = document.getElementById("error-text");
-
- const STATUS_LABELS = {
-   healthy: "正常",
-   degraded: "波动",
-   down: "异常",
-   unknown: "未巡检",
- };
-
- const STATUS_CLASS = {
-   healthy: "ok",
-   degraded: "warn",
-   down: "down",
-   unknown: "idle",
  };

  const dateTimeFormatter = new Intl.DateTimeFormat("zh-CN", {
@@ -24,6 +26,9 @@ const dateTimeFormatter = new Intl.DateTimeFormat("zh-CN", {
    minute: "2-digit",
  });

  function formatDateTime(value) {
    if (!value) return "--";
    const date = new Date(value);
@@ -31,103 +36,190 @@ function formatDateTime(value) {
    return dateTimeFormatter.format(date);
  }

- function formatHourSegment(segment) {
-   const span = document.createElement("span");
-   const date = new Date(segment.time || segment.hour);
-   span.textContent = Number.isNaN(date.getTime()) ? "--" : String(date.getHours()).padStart(2, "0");
-   span.className = `timeline-item ${STATUS_CLASS[segment.status] || "idle"}`;
-   span.title = `${STATUS_LABELS[segment.status] || "未巡检"} ${formatDateTime(segment.time || segment.hour)}`;
-   return span;
  }

- function createSummaryChip(label, value, tone = "default") {
-   const chip = document.createElement("div");
-   chip.className = `summary-chip ${tone}`;
-   chip.innerHTML = `<span>${label}</span><strong>${value}</strong>`;
-   return chip;
  }

- function renderSummary(models) {
-   summaryChips.innerHTML = "";
-   const total = models.length;
-   const healthy = models.filter((item) => item.status === "healthy").length;
-   const issues = models.filter((item) => item.status === "down").length;
-   const latest = models.reduce((max, item) => {
-     if (!item.last_healthcheck_at) return max;
-     return !max || new Date(item.last_healthcheck_at) > new Date(max) ? item.last_healthcheck_at : max;
-   }, null);
-
-   summaryChips.appendChild(createSummaryChip("监控模型", total));
-   summaryChips.appendChild(createSummaryChip("健康模型", healthy, "good"));
-   summaryChips.appendChild(createSummaryChip("异常模型", issues, issues > 0 ? "danger" : "default"));
-   summaryChips.appendChild(createSummaryChip("最近探测", latest ? formatDateTime(latest) : "暂无数据"));
- }

- function renderModel(model) {
    const card = document.createElement("article");
-   card.className = "model-card";
-   const status = model.status || "unknown";
-   const points = (model.hourly || []).slice(-12);
-   const successRate = typeof model.success_rate === "number" ? `${model.success_rate.toFixed(1)}%` : model.beat || "--";
-
-   card.innerHTML = `
-     <div class="card-top">
-       <div>
-         <div class="card-title">${model.display_name || model.name || model.model_id}</div>
-         <div class="model-subtitle">${model.model_id || "--"}</div>
-       </div>
-       <span class="status-chip ${status}">${STATUS_LABELS[status] || "未巡检"}</span>
-     </div>
-     <div class="metric-row">
-       <div class="metric-pill">
-         <span>成功率</span>
-         <strong>${successRate}</strong>
      </div>
-       <div class="metric-pill">
-         <span>最近探测</span>
-         <strong>${formatDateTime(model.last_healthcheck_at)}</strong>
      </div>
-     </div>
-   `;
-
-   const timeline = document.createElement("div");
-   timeline.className = "timeline";
-
-   if (points.length === 0) {
-     const empty = document.createElement("div");
-     empty.className = "empty-state";
-     empty.textContent = "暂无巡检记录";
-     timeline.appendChild(empty);
-   } else {
-     points.forEach((segment) => timeline.appendChild(formatHourSegment(segment)));
    }

-   card.appendChild(timeline);
-   return card;
  }

- async function loadHealth() {
    try {
-     errorText.textContent = "";
-     const response = await fetch("/api/health/public", { headers: { Accept: "application/json" } });
113
- if (!response.ok) {
114
- throw new Error("健康接口暂时不可用");
115
- }
116
- const payload = await response.json();
117
- const models = payload.models || [];
118
-
119
- renderSummary(models);
120
- modelGrid.innerHTML = "";
121
- models.forEach((model) => modelGrid.appendChild(renderModel(model)));
122
-
123
- lastUpdated.textContent = payload.last_updated ? formatDateTime(payload.last_updated) : formatDateTime(new Date().toISOString());
124
- } catch (_error) {
125
- errorText.textContent = "当前无法获取 NVIDIA NIM 的巡检结果,请检查后台配置或稍后再试。";
126
- lastUpdated.textContent = "--";
127
  }
128
  }
129
 
 
 
130
  window.addEventListener("DOMContentLoaded", () => {
131
- loadHealth();
132
- setInterval(loadHealth, 60 * 1000);
 
 
133
  });
 
1
+ const pageTrack = document.getElementById("page-track");
2
+ const navButtons = document.querySelectorAll(".page-nav-btn");
3
+ const jumpButtons = document.querySelectorAll("[data-jump]");
4
+ const overviewCards = document.getElementById("overview-cards");
5
+ const dashboardUpdated = document.getElementById("dashboard-updated");
6
+ const healthGrid = document.getElementById("health-grid");
7
+ const dashboardEmpty = document.getElementById("dashboard-empty");
8
+ const catalogSummary = document.getElementById("catalog-summary");
9
+ const catalogUpdated = document.getElementById("catalog-updated");
10
+ const providerGrid = document.getElementById("provider-grid");
11
+ const catalogEmpty = document.getElementById("catalog-empty");
12
+ const refreshDashboardBtn = document.getElementById("refresh-dashboard");
13
+
14
+ const STATUS_CLASSES = {
15
+ green: "is-green",
16
+ yellow: "is-yellow",
17
+ orange: "is-orange",
18
+ red: "is-red",
19
+ idle: "is-idle",
20
  };
21
 
22
  const dateTimeFormatter = new Intl.DateTimeFormat("zh-CN", {
 
26
  minute: "2-digit",
27
  });
28
 
29
+ let currentPage = 0;
30
+ let wheelLocked = false;
31
+
32
  function formatDateTime(value) {
33
  if (!value) return "--";
34
  const date = new Date(value);
 
36
  return dateTimeFormatter.format(date);
37
  }
38
 
39
+ function rateMeta(rate) {
40
+ if (rate === null || rate === undefined) return { label: "暂无数据", tone: "idle" };
41
+ if (rate >= 95) return { label: "优秀", tone: "green" };
42
+ if (rate >= 80) return { label: "良好", tone: "yellow" };
43
+ if (rate >= 50) return { label: "告警", tone: "orange" };
44
+ return { label: "异常", tone: "red" };
 
45
  }
46
 
47
+ function switchPage(nextPage) {
48
+ currentPage = Math.max(0, Math.min(1, nextPage));
49
+ pageTrack.style.transform = `translate3d(${-50 * currentPage}%, 0, 0)`;
50
+ navButtons.forEach((button) => button.classList.toggle("active", Number(button.dataset.page) === currentPage));
51
+ document.querySelectorAll(".page-section").forEach((section) => {
52
+ section.classList.toggle("active", Number(section.dataset.page) === currentPage);
53
+ });
54
  }
55
 
56
+ navButtons.forEach((button) => button.addEventListener("click", () => switchPage(Number(button.dataset.page))));
57
+ jumpButtons.forEach((button) => button.addEventListener("click", () => switchPage(Number(button.dataset.jump))));
58
+
59
+ window.addEventListener(
60
+ "wheel",
61
+ (event) => {
62
+ if (wheelLocked) return;
63
+ if (Math.abs(event.deltaX) > Math.abs(event.deltaY) || Math.abs(event.deltaY) < 20) return;
64
+ wheelLocked = true;
65
+ switchPage(currentPage + (event.deltaY > 0 ? 1 : -1));
66
+ window.setTimeout(() => {
67
+ wheelLocked = false;
68
+ }, 650);
69
+ },
70
+ { passive: true }
71
+ );
72
+
73
+ window.addEventListener("keydown", (event) => {
74
+ if (event.key === "ArrowRight") switchPage(1);
75
+ if (event.key === "ArrowLeft") switchPage(0);
76
+ });
77
 
78
+ function createOverviewCard(label, value, detail = "") {
79
  const card = document.createElement("article");
80
+ card.className = "overview-card";
81
+ card.innerHTML = `<span>${label}</span><strong>${value}</strong><p>${detail}</p>`;
82
+ return card;
83
+ }
84
+
85
+ function renderOverview(data) {
86
+ overviewCards.innerHTML = "";
87
+ const averageHealth = data.average_health === null || data.average_health === undefined ? "--" : `${data.average_health.toFixed(1)}%`;
88
+ const activeModels = (data.models || []).filter((model) => model.latest_success_rate !== null && model.latest_success_rate !== undefined).length;
89
+ const averageLatencyModels = (data.models || [])
90
+ .flatMap((model) => model.points || [])
91
+ .filter((point) => point.total_count > 0 && point.success_rate !== null && point.success_rate !== undefined);
92
+
93
+ overviewCards.appendChild(createOverviewCard("总调用次数", data.total_requests ?? 0, "来自网关历史累计转发记录"));
94
+ overviewCards.appendChild(createOverviewCard("平均健康度", averageHealth, "按监控模型最近 10 分钟成功率平均值计算"));
95
+ overviewCards.appendChild(createOverviewCard("活跃模型数", activeModels, "最近统计窗口内有调用记录的模型数量"));
96
+ overviewCards.appendChild(createOverviewCard("统计粒度", `${data.bucket_minutes} 分钟`, `已展示最近 ${data.models?.[0]?.points?.length || 0} 个时间片`));
97
+ dashboardUpdated.textContent = formatDateTime(data.generated_at);
98
+ }
99
+
100
+ function renderHealthCards(models) {
101
+ healthGrid.innerHTML = "";
102
+ if (!models || models.length === 0) {
103
+ dashboardEmpty.textContent = "当前未配置 MODEL_LIST,或尚无可展示的统计结果。";
104
+ return;
105
+ }
106
+ dashboardEmpty.textContent = "";
107
+
108
+ models.forEach((model) => {
109
+ const latestMeta = rateMeta(model.latest_success_rate);
110
+ const latestRate = model.latest_success_rate === null || model.latest_success_rate === undefined ? "--" : `${model.latest_success_rate.toFixed(1)}%`;
111
+ const averageRate = model.average_success_rate === null || model.average_success_rate === undefined ? "--" : `${model.average_success_rate.toFixed(1)}%`;
112
+
113
+ const card = document.createElement("article");
114
+ card.className = `health-card ${STATUS_CLASSES[latestMeta.tone]}`;
115
+ card.innerHTML = `
116
+ <div class="card-head">
117
+ <div>
118
+ <h3>${model.model_id}</h3>
119
+ <p>${model.provider}</p>
120
+ </div>
121
+ <span class="status-pill ${STATUS_CLASSES[latestMeta.tone]}">${latestMeta.label}</span>
122
  </div>
123
+ <div class="scoreboard">
124
+ <div class="score-item">
125
+ <span>最近 10 分钟</span>
126
+ <strong>${latestRate}</strong>
127
+ </div>
128
+ <div class="score-item">
129
+ <span>近 1 小时均值</span>
130
+ <strong>${averageRate}</strong>
131
+ </div>
132
+ <div class="score-item">
133
+ <span>累计调用次数</span>
134
+ <strong>${model.total_calls ?? 0}</strong>
135
+ </div>
136
  </div>
137
+ `;
138
+
139
+ const timeline = document.createElement("div");
140
+ timeline.className = "timeline-strip";
141
+ (model.points || []).forEach((point) => {
142
+ const meta = rateMeta(point.success_rate);
143
+ const item = document.createElement("div");
144
+ item.className = `timeline-box ${STATUS_CLASSES[meta.tone]}`;
145
+ item.innerHTML = `<span>${point.label}</span><strong>${point.success_rate === null || point.success_rate === undefined ? "--" : `${point.success_rate.toFixed(0)}%`}</strong>`;
146
+ item.title = `${point.label} 成功 ${point.success_count}/${point.total_count}`;
147
+ timeline.appendChild(item);
148
+ });
149
+ card.appendChild(timeline);
150
+ healthGrid.appendChild(card);
151
+ });
152
+ }
153
+
154
+ function renderCatalogSummary(data) {
155
+ catalogSummary.innerHTML = "";
156
+ catalogSummary.appendChild(createOverviewCard("官方模型总数", data.total_models ?? 0, "来自 NVIDIA 官方模型目录"));
157
+ catalogSummary.appendChild(createOverviewCard("提供商��量", data.providers?.length ?? 0, "按模型 ID 前缀自动归类"));
158
+ }
159
+
160
+ function renderCatalogProviders(providers) {
161
+ providerGrid.innerHTML = "";
162
+ if (!providers || providers.length === 0) {
163
+ catalogEmpty.textContent = "暂时还没有拉取到官方模型目录,请稍后刷新。";
164
+ return;
165
  }
166
+ catalogEmpty.textContent = "";
167
+
168
+ providers.forEach((group) => {
169
+ const card = document.createElement("article");
170
+ card.className = "provider-card";
171
+ card.innerHTML = `
172
+ <div class="provider-head">
173
+ <div>
174
+ <h3>${group.provider}</h3>
175
+ <p>${group.count} 个模型</p>
176
+ </div>
177
+ </div>
178
+ `;
179
+ const chipWrap = document.createElement("div");
180
+ chipWrap.className = "model-chip-wrap";
181
+ (group.models || []).forEach((model) => {
182
+ const chip = document.createElement("span");
183
+ chip.className = "model-chip";
184
+ chip.textContent = model.id;
185
+ chipWrap.appendChild(chip);
186
+ });
187
+ card.appendChild(chipWrap);
188
+ providerGrid.appendChild(card);
189
+ });
190
+ }
191
 
192
+ async function loadDashboard() {
193
+ const response = await fetch("/api/dashboard", { headers: { Accept: "application/json" } });
194
+ if (!response.ok) throw new Error("仪表盘数据加载失败");
195
+ const payload = await response.json();
196
+ renderOverview(payload);
197
+ renderHealthCards(payload.models || []);
198
  }
199
 
200
+ async function loadCatalog() {
201
+ const response = await fetch("/api/catalog", { headers: { Accept: "application/json" } });
202
+ if (!response.ok) throw new Error("模型目录加载失败");
203
+ const payload = await response.json();
204
+ catalogUpdated.textContent = formatDateTime(payload.synced_at || payload.generated_at);
205
+ renderCatalogSummary(payload);
206
+ renderCatalogProviders(payload.providers || []);
207
+ }
208
+
209
+ async function refreshAll() {
210
  try {
211
+ await Promise.all([loadDashboard(), loadCatalog()]);
212
+ } catch (error) {
213
+ dashboardEmpty.textContent = error.message;
214
+ catalogEmpty.textContent = error.message;
 
 
 
 
 
 
 
 
 
 
 
 
215
  }
216
  }
217
 
218
+ refreshDashboardBtn?.addEventListener("click", refreshAll);
219
+
220
  window.addEventListener("DOMContentLoaded", () => {
221
+ switchPage(0);
222
+ refreshAll();
223
+ window.setInterval(loadDashboard, 60 * 1000);
224
+ window.setInterval(loadCatalog, 5 * 60 * 1000);
225
  });
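As a reading aid for the diff above: the new `loadDashboard` path expects `/api/dashboard` to return JSON with the fields that `renderOverview()` and `renderHealthCards()` read. The sketch below reconstructs that shape from those reads only; the concrete values, and the standalone `tone()` helper (same thresholds as the `rateMeta()` added in this commit), are illustrative assumptions, not part of the commit.

```javascript
// Hypothetical /api/dashboard payload, reconstructed from the fields that
// renderOverview() and renderHealthCards() access. Values are made up.
const samplePayload = {
  generated_at: "2025-01-01T08:00:00Z", // shown via formatDateTime()
  total_requests: 1280,                 // 总调用次数 card
  average_health: 97.4,                 // 平均健康度 card (percent)
  bucket_minutes: 10,                   // 统计粒度 card
  models: [
    {
      model_id: "z-ai/glm5",
      provider: "z-ai",
      latest_success_rate: 98.2,        // drives the status pill tone
      average_success_rate: 96.7,
      total_calls: 412,
      points: [
        { label: "08:00", success_rate: 100, success_count: 12, total_count: 12 },
        { label: "08:10", success_rate: null, success_count: 0, total_count: 0 },
      ],
    },
  ],
};

// Same thresholds as rateMeta() in the diff; returns only the tone key.
function tone(rate) {
  if (rate === null || rate === undefined) return "idle";
  if (rate >= 95) return "green";
  if (rate >= 80) return "yellow";
  if (rate >= 50) return "orange";
  return "red";
}

console.log(tone(samplePayload.models[0].latest_success_rate)); // "green"
```

A `success_rate` of `null` in a bucket renders as `--` with the idle tone, which is why empty buckets carry `total_count: 0` rather than a zero rate.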
static/style.css CHANGED
@@ -1,20 +1,21 @@
 :root {
   --bg: #050816;
   --bg-deep: #02040b;
-  --panel: rgba(9, 15, 29, 0.82);
-  --panel-strong: rgba(12, 20, 38, 0.94);
-  --panel-soft: rgba(255, 255, 255, 0.04);
-  --text: #f4f7ff;
-  --muted: #94a7c7;
-  --muted-strong: #b4c6e4;
-  --line: rgba(255, 255, 255, 0.1);
+  --panel: rgba(10, 15, 28, 0.72);
+  --panel-strong: rgba(14, 20, 36, 0.92);
+  --panel-soft: rgba(255, 255, 255, 0.05);
+  --line: rgba(255, 255, 255, 0.09);
   --line-strong: rgba(255, 255, 255, 0.16);
-  --green: #35f0a1;
-  --green-strong: #6bffd0;
-  --orange: #ffb44f;
-  --red: #ff6e83;
-  --shadow: 0 18px 48px rgba(0, 0, 0, 0.28);
-  --glow: 0 0 0 1px rgba(107, 255, 208, 0.16), 0 22px 54px rgba(30, 255, 179, 0.12);
+  --text: #f4f7ff;
+  --muted: #91a3c4;
+  --muted-strong: #c7d4ea;
+  --green: #38f3a5;
+  --yellow: #ffd24d;
+  --orange: #ff9b43;
+  --red: #ff637f;
+  --idle: #5e6d88;
+  --shadow: 0 24px 60px rgba(0, 0, 0, 0.28);
+  --glow: 0 0 0 1px rgba(56, 243, 165, 0.12), 0 24px 70px rgba(56, 243, 165, 0.12);
   --font-sans: "Noto Sans SC", "PingFang SC", "Microsoft YaHei", sans-serif;
   --font-display: "Space Grotesk", "Noto Sans SC", sans-serif;
   color-scheme: dark;
@@ -31,12 +32,12 @@ body {
 
 body {
   margin: 0;
-  background:
-    radial-gradient(circle at 8% 12%, rgba(53, 240, 161, 0.16), transparent 28%),
-    radial-gradient(circle at 88% 18%, rgba(67, 138, 255, 0.18), transparent 26%),
-    linear-gradient(180deg, #08101d 0%, var(--bg) 38%, var(--bg-deep) 100%);
-  color: var(--text);
   font-family: var(--font-sans);
+  color: var(--text);
+  background:
+    radial-gradient(circle at 12% 16%, rgba(56, 243, 165, 0.18), transparent 24%),
+    radial-gradient(circle at 84% 18%, rgba(79, 125, 255, 0.18), transparent 26%),
+    linear-gradient(180deg, #08111f 0%, var(--bg) 42%, var(--bg-deep) 100%);
   overflow-x: hidden;
 }
 
@@ -45,631 +46,470 @@ body::before {
   position: fixed;
   inset: 0;
   background-image:
-    linear-gradient(rgba(255, 255, 255, 0.03) 1px, transparent 1px),
-    linear-gradient(90deg, rgba(255, 255, 255, 0.03) 1px, transparent 1px);
-  background-size: 34px 34px;
-  opacity: 0.12;
+    linear-gradient(rgba(255, 255, 255, 0.035) 1px, transparent 1px),
+    linear-gradient(90deg, rgba(255, 255, 255, 0.035) 1px, transparent 1px);
+  background-size: 36px 36px;
+  opacity: 0.08;
   pointer-events: none;
 }
 
 .ambient {
   position: fixed;
-  width: 32rem;
-  height: 32rem;
   border-radius: 999px;
-  filter: blur(90px);
-  opacity: 0.38;
+  filter: blur(100px);
   pointer-events: none;
+  opacity: 0.36;
 }
 
 .ambient-left {
-  top: -10rem;
+  width: 26rem;
+  height: 26rem;
+  top: -8rem;
   left: -8rem;
-  background: rgba(53, 240, 161, 0.28);
+  background: rgba(56, 243, 165, 0.36);
 }
 
 .ambient-right {
+  width: 32rem;
+  height: 32rem;
   top: 8rem;
   right: -10rem;
-  background: rgba(91, 113, 255, 0.22);
+  background: rgba(110, 125, 255, 0.28);
 }
 
-.public-shell,
-.admin-shell {
-  position: relative;
-  z-index: 1;
+.ambient-bottom {
+  width: 28rem;
+  height: 28rem;
+  bottom: -10rem;
+  right: 18%;
+  background: rgba(255, 171, 72, 0.18);
 }
 
-.public-shell {
-  width: min(1280px, calc(100vw - 32px));
+.topbar {
+  position: sticky;
+  top: 0;
+  z-index: 30;
+  width: min(1360px, calc(100vw - 32px));
   margin: 0 auto;
-  padding: 32px 0 54px;
-}
-
-.hero-panel,
-.summary-panel,
-.board-panel,
-.glass-panel,
-.metric-card,
-.health-record,
-.empty-card {
-  position: relative;
-  border: 1px solid var(--line);
-  background: linear-gradient(180deg, rgba(15, 25, 48, 0.82), rgba(7, 12, 24, 0.9));
-  border-radius: 28px;
-  box-shadow: var(--shadow);
-  backdrop-filter: blur(18px);
-}
-
-.hero-panel,
-.summary-panel,
-.board-panel,
-.glass-panel {
-  overflow: hidden;
-}
-
-.hero-panel::after,
-.summary-panel::after,
-.board-panel::after,
-.glass-panel::after {
-  content: "";
-  position: absolute;
-  inset: 0;
-  background: linear-gradient(135deg, rgba(107, 255, 208, 0.08), transparent 34%, transparent 66%, rgba(123, 157, 255, 0.08));
-  pointer-events: none;
-}
-
-.hero-panel {
-  display: grid;
-  grid-template-columns: minmax(0, 1.5fr) minmax(280px, 0.9fr);
-  gap: 24px;
-  padding: 34px;
+  padding: 22px 0 12px;
+  display: flex;
+  justify-content: space-between;
+  align-items: center;
+  gap: 18px;
+  backdrop-filter: blur(10px);
 }
 
+.brand-mark {
+  display: flex;
+  flex-direction: column;
+  gap: 6px;
+}
+
+.brand-mark strong,
 .hero-copy h1,
-.hero-side .hero-value,
-.section-heading h2,
-.board-head h2,
-.panel-headline h2,
-.login-card h2,
-.brand-block h1 {
+.section-head h2,
+.provider-head h3,
+.health-card h3 {
   font-family: var(--font-display);
 }
 
+.brand-badge,
 .hero-badge,
 .section-tag {
   display: inline-flex;
   align-items: center;
-  gap: 8px;
+  width: fit-content;
   padding: 8px 14px;
   border-radius: 999px;
-  border: 1px solid rgba(107, 255, 208, 0.28);
-  background: rgba(53, 240, 161, 0.1);
-  color: var(--green-strong);
+  border: 1px solid rgba(56, 243, 165, 0.26);
+  background: rgba(56, 243, 165, 0.1);
+  color: #8affd0;
   font-size: 13px;
   font-weight: 700;
   letter-spacing: 0.08em;
 }
 
-.hero-copy h1,
-.board-head h2,
-.panel-headline h2,
-.section-heading h2,
-.login-card h2,
-.brand-block h1 {
-  margin: 16px 0 12px;
-  font-size: clamp(34px, 3vw, 52px);
-  line-height: 1.08;
-}
-
-.hero-copy p,
-.hero-side p,
-.board-note,
-.status-text,
-.metric-card p,
-.brand-block p {
-  color: var(--muted);
-  line-height: 1.7;
+.page-nav {
+  display: inline-flex;
+  gap: 10px;
+  padding: 8px;
+  border-radius: 999px;
+  background: rgba(255, 255, 255, 0.05);
+  border: 1px solid var(--line);
+  box-shadow: var(--shadow);
 }
 
-.hero-side {
-  padding: 24px;
-  border-radius: 24px;
-  background: rgba(255, 255, 255, 0.04);
-  border: 1px solid var(--line-strong);
-  box-shadow: inset 0 1px 0 rgba(255, 255, 255, 0.04);
+.page-nav-btn,
+.primary-btn,
+.ghost-btn {
+  border: none;
+  cursor: pointer;
+  font: inherit;
+  transition: transform 0.22s ease, opacity 0.22s ease, background 0.22s ease, border-color 0.22s ease;
 }
 
-.hero-kicker {
+.page-nav-btn {
+  padding: 10px 16px;
+  border-radius: 999px;
+  background: transparent;
   color: var(--muted-strong);
-  letter-spacing: 0.16em;
-  text-transform: uppercase;
-  font-size: 12px;
 }
 
-.hero-value {
-  margin-top: 14px;
-  font-size: 32px;
+.page-nav-btn.active {
+  color: #04110d;
+  background: linear-gradient(135deg, var(--green), #7affd8);
   font-weight: 700;
 }
 
-.summary-panel,
-.board-panel {
-  padding: 28px 30px;
-  margin-top: 22px;
 }
 
-.section-heading,
-.board-head,
-.panel-headline,
-.toolbar-row {
   display: flex;
-  justify-content: space-between;
-  align-items: flex-start;
-  gap: 18px;
-  flex-wrap: wrap;
 }
 
-.section-heading h2,
-.board-head h2,
-.panel-headline h2,
-.panel-headline h3 {
-  margin-bottom: 0;
-  font-size: clamp(24px, 2vw, 34px);
 }
 
-.refresh-meta {
-  min-width: 220px;
-  padding: 16px 18px;
-  border-radius: 20px;
-  background: rgba(255, 255, 255, 0.04);
   border: 1px solid var(--line);
 }
 
-.refresh-meta span {
-  display: block;
   color: var(--muted);
-  font-size: 13px;
-  margin-bottom: 8px;
 }
 
-.refresh-meta strong {
-  font-family: var(--font-display);
-  font-size: 20px;
 }
 
-.summary-strip {
-  margin-top: 22px;
   display: grid;
-  grid-template-columns: repeat(auto-fit, minmax(180px, 1fr));
   gap: 14px;
 }
 
-.summary-chip,
-.metric-card {
   padding: 18px 20px;
-  border-radius: 22px;
-  background: linear-gradient(180deg, rgba(255, 255, 255, 0.06), rgba(255, 255, 255, 0.03));
-  border: 1px solid rgba(255, 255, 255, 0.08);
 }
 
-.summary-chip span,
-.metric-card h3 {
-  display: block;
   color: var(--muted);
   font-size: 13px;
-  margin-bottom: 10px;
 }
 
-.summary-chip strong,
-.metric-card strong {
   font-family: var(--font-display);
-  font-size: 28px;
-  line-height: 1.2;
 }
 
-.summary-chip.good {
-  box-shadow: var(--glow);
 }
 
-.summary-chip.danger {
-  border-color: rgba(255, 110, 131, 0.32);
-  background: linear-gradient(180deg, rgba(255, 110, 131, 0.16), rgba(255, 255, 255, 0.03));
 }
 
-.board-head {
-  margin-bottom: 18px;
 }
 
-.model-grid,
-.section-grid {
   display: grid;
   grid-template-columns: repeat(auto-fit, minmax(280px, 1fr));
   gap: 18px;
 }
 
-.model-card {
   padding: 22px;
-  border-radius: 24px;
-  background: linear-gradient(180deg, rgba(255, 255, 255, 0.06), rgba(255, 255, 255, 0.025));
-  border: 1px solid rgba(255, 255, 255, 0.08);
-  transition: transform 0.22s ease, border-color 0.22s ease, box-shadow 0.22s ease;
 }
 
-.model-card:hover,
-.health-record:hover {
-  transform: translateY(-3px);
-  border-color: rgba(107, 255, 208, 0.28);
   box-shadow: var(--glow);
 }
 
-.card-top,
-.metric-row,
-.record-meta,
-.health-meta,
-.inline-actions {
   display: flex;
-  gap: 10px;
-  flex-wrap: wrap;
-}
-
-.card-top {
   justify-content: space-between;
   align-items: flex-start;
 }
 
-.card-title {
   font-size: 22px;
-  font-weight: 700;
-  line-height: 1.3;
 }
 
-.model-subtitle,
-.mono {
-  font-family: var(--font-display);
-  color: var(--muted);
-  font-size: 13px;
 }
 
-.status-chip,
-.pill {
-  display: inline-flex;
-  align-items: center;
-  justify-content: center;
   padding: 8px 12px;
-  min-height: 34px;
   border-radius: 999px;
   font-size: 12px;
   font-weight: 700;
-  letter-spacing: 0.05em;
   border: 1px solid transparent;
 }
 
-.status-chip.healthy,
-.pill.healthy,
-.pill.good {
-  color: var(--green-strong);
-  background: rgba(53, 240, 161, 0.12);
-  border-color: rgba(107, 255, 208, 0.28);
-}
-
-.status-chip.down,
-.pill.down,
-.danger-btn {
-  color: #ffb5c0;
-  background: rgba(255, 110, 131, 0.12);
-  border-color: rgba(255, 110, 131, 0.28);
-}
-
-.status-chip.degraded,
-.status-chip.warn,
-.pill.degraded,
-.pill.warn {
-  color: #ffd18d;
-  background: rgba(255, 180, 79, 0.12);
-  border-color: rgba(255, 180, 79, 0.28);
-}
-
-.status-chip.unknown,
-.pill.unknown,
-.pill.idle {
-  color: #c7d4ea;
-  background: rgba(255, 255, 255, 0.06);
-  border-color: rgba(255, 255, 255, 0.1);
-}
-
-.metric-row {
   margin-top: 18px;
 }
 
-.metric-pill {
-  flex: 1 1 140px;
   padding: 14px 16px;
   border-radius: 18px;
-  background: rgba(255, 255, 255, 0.04);
-  border: 1px solid rgba(255, 255, 255, 0.06);
 }
 
-.metric-pill span {
   display: block;
   color: var(--muted);
   font-size: 12px;
-  margin-bottom: 8px;
 }
 
-.metric-pill strong {
   font-family: var(--font-display);
-  font-size: 17px;
 }
 
-.timeline {
-  display: flex;
   gap: 10px;
-  flex-wrap: wrap;
   margin-top: 18px;
 }
 
-.timeline-item {
-  width: 40px;
-  height: 40px;
-  border-radius: 14px;
-  display: inline-flex;
-  align-items: center;
-  justify-content: center;
-  font-family: var(--font-display);
-  font-size: 13px;
-  font-weight: 700;
-  background: rgba(255, 255, 255, 0.04);
   border: 1px solid rgba(255, 255, 255, 0.08);
 }
 
-.timeline-item.ok {
-  background: linear-gradient(135deg, rgba(53, 240, 161, 0.92), rgba(107, 255, 208, 0.92));
-  color: #04110d;
-  border-color: transparent;
-}
-
-.timeline-item.warn {
-  background: linear-gradient(135deg, rgba(255, 180, 79, 0.9), rgba(255, 218, 131, 0.85));
-  color: #1b1203;
-  border-color: transparent;
-}
-
-.timeline-item.down {
-  background: linear-gradient(135deg, rgba(255, 110, 131, 0.95), rgba(255, 171, 128, 0.84));
-  color: #19080d;
-  border-color: transparent;
-}
-
-.timeline-item.idle {
   color: var(--muted);
 }
 
-.empty-state,
-.empty-card {
-  width: 100%;
-  padding: 18px;
-  border-radius: 18px;
-  color: var(--muted);
-  background: rgba(255, 255, 255, 0.04);
-  border: 1px dashed rgba(255, 255, 255, 0.12);
 }
 
-.error-text {
   margin-top: 18px;
-  color: #ffb2bf;
-}
-
-button,
-.secondary-btn {
-  border: none;
-  cursor: pointer;
-  font: inherit;
-  transition: transform 0.2s ease, opacity 0.2s ease, border-color 0.2s ease;
 }
 
-button:hover,
-.secondary-btn:hover {
-  transform: translateY(-1px);
-}
-
-button {
-  padding: 12px 18px;
-  border-radius: 16px;
-  background: linear-gradient(135deg, #2be89a, #64ffd6);
-  color: #03110d;
-  font-weight: 800;
-  box-shadow: 0 14px 30px rgba(53, 240, 161, 0.18);
-}
-
-.secondary-btn {
   padding: 10px 14px;
-  border-radius: 14px;
   background: rgba(255, 255, 255, 0.05);
-  border: 1px solid rgba(255, 255, 255, 0.14);
-  color: var(--text);
-}
-
-.admin-shell {
-  display: grid;
-  grid-template-columns: 320px minmax(0, 1fr);
-  min-height: 100vh;
-}
-
-.admin-sidebar {
-  padding: 28px 22px;
-  border-right: 1px solid rgba(255, 255, 255, 0.08);
-  background: rgba(4, 8, 17, 0.88);
-  backdrop-filter: blur(16px);
-}
-
-.brand-block {
-  padding: 22px;
-  border-radius: 24px;
-  background: linear-gradient(180deg, rgba(255, 255, 255, 0.07), rgba(255, 255, 255, 0.03));
-  border: 1px solid rgba(255, 255, 255, 0.08);
-  margin-bottom: 24px;
-}
-
-.brand-block h1 {
-  font-size: 32px;
-}
-
-.admin-sidebar h3 {
-  margin: 0 0 12px;
-  color: var(--muted);
-  font-size: 13px;
-  letter-spacing: 0.18em;
-}
-
-.admin-sidebar .sidebar-btn {
-  width: 100%;
-  margin-bottom: 10px;
-  padding: 14px 16px;
-  text-align: left;
-  color: var(--text);
-  background: rgba(255, 255, 255, 0.04);
   border: 1px solid rgba(255, 255, 255, 0.08);
-  box-shadow: none;
-}
-
-.admin-sidebar .sidebar-btn.active {
-  color: var(--green-strong);
-  background: rgba(53, 240, 161, 0.1);
-  border-color: rgba(107, 255, 208, 0.22);
-  box-shadow: var(--glow);
-}
-
-.admin-content {
-  padding: 30px;
-}
-
-.glass-panel {
-  padding: 28px;
-  margin-bottom: 22px;
-}
-
-.sub-panel {
-  margin-top: 20px;
-}
-
-.panel-headline.compact h3 {
-  margin-top: 8px;
-}
-
-.table {
-  width: 100%;
-  border-collapse: collapse;
-}
-
-.table thead th {
-  padding: 14px 12px;
-  text-align: left;
-  color: var(--muted);
   font-size: 13px;
-  border-bottom: 1px solid rgba(255, 255, 255, 0.12);
-}
-
-.table tbody td {
-  padding: 16px 12px;
-  border-bottom: 1px solid rgba(255, 255, 255, 0.06);
-  vertical-align: top;
 }
 
-.form-grid {
-  display: grid;
-  grid-template-columns: repeat(2, minmax(0, 1fr));
-  gap: 14px;
-}
-
-.compact-grid {
-  grid-template-columns: repeat(auto-fit, minmax(200px, 1fr));
-}
-
-.form-grid input,
-.form-grid textarea,
-.login-card input {
-  width: 100%;
-  padding: 14px 16px;
-  border-radius: 16px;
-  border: 1px solid rgba(255, 255, 255, 0.1);
-  background: rgba(255, 255, 255, 0.045);
-  color: var(--text);
-  font: inherit;
-  outline: none;
-}
-
-.form-grid input:focus,
-.form-grid textarea:focus,
-.login-card input:focus {
-  border-color: rgba(107, 255, 208, 0.32);
-  box-shadow: 0 0 0 4px rgba(53, 240, 161, 0.08);
 }
 
-.form-grid textarea {
-  min-height: 120px;
-  resize: vertical;
-  grid-column: 1 / -1;
 }
 
-.checkbox-row {
-  display: flex;
-  align-items: center;
-  gap: 12px;
-  color: var(--text);
 }
 
-.checkbox-row input {
-  width: 18px;
-  height: 18px;
 }
 
-.spaced-top {
-  margin-top: 18px;
 }
 
-.health-record,
-.empty-card {
-  padding: 20px;
-  border-radius: 22px;
 }
 
-.health-record h4,
-.login-card h2 {
-  margin: 0;
 }
 
-.record-meta {
-  margin-top: 16px;
-  color: var(--muted);
-  font-size: 13px;
 }
 
-.login-overlay {
-  position: fixed;
-  inset: 0;
-  display: flex;
-  align-items: center;
-  justify-content: center;
-  background: rgba(3, 6, 14, 0.76);
-  backdrop-filter: blur(12px);
-  z-index: 20;
 }
 
-.login-card {
-  width: min(460px, calc(100vw - 32px));
-  padding: 30px;
-  border-radius: 28px;
-  background: linear-gradient(180deg, rgba(13, 22, 41, 0.96), rgba(8, 12, 24, 0.96));
-  border: 1px solid rgba(255, 255, 255, 0.12);
-  box-shadow: var(--shadow);
-}
-
-.login-card label {
-  display: block;
-  margin: 16px 0 10px;
-  color: var(--muted-strong);
-  font-size: 13px;
 }
 
 .hidden {
@@ -677,99 +517,53 @@ button {
 }
 
 @media (max-width: 980px) {
-  .hero-panel {
-    grid-template-columns: 1fr;
 }
 
-  .admin-shell {
   grid-template-columns: 1fr;
 }
 
-  .admin-sidebar {
-    position: sticky;
-    top: 0;
-    z-index: 5;
 }
 }
 
 @media (max-width: 720px) {
-  .public-shell {
-    width: min(100vw - 20px, 1280px);
-    padding-top: 20px;
-  }
-
-  .hero-panel,
-  .summary-panel,
-  .board-panel,
-  .glass-panel,
-  .admin-content {
-    padding: 20px;
 }
 
-  .form-grid {
-    grid-template-columns: 1fr;
 }
 
-  .admin-sidebar {
-    padding: 18px;
 }
 
-  .summary-strip,
-  .model-grid,
-  .section-grid {
-    grid-template-columns: 1fr;
 }
-}
-.panel-actions {
-  align-items: center;
-}
-
-.settings-grid {
-  align-items: start;
-}
-
-.field-span-full {
-  grid-column: 1 / -1;
-}
-
-.settings-grid .checkbox-row {
-  min-height: 58px;
-  padding: 14px 16px;
-  border-radius: 16px;
-  border: 1px solid rgba(255, 255, 255, 0.1);
-  background: rgba(255, 255, 255, 0.045);
-}
-
-.settings-grid .checkbox-row input {
-  width: 18px;
-  min-width: 18px;
-  height: 18px;
-  padding: 0;
-  margin: 0;
-  border-radius: 6px;
-  background: rgba(255, 255, 255, 0.02);
-  box-shadow: none;
-}
-
-.settings-actions {
-  justify-content: flex-start;
-  align-items: center;
-}
 
-.form-grid > button {
-  min-height: 54px;
-  justify-self: start;
-}
-
-@media (max-width: 720px) {
-  .settings-actions {
-    flex-direction: column;
-    align-items: stretch;
 }
 
-  .settings-actions button,
-  .panel-actions button,
-  .form-grid > button {
-    width: 100%;
 }
 }
161
  font-weight: 700;
162
  }
163
 
164
+ .showcase-shell {
165
+ position: relative;
166
+ width: min(1360px, calc(100vw - 32px));
167
+ margin: 0 auto 42px;
168
+ overflow: hidden;
169
  }
170
 
171
+ .page-track {
 
 
 
172
  display: flex;
173
+ width: 200%;
174
+ transform: translate3d(0, 0, 0);
175
+ transition: transform 0.82s cubic-bezier(0.22, 1, 0.36, 1);
 
176
  }
177
 
178
+ .page-section {
179
+ width: 50%;
180
+ padding-right: 18px;
181
+ opacity: 0.74;
182
+ transform: scale(0.985);
183
+ transition: opacity 0.55s ease, transform 0.55s ease;
184
  }
185
 
186
+ .page-section.active {
187
+ opacity: 1;
188
+ transform: scale(1);
189
+ }
190
+
191
+ .hero-stage,
192
+ .panel-shell,
193
+ .overview-card,
194
+ .health-card,
195
+ .provider-card {
196
+ position: relative;
197
+ border-radius: 28px;
198
  border: 1px solid var(--line);
199
+ background: linear-gradient(180deg, rgba(17, 25, 44, 0.84), rgba(8, 12, 22, 0.95));
200
+ box-shadow: var(--shadow);
201
+ backdrop-filter: blur(18px);
202
+ overflow: hidden;
203
  }
204
 
205
+ .hero-stage::after,
206
+ .panel-shell::after,
207
+ .health-card::after,
208
+ .provider-card::after,
209
+ .overview-card::after {
210
+ content: "";
211
+ position: absolute;
212
+ inset: 0;
213
+ background: linear-gradient(140deg, rgba(138, 255, 208, 0.08), transparent 32%, transparent 68%, rgba(116, 126, 255, 0.08));
214
+ pointer-events: none;
215
+ }
216
+
217
+ .hero-stage {
218
+ display: grid;
219
+ grid-template-columns: minmax(0, 1.25fr) minmax(280px, 0.9fr);
220
+ gap: 24px;
221
+ padding: 34px;
222
+ }
223
+
224
+ .catalog-stage {
225
+ grid-template-columns: minmax(0, 1.1fr) minmax(300px, 0.9fr);
226
+ }
227
+
228
+ .hero-copy h1 {
229
+ margin: 16px 0 12px;
230
+ font-size: clamp(34px, 3vw, 58px);
231
+ line-height: 1.06;
232
+ }
233
+
234
+ .hero-copy p,
235
+ .section-head p,
236
+ .brand-mark strong,
237
+ .provider-head p,
238
+ .panel-hint,
239
+ .health-card p,
240
+ .overview-card p {
241
  color: var(--muted);
242
+ line-height: 1.75;
 
243
  }
244
 
245
+ .hero-actions {
246
+ margin-top: 28px;
247
+ display: flex;
248
+ gap: 14px;
249
+ flex-wrap: wrap;
250
  }
251
 
252
+ .primary-btn,
253
+ .ghost-btn {
254
+ padding: 14px 20px;
255
+ border-radius: 18px;
256
+ font-weight: 700;
257
+ }
258
+
259
+ .primary-btn {
260
+ background: linear-gradient(135deg, var(--green), #7affd8);
261
+ color: #04110d;
262
+ box-shadow: var(--glow);
263
+ }
264
+
265
+ .ghost-btn {
266
+ background: rgba(255, 255, 255, 0.05);
267
+ color: var(--text);
268
+ border: 1px solid rgba(255, 255, 255, 0.12);
269
+ }
270
+
271
+ .hero-metrics,
272
+ .catalog-summary {
273
  display: grid;
274
+ grid-template-columns: repeat(2, minmax(0, 1fr));
275
  gap: 14px;
276
+ align-content: start;
277
  }
278
 
279
+ .overview-card {
 
280
  padding: 18px 20px;
 
 
 
281
  }
282
 
283
+ .overview-card span {
 
 
284
  color: var(--muted);
285
  font-size: 13px;
 
286
  }
287
 
288
+ .overview-card strong {
289
+ display: block;
290
+ margin: 10px 0 8px;
291
  font-family: var(--font-display);
292
+ font-size: 30px;
 
293
  }
294
 
295
+ .panel-shell {
296
+ margin-top: 22px;
297
+ padding: 28px;
298
  }
299
 
300
+ .section-head {
301
+ display: flex;
302
+ justify-content: space-between;
303
+ align-items: flex-start;
304
+ gap: 18px;
305
+ flex-wrap: wrap;
306
+ margin-bottom: 20px;
307
+ }
308
+
309
+ .section-head h2 {
310
+ margin: 12px 0 0;
311
+ font-size: clamp(26px, 2vw, 38px);
312
+ }
313
+
314
+ .section-meta {
315
+ min-width: 220px;
316
+ padding: 16px 18px;
317
+ border-radius: 22px;
318
+ background: rgba(255, 255, 255, 0.05);
319
+ border: 1px solid var(--line);
320
  }
321
 
322
+ .section-meta span {
323
+ display: block;
324
+ color: var(--muted);
325
+ font-size: 13px;
326
+ margin-bottom: 8px;
327
+ }
328
+
329
+ .section-meta strong {
330
+ font-family: var(--font-display);
331
+ font-size: 20px;
332
  }
333
 
334
+ .health-grid,
335
+ .provider-grid {
336
  display: grid;
337
  grid-template-columns: repeat(auto-fit, minmax(280px, 1fr));
338
  gap: 18px;
339
  }
340
 
341
+ .health-card,
342
+ .provider-card {
343
  padding: 22px;
344
+ transition: transform 0.26s ease, border-color 0.26s ease, box-shadow 0.26s ease;
 
 
 
345
  }
346
 
347
+ .health-card:hover,
348
+ .provider-card:hover,
349
+ .overview-card:hover {
350
+ transform: translateY(-4px);
351
+ border-color: rgba(255, 255, 255, 0.16);
352
  box-shadow: var(--glow);
353
  }
354
 
355
+ .card-head,
356
+ .provider-head {
 
 
 
357
  display: flex;
 
 
 
 
 
358
  justify-content: space-between;
359
+ gap: 12px;
360
  align-items: flex-start;
361
  }
362
 
363
+ .health-card h3,
364
+ .provider-head h3 {
365
+ margin: 0;
366
  font-size: 22px;
 
 
367
  }
368
 
369
+ .health-card p,
370
+ .provider-head p {
371
+ margin: 8px 0 0;
 
 
372
  }
373
 
374
+ .status-pill {
 
 
 
 
375
  padding: 8px 12px;
 
376
  border-radius: 999px;
377
  font-size: 12px;
378
  font-weight: 700;
 
379
  border: 1px solid transparent;
380
  }
381
 
382
+ .scoreboard {
383
+ display: grid;
384
+ grid-template-columns: repeat(3, minmax(0, 1fr));
385
+ gap: 12px;
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
386
  margin-top: 18px;
387
  }
388
 
389
+ .score-item {
 
390
  padding: 14px 16px;
391
  border-radius: 18px;
392
+ background: rgba(255, 255, 255, 0.045);
393
+ border: 1px solid rgba(255, 255, 255, 0.08);
394
  }
395
 
396
+ .score-item span {
397
  display: block;
398
  color: var(--muted);
399
  font-size: 12px;
 
400
  }
401
 
402
+ .score-item strong {
403
+ display: block;
404
+ margin-top: 10px;
405
  font-family: var(--font-display);
406
+ font-size: 18px;
407
  }
408
 
409
+ .timeline-strip {
410
+ display: grid;
411
+ grid-template-columns: repeat(auto-fit, minmax(70px, 1fr));
412
  gap: 10px;
 
413
  margin-top: 18px;
414
  }
415
 
416
+ .timeline-box {
417
+ padding: 12px 10px;
418
+ border-radius: 16px;
 
 
 
 
 
 
 
 
419
  border: 1px solid rgba(255, 255, 255, 0.08);
420
+ text-align: center;
421
+ background: rgba(255, 255, 255, 0.04);
422
  }
423
 
424
+ .timeline-box span {
425
+ display: block;
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
426
  color: var(--muted);
427
+ font-size: 12px;
428
  }
429
 
430
+ .timeline-box strong {
431
+ display: block;
432
+ margin-top: 8px;
433
+ font-family: var(--font-display);
434
+ font-size: 18px;
 
 
 
435
  }
436
 
437
+ .model-chip-wrap {
438
+ display: flex;
439
+ flex-wrap: wrap;
440
+ gap: 10px;
441
  margin-top: 18px;
 
 
 
 
 
 
 
 
 
442
  }
443
 
444
+ .model-chip {
445
+ display: inline-flex;
446
+ align-items: center;
 
 
 
 
 
 
 
 
 
 
 
 
447
  padding: 10px 14px;
448
+ border-radius: 999px;
449
  background: rgba(255, 255, 255, 0.05);
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
450
  border: 1px solid rgba(255, 255, 255, 0.08);
451
+ color: var(--muted-strong);
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
452
  font-size: 13px;
 
 
 
 
 
 
 
453
  }
454
 
455
+ .panel-hint {
456
+ margin-top: 16px;
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
457
  }
458
 
459
+ .is-green {
460
+ border-color: rgba(56, 243, 165, 0.26);
 
 
461
  }
462
 
463
+ .is-green .status-pill,
464
+ .is-green.timeline-box,
465
+ .timeline-box.is-green {
466
+ background: rgba(56, 243, 165, 0.16);
467
+ color: #8affd0;
468
+ border-color: rgba(56, 243, 165, 0.3);
469
  }
470
 
471
+ .is-yellow {
472
+ border-color: rgba(255, 210, 77, 0.22);
 
473
  }
474
 
475
+ .is-yellow .status-pill,
476
+ .is-yellow.timeline-box,
477
+ .timeline-box.is-yellow {
478
+ background: rgba(255, 210, 77, 0.14);
479
+ color: #ffe59b;
480
+ border-color: rgba(255, 210, 77, 0.28);
481
  }
482
 
483
+ .is-orange {
484
+ border-color: rgba(255, 155, 67, 0.26);
 
 
485
  }
486
 
487
+ .is-orange .status-pill,
488
+ .is-orange.timeline-box,
489
+ .timeline-box.is-orange {
490
+ background: rgba(255, 155, 67, 0.15);
491
+ color: #ffc48a;
492
+ border-color: rgba(255, 155, 67, 0.3);
493
  }
494
 
495
+ .is-red {
496
+ border-color: rgba(255, 99, 127, 0.28);
 
 
497
  }
498
 
499
+ .is-red .status-pill,
500
+ .is-red.timeline-box,
501
+ .timeline-box.is-red {
502
+ background: rgba(255, 99, 127, 0.16);
503
+ color: #ffbbca;
504
+ border-color: rgba(255, 99, 127, 0.32);
 
 
 
505
  }
506
 
507
+ .is-idle .status-pill,
508
+ .is-idle.timeline-box,
509
+ .timeline-box.is-idle {
510
+ background: rgba(94, 109, 136, 0.15);
511
+ color: #c5d0df;
512
+ border-color: rgba(94, 109, 136, 0.26);
 
 
 
 
 
 
 
 
513
  }
514
 
515
  .hidden {
 
517
  }
518
 
519
  @media (max-width: 980px) {
520
+ .topbar,
521
+ .showcase-shell {
522
+ width: min(100vw - 24px, 1360px);
523
  }
524
 
525
+ .hero-stage,
526
+ .catalog-stage {
527
  grid-template-columns: 1fr;
528
  }
529
 
530
+ .hero-metrics,
531
+ .catalog-summary,
532
+ .scoreboard {
533
+ grid-template-columns: repeat(2, minmax(0, 1fr));
534
  }
535
  }
536
 
537
  @media (max-width: 720px) {
538
+ .topbar {
539
+ flex-direction: column;
540
+ align-items: stretch;
 
 
 
 
 
 
 
 
541
  }
542
 
543
+ .page-nav {
544
+ width: 100%;
545
+ justify-content: space-between;
546
  }
547
 
548
+ .page-nav-btn {
549
+ flex: 1;
550
  }
551
 
552
+ .page-section {
553
+ padding-right: 10px;
 
 
554
  }
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
555
 
556
+ .hero-stage,
557
+ .panel-shell {
558
+ padding: 20px;
 
 
 
 
 
 
559
  }
560
 
561
+ .hero-metrics,
562
+ .catalog-summary,
563
+ .health-grid,
564
+ .provider-grid,
565
+ .scoreboard,
566
+ .timeline-strip {
567
+ grid-template-columns: 1fr;
568
  }
569
  }