PAAC: Privacy-Aware Agentic Device-Cloud Collaboration
Abstract
PAAC is a privacy-aware agentic framework that aligns planner-executor decomposition with device-cloud boundaries, using typed placeholder tokens and deterministic registries to enhance privacy while maintaining accuracy in distributed language model agents.
Large language model (LLM) agents face a structural tension: cloud agents provide strong reasoning but expose user data, while on-device agents preserve privacy at the cost of capability. Existing device-cloud designs treat this boundary as a compute split rather than a trust boundary suited to agentic workloads, and existing sanitizers force a choice between policy flexibility and the structural fidelity tool calls require. In this work, we develop PAAC, a privacy-aware agentic framework that aligns planner-executor decomposition with the device-cloud boundary so that role specialization itself becomes the privacy mechanism. The cloud agent reasons over typed placeholder tokens that preserve each sensitive value's reasoning role while discarding its content; the on-device agent identifies sensitive spans and distills each step's execution outcome into compact key findings. Sanitization confines the on-device LLM to proposing which spans to mask, while a deterministic registry performs all substitution and reversal, keeping actions directly executable on device. On three agentic benchmarks under strict privacy settings, PAAC dominates the Pareto frontier of privacy and accuracy, improving average accuracy by 15-36% and reducing average leakage by 2-6× over state-of-the-art device-cloud baselines, with the largest margins on privacy targets outside fixed entity taxonomies. We find consistent improvements on 17 additional benchmarks spanning 10 domains, including math, science, and finance.
Community
🔑 TL;DR
PAAC reframes the device-cloud split as a trust boundary rather than a compute split, with two contributions working in tandem: a decoupled agentic architecture and an LLM-driven privacy sanitizer.
🤝 Decoupled Architecture
Cloud-reason-and-plan, device-execute-and-judge. The cloud agent reasons and plans over typed placeholder tokens (e.g., {BALANCE: ...}); the on-device agent identifies sensitive spans, executes tools with real values, and distills each step's outcome into compact key findings. Role specialization itself becomes the privacy mechanism, and per-step distillation keeps each agent's input compact across turns, avoiding the trajectory-coupled context growth that breaks single-agent pipelines.
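To make the division of labor concrete, here is a minimal illustrative sketch of the round trip (the request text, token names, and tool-call strings are invented for illustration, not taken from the paper):

```python
# Hypothetical example of the two views in a device-cloud round trip.
raw_request = "Refund $842.17 to card ending 4412 for Alice Chen"

# The on-device agent proposes typed placeholders for sensitive spans;
# the cloud planner only ever sees this sanitized form.
sanitized_request = "Refund {AMOUNT_1} to card ending {CARD_1} for {NAME_1}"

# The cloud returns a plan expressed over placeholders...
cloud_plan = [
    "lookup_order(customer={NAME_1})",
    "issue_refund(amount={AMOUNT_1}, card={CARD_1})",
]

# ...which the device rehydrates with real values before executing tools.
mapping = {"{AMOUNT_1}": "$842.17", "{CARD_1}": "4412", "{NAME_1}": "Alice Chen"}
executable = list(cloud_plan)
for token, value in mapping.items():
    executable = [step.replace(token, value) for step in executable]

print(executable[1])  # issue_refund(amount=$842.17, card=4412)
```

Because each placeholder carries a type (AMOUNT, CARD, NAME), the cloud can still reason about what the value *is for* without ever seeing what it *is*.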
⚙️ Proposer–Verifier–Registry Sanitization
The on-device LLM only proposes (span, proxy token) pairs; a deterministic append-only regex registry handles all substitution and reversal. This preserves tool-call fidelity, gives cross-round consistency, and locks in first-turn protection even if the on-device LLM is later compromised.
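The registry idea can be sketched as follows; this is an illustrative implementation under assumed details (class and method names, the `{TYPE_N}` token format, and the example strings are all hypothetical), not the paper's code:

```python
import re

class PlaceholderRegistry:
    """Sketch of an append-only registry: the on-device LLM only proposes
    (span, type) pairs; substitution and reversal are deterministic string
    operations, never model calls."""

    def __init__(self):
        self._forward = {}   # span  -> token (append-only)
        self._reverse = {}   # token -> span
        self._counts = {}    # per-type counter for fresh token ids

    def register(self, span, type_name):
        # Cross-round consistency: a repeated span reuses its token.
        if span in self._forward:
            return self._forward[span]
        self._counts[type_name] = self._counts.get(type_name, 0) + 1
        token = "{%s_%d}" % (type_name, self._counts[type_name])
        self._forward[span] = token
        self._reverse[token] = span
        return token

    def sanitize(self, text):
        # Replace longer spans first so substrings don't clobber them.
        for span in sorted(self._forward, key=len, reverse=True):
            text = text.replace(span, self._forward[span])
        return text

    def rehydrate(self, text):
        # Deterministic regex reversal; unknown tokens pass through.
        return re.sub(r"\{[A-Z]+_\d+\}",
                      lambda m: self._reverse.get(m.group(0), m.group(0)),
                      text)

reg = PlaceholderRegistry()
reg.register("Alice Chen", "NAME")
reg.register("$842.17", "AMOUNT")
masked = reg.sanitize("Refund $842.17 to Alice Chen")
print(masked)                 # Refund {AMOUNT_1} to {NAME_1}
print(reg.rehydrate(masked))  # Refund $842.17 to Alice Chen
```

The append-only mapping is what locks in first-turn protection: even if the on-device LLM later proposes nothing (or is compromised), spans already registered keep being substituted deterministically, and the model has no code path through which to alter or reverse existing mappings.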
📊 Results (Qwen3-4B + Gemini 3 Flash)
- 📈 +15-36% accuracy and 2-6× lower leakage vs SOTA device-cloud baselines on $\tau^2$-Bench Airline/Retail and GAIA
- 🎯 0% leakage on open-vocab targets (CLUTRR names) where pattern-based methods hit 38.6%
- 🪶 Stable accuracy and token cost as privacy tightens; gains hold across 17 more benchmarks in 10 domains