---
title: README
emoji:
colorFrom: indigo
colorTo: purple
sdk: static
pinned: false
license: mit
---
# AtomGradient — Bringing AI to the Edge
**We are an independent research group dedicated to making AI run efficiently on edge devices.**
We believe powerful AI should be private, accessible, and free from cloud dependency. All our research is open-source.
🌐 [atomgradient.com](https://atomgradient.com) · 🐙 [GitHub](https://github.com/AtomGradient) · 🚀 [EchoStream AI](https://www.echostream-ai.com/)
---
## Research
### [Prism — Cross-Domain Personal Data Integration on Consumer Hardware](https://atomgradient.github.io/Prism/)
Integrating finance, diet, mood, and reading data entirely on consumer Apple Silicon, producing emergent cross-domain insights with zero data leakage.
- 📈 **1.48x** cross-domain insight emergence (IIR)
- 🔒 **125.5x** federation compression, zero data leakage
- ⚡ **49.9 TPS** real-time inference (35B on M2 Ultra)
[[GitHub]](https://github.com/AtomGradient/Prism) · [[Paper]](https://atomgradient.github.io/Prism/)
---
### [ANE Batch Prefill — On-Device Parallel LLM Inference](https://atomgradient.github.io/hybird-batch-prefill-on-ane/)
Fused matrix-vector kernels enabling concurrent ANE batch prefill + GPU decode on Apple Silicon for Qwen3.5 models.
- 🚀 **11.3x** ANE batch prefill speedup (268 tok/s)
- 🔋 **79%** power reduction for the prefill stage
- ⏱️ **<30 ms** state transfer overhead
[[GitHub]](https://github.com/AtomGradient/hybird-batch-prefill-on-ane) · [[Paper]](https://atomgradient.github.io/hybird-batch-prefill-on-ane/)
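As a back-of-the-envelope illustration of how the disaggregated pipeline composes, end-to-end latency is roughly prefill time plus state transfer plus decode time. The sketch below is not the project's code: the 268 tok/s prefill rate and the 30 ms transfer bound come from the results above, while `decode_tps` is a placeholder assumption.

```python
def disaggregated_latency(prompt_tokens: int, output_tokens: int,
                          prefill_tps: float = 268.0,
                          transfer_s: float = 0.030,
                          decode_tps: float = 50.0) -> float:
    """Rough latency model for ANE prefill + GPU decode.

    prefill_tps (268 tok/s) and transfer_s (30 ms upper bound) are taken
    from the benchmark numbers above; decode_tps is hypothetical.
    """
    prefill = prompt_tokens / prefill_tps   # ANE batch prefill
    decode = output_tokens / decode_tps     # GPU autoregressive decode
    return prefill + transfer_s + decode    # state transfer sits in between
```

Because the transfer overhead is a fixed ~30 ms, it is amortized quickly as prompts grow, which is why the hybrid split pays off for long-context prefill.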
---
### [hybrid-ane-mlx-bench — Disaggregated LLM Inference on Apple Silicon](https://atomgradient.github.io/hybrid-ane-mlx-bench/)
Benchmarking CoreML ANE prefill + MLX GPU decode for Qwen3.5 on Apple Silicon, with four inference strategies compared.
- 🔄 ANE prefill matches GPU at **~410 tokens**
- 🔋 **282x** GPU power reduction during prefill
- 📊 4 inference pipelines benchmarked
[[GitHub]](https://github.com/AtomGradient/hybrid-ane-mlx-bench) · [[Paper]](https://atomgradient.github.io/hybrid-ane-mlx-bench/)
---
### [swift-qwen3-tts — On-Device Text-to-Speech](https://atomgradient.github.io/swift-qwen3-tts/)
Native Swift implementation of Qwen3 TTS 0.6B for real-time, on-device speech synthesis.
- 📦 **67%** model compression (2.35 GB → 808 MB)
- 🎙️ Real-time synthesis (**RTF 0.68x**)
- 🌍 12 languages supported
[[GitHub]](https://github.com/AtomGradient/swift-qwen3-tts) · [[Paper]](https://atomgradient.github.io/swift-qwen3-tts/)
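RTF (real-time factor) above is synthesis time divided by the duration of the audio produced, so values below 1.0 mean faster-than-real-time synthesis. A minimal sketch of the metric (the function name is ours, not the project's):

```python
def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent synthesizing / duration of audio produced.

    RTF < 1.0 means faster than real time; the 0.68x figure above
    means roughly 10 s of speech is produced in about 6.8 s.
    """
    return synthesis_seconds / audio_seconds
```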
---
### [Gemma-Prune — On-Device Vision Language Model](https://atomgradient.github.io/swift-gemma-cli/)
Multi-stage compression pipeline for deploying Gemma 3 4B VLM on consumer hardware.
- 📦 **25%** model compression (2.8 GB → 2.1 GB)
- 📝 **110 tok/s** text generation
- 🖼️ **3.4x** image processing speedup
[[GitHub]](https://github.com/AtomGradient/swift-gemma-cli) · [[Paper]](https://atomgradient.github.io/swift-gemma-cli/)
---
### [OptMLX — MLX Memory Optimization Research](https://atomgradient.github.io/OptMLX/)
Exploring memory optimization techniques for the MLX framework on Apple Silicon.
- ⚡ Up to **20x** faster mmap loading
- 🔄 Zero-copy model loading
- 📊 Comprehensive benchmarks
[[GitHub]](https://github.com/AtomGradient/OptMLX) · [[Paper]](https://atomgradient.github.io/OptMLX/)
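The zero-copy mmap idea above can be illustrated in plain Python/NumPy (a sketch of the general technique, not OptMLX's actual implementation): mapping the weight file lets the OS fault pages in lazily, and the tensor is a view over the mapped bytes rather than a copy into process memory.

```python
import mmap

import numpy as np


def load_weights_zero_copy(path: str, dtype=np.float16) -> np.ndarray:
    """Memory-map a raw weight file and view it as a tensor without copying.

    Pages are loaded lazily on first access, so this returns almost
    immediately regardless of file size.
    """
    with open(path, "rb") as f:
        buf = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # frombuffer creates a read-only view over the mapped bytes (zero-copy).
    return np.frombuffer(buf, dtype=dtype)
```

This is the mechanism behind "loading" multi-gigabyte weights in milliseconds: no bytes move until a layer's weights are actually touched.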
---
## About
AtomGradient is an independent research group dedicated to making AI run efficiently on edge devices. Our research powers [EchoStream AI](https://www.echostream-ai.com/) — a product line bringing on-device AI capabilities to real-world applications.
`Edge AI` · `Privacy-First` · `Open Research`