---
title: README
emoji: ⚡
colorFrom: indigo
colorTo: purple
sdk: static
pinned: false
license: mit
---

# AtomGradient — Bringing AI to the Edge

**We are an independent research group dedicated to making AI run efficiently on edge devices.** We believe powerful AI should be private, accessible, and free from cloud dependency. All our research is open-source.

🌐 [atomgradient.com](https://atomgradient.com) · 🐙 [GitHub](https://github.com/AtomGradient) · 🚀 [EchoStream AI](https://www.echostream-ai.com/)

---

## Research

### [Prism — Cross-Domain Personal Data Integration on Consumer Hardware](https://atomgradient.github.io/Prism/)

Integrating finance, diet, mood, and reading data entirely on consumer Apple Silicon, producing emergent cross-domain insights with zero data leakage.

- 📈 **1.48x** cross-domain insight emergence (IIR)
- 🔒 **125.5x** federation compression, zero data leakage
- ⚡ **49.9 TPS** real-time inference (35B on M2 Ultra)

[[GitHub]](https://github.com/AtomGradient/Prism) · [[Paper]](https://atomgradient.github.io/Prism/)

---

### [ANE Batch Prefill — On-Device Parallel LLM Inference](https://atomgradient.github.io/hybird-batch-prefill-on-ane/)

Fused matrix-vector kernels enabling concurrent ANE batch prefill + GPU decode on Apple Silicon for Qwen3.5 models.

- 🚀 **11.3x** ANE batch prefill speedup (268 tok/s)
- 🔋 **79%** power reduction for the prefill component
- ⏱️ **<30 ms** state transfer overhead

[[GitHub]](https://github.com/AtomGradient/hybird-batch-prefill-on-ane) · [[Paper]](https://atomgradient.github.io/hybird-batch-prefill-on-ane/)

---

### [hybrid-ane-mlx-bench — Disaggregated LLM Inference on Apple Silicon](https://atomgradient.github.io/hybrid-ane-mlx-bench/)

Benchmarking CoreML ANE prefill + MLX GPU decode for Qwen3.5 on Apple Silicon, with four inference strategies compared.

- 🔄 ANE prefill matches GPU at **~410 tokens**
- 🔋 **282x** GPU power reduction during prefill
- 📊 4 inference pipelines benchmarked

[[GitHub]](https://github.com/AtomGradient/hybrid-ane-mlx-bench) · [[Paper]](https://atomgradient.github.io/hybrid-ane-mlx-bench/)

---

### [swift-qwen3-tts — On-Device Text-to-Speech](https://atomgradient.github.io/swift-qwen3-tts/)

Native Swift implementation of Qwen3 TTS 0.6B for real-time, on-device speech synthesis.

- 📦 **67%** model compression (2.35 GB → 808 MB)
- 🎙️ Real-time synthesis (**RTF 0.68x**)
- 🌍 12 languages supported

[[GitHub]](https://github.com/AtomGradient/swift-qwen3-tts) · [[Paper]](https://atomgradient.github.io/swift-qwen3-tts/)

---

### [Gemma-Prune — On-Device Vision Language Model](https://atomgradient.github.io/swift-gemma-cli/)

Multi-stage compression pipeline for deploying the Gemma 3 4B VLM on consumer hardware.

- 📦 **25%** model compression (2.8 GB → 2.1 GB)
- 📝 **110 tok/s** text generation
- 🖼️ **3.4x** image processing speedup

[[GitHub]](https://github.com/AtomGradient/swift-gemma-cli) · [[Paper]](https://atomgradient.github.io/swift-gemma-cli/)

---

### [OptMLX — MLX Memory Optimization Research](https://atomgradient.github.io/OptMLX/)

Exploring memory optimization techniques for the MLX framework on Apple Silicon.

- ⚡ Up to **20x** faster mmap loading
- 🔄 Zero-copy model loading
- 📊 Comprehensive benchmarks

[[GitHub]](https://github.com/AtomGradient/OptMLX) · [[Paper]](https://atomgradient.github.io/OptMLX/)

---

## About

AtomGradient is an independent research group dedicated to making AI run efficiently on edge devices. Our research powers [EchoStream AI](https://www.echostream-ai.com/) — a product line bringing on-device AI capabilities to real-world applications.

`Edge AI` · `Privacy-First` · `Open Research`