AtomGradient — Bringing AI to the Edge

Research

Prism

Cross-Domain Personal Data Integration on Consumer Hardware

Integrating finance, diet, mood, and reading data entirely on consumer Apple Silicon, producing emergent cross-domain insights with zero data leakage.

1.48x cross-domain insight emergence (IIR)
125.5x federation compression, zero data leakage
49.9 tok/s real-time inference (35B model on M2 Ultra)

hybird-batch-prefill-on-ane

ANE Batch Prefill for On-Device Parallel LLM Inference

Fused matrix-vector kernels that enable concurrent ANE batch prefill and GPU decode on Apple Silicon for Qwen3.5 models.

11.3x ANE batch prefill speedup (268 tok/s)
79% power reduction for prefill component
<30 ms state transfer overhead

hybrid-ane-mlx-bench

Disaggregated LLM Inference on Apple Silicon

Benchmarking Core ML ANE prefill combined with MLX GPU decode for Qwen3.5 on Apple Silicon, comparing four inference strategies.

ANE prefill matches GPU at ~410 tokens
282x GPU power reduction during prefill
4 inference pipelines benchmarked

swift-qwen3-tts

On-Device Text-to-Speech

Native Swift implementation of Qwen3 TTS 0.6B for real-time, on-device speech synthesis.

67% model compression (2.35 GB → 808 MB)
Real-time synthesis (RTF 0.68x)
12 languages supported
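
As a rough illustration of what the RTF figure above means, the sketch below assumes the common definition of real-time factor (synthesis time divided by the duration of the audio produced), so values below 1 are faster than real time. The function names are hypothetical and the project may measure RTF differently.

```python
def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """RTF under the assumed definition: compute time / audio duration."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def synthesis_time(audio_seconds: float, rtf: float) -> float:
    """Estimated compute time needed to produce a clip of the given length."""
    return audio_seconds * rtf


if __name__ == "__main__":
    # With the reported RTF of 0.68, estimate the time to synthesize 10 s of speech.
    print(f"{synthesis_time(10.0, 0.68):.1f} s of compute for 10.0 s of audio")
```

Under this assumption, a 10-second clip would take about 6.8 seconds to synthesize.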

Gemma-Prune

On-Device Vision Language Model

Multi-stage compression pipeline for deploying Gemma 3 4B VLM on consumer hardware.

25% model compression (2.8 GB → 2.1 GB)
110 tok/s text generation
3.4x image processing speedup
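
The compression percentage above follows directly from the two model sizes. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def size_reduction(before_gb: float, after_gb: float) -> float:
    """Fraction of the original size removed by compression."""
    if before_gb <= 0:
        raise ValueError("before_gb must be positive")
    return 1.0 - after_gb / before_gb


if __name__ == "__main__":
    # Sizes taken from the list above: 2.8 GB before, 2.1 GB after.
    print(f"{size_reduction(2.8, 2.1):.0%} smaller")
```

Because the published sizes are rounded, recomputing a percentage from them can differ by a point or so from the stated figure.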

OptMLX

MLX Memory Optimization Research

Exploring memory optimization techniques for the MLX framework on Apple Silicon.

Up to 20x faster mmap loading
Zero-copy model loading
Comprehensive benchmarks
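
To illustrate the general idea behind mmap-based, zero-copy loading, the sketch below maps a raw float32 file into memory and reads it through a `memoryview` without copying the bytes into a separate buffer. It uses only the Python standard library; the file layout and function names are assumptions made for illustration and are not OptMLX's or MLX's API.

```python
import array
import mmap
import os
import tempfile


def write_weights(path: str, values: list) -> None:
    """Write values as raw float32 (a hypothetical, header-less weight file)."""
    with open(path, "wb") as f:
        array.array("f", values).tofile(f)


def sum_weights_mmap(path: str):
    """Map the file read-only and read floats through a zero-copy view."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # memoryview exposes the mapped pages directly; cast() only
            # reinterprets the bytes as float32 and does not copy them.
            with memoryview(mm) as raw:
                with raw.cast("f") as view:
                    return len(view), sum(view)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "weights.bin")
    write_weights(path, [0.5, 1.5, 2.0])
    count, total = sum_weights_mmap(path)
    print(count, total)
    os.remove(path)
```

The operating system pages data in on first access, so opening the mapping is cheap regardless of file size. The views are released before the mapping closes because an `mmap` object cannot be closed while buffers exported from it are still alive.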

About

AtomGradient is an independent research group dedicated to making AI run efficiently on edge devices. We believe powerful AI should be private, accessible, and free from cloud dependency. All our research is open-source.

Our research powers EchoStream AI — a product line bringing on-device AI capabilities to real-world applications.

Edge AI, Privacy, Open Research