Research Preview

Prism

Cross-Domain Personal Data Integration on Consumer Hardware Produces Emergent Insights

1.48x

IIR: cross-domain insight emergence

125.5x

Compression: data privacy protection

49.9

TPS: 35B model, real-time inference

ZERO

Data leakage: federation protocol

Core Thesis

Cloud AI

Large Model + Small Data
A knowledgeable stranger who doesn't know you

Prism

Medium Model + Rich Data
Your personal AI that truly knows you

Intelligence comes not just from model scale, but from data depth and authenticity.

Experiment A: Ablation Study

Cross-Domain Insight Emergence

Same model, same users; only the data input changes. For every user, full panoramic integration (Config H) outperforms every single-domain and dual-domain configuration.

Config	Data Sources	Avg Score (0–100)
A	Finance only (Dailyn)	66.3
B	Diet only (Mealens)	65.1
C	Mood only (Ururu)	63.2
D	Reading only (Narrus)	55.4
E	Finance + Diet	76.6
F	Finance + Mood	76.3
G	Diet + Mood	74.1
H	Full Panoramic (all 4)	92.6

IIR per User (Full Panoramic / Single-Domain Average)

IIR > 1.0 indicates that cross-domain integration adds value. Target: ≥ 1.5x
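The headline IIR can be reproduced from the configuration averages in the ablation table. The following is a minimal sketch, assuming IIR is the full-panoramic score divided by the mean of the four single-domain scores, as the heading suggests. The function name `iir` is illustrative and not taken from the Prism codebase.

```python
from statistics import mean
from typing import Iterable

# Average scores from the ablation table (configs A-D are single-domain, H is full panoramic).
single_domain_scores = {"A": 66.3, "B": 65.1, "C": 63.2, "D": 55.4}
full_panoramic_score = 92.6


def iir(panoramic: float, single_scores: Iterable[float]) -> float:
    """Ratio of the panoramic score to the mean single-domain score (assumed definition)."""
    return panoramic / mean(single_scores)


headline_iir = iir(full_panoramic_score, single_domain_scores.values())
print(f"single-domain mean: {mean(single_domain_scores.values()):.1f}")
print(f"IIR: {headline_iir:.2f}x")
```

Under this definition the single-domain mean is 62.5, and 92.6 / 62.5 rounds to the reported 1.48x.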

User	IIR
user_01	1.47
user_02	1.53
user_03	1.45
user_04	1.48
user_05	1.44
user_06	1.46
user_07	1.51
user_08	1.52
user_09	1.51
user_10	1.45

Average IIR: 1.48x, just below the 1.5x target; 4 of 10 users exceed it.
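The summary line can be checked directly against the per-user values. This is a small sketch using only the numbers reported above; the variable names are illustrative.

```python
from statistics import mean

# Per-user IIR values as reported above.
user_iir = {
    "user_01": 1.47, "user_02": 1.53, "user_03": 1.45, "user_04": 1.48, "user_05": 1.44,
    "user_06": 1.46, "user_07": 1.51, "user_08": 1.52, "user_09": 1.51, "user_10": 1.45,
}
TARGET = 1.5

average_iir = mean(user_iir.values())
# No user sits exactly at 1.5, so ">" and ">=" give the same count here.
above_target = sorted(user for user, value in user_iir.items() if value > TARGET)

print(f"average IIR: {average_iir:.2f}x")
print(f"{len(above_target)}/{len(user_iir)} users above {TARGET}x: {above_target}")
```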

Experiment B: Model Scale Curve

Diminishing Returns Beyond 9B

The jump from 2B to 9B (+14.9) is about 2.7x the gain from 9B to 35B (+5.6). The 9B model already reaches 79.0 of the 35B model's 84.6 points (about 93%), which supports the "medium model" thesis.

Model Size	Avg Score	Gain vs. Previous Size
0.8B	48.9	n/a
2B	64.1	+15.2
9B	79.0	+14.9
35B	84.6	+5.6
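The step-to-step gains and the diminishing-returns ratio follow from the four scores. A short sketch of that arithmetic, using only the figures reported in this experiment:

```python
# Average insight scores by model size, from Experiment B.
scores = [("0.8B", 48.9), ("2B", 64.1), ("9B", 79.0), ("35B", 84.6)]

# Gain of each model over the next smaller one.
gains = {
    larger: round(larger_score - smaller_score, 1)
    for (_, smaller_score), (larger, larger_score) in zip(scores, scores[1:])
}
ratio = gains["9B"] / gains["35B"]          # how much bigger the 2B->9B step is
share_of_35b = scores[2][1] / scores[3][1]  # 9B score as a fraction of the 35B score

print(gains)
print(f"2B->9B gain is {ratio:.2f}x the 9B->35B gain")
print(f"9B reaches {share_of_35b:.0%} of the 35B score")
```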

Experiment C: Inference Benchmark

Real-Time Inference on Consumer Hardware

All models run locally on Apple Silicon via MLX. No cloud, no API calls, no data leaves the device.

Device	Model	TPS (tokens/s)	TTFT (time to first token)
M2 Ultra 192G	0.8B Q8	137.2	0.088s
M2 Ultra 192G	2B Q8	105.1	0.139s
M2 Ultra 192G	35B-A3B Q8	49.9	0.365s
M2 Ultra 192G	9B Q8	41.3	0.471s
M1 Max 32G	9B Q8	21.9	1.138s
M2 Pro 32G	9B Q8	12.7	1.762s
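TPS and TTFT together give a rough sense of how long a complete insight takes to generate. The sketch below assumes the common approximation latency ≈ TTFT + output tokens / TPS, and a hypothetical 200-token insight; the benchmark does not report output lengths, so that number is purely illustrative.

```python
# (device, model) -> (tokens per second, time to first token in seconds), from the table above.
benchmarks = {
    ("M2 Ultra 192G", "35B-A3B Q8"): (49.9, 0.365),
    ("M2 Ultra 192G", "9B Q8"): (41.3, 0.471),
    ("M1 Max 32G", "9B Q8"): (21.9, 1.138),
    ("M2 Pro 32G", "9B Q8"): (12.7, 1.762),
}


def estimated_latency(tps: float, ttft: float, output_tokens: int) -> float:
    """Approximate wall-clock seconds: wait for the first token, then stream at `tps`."""
    return ttft + output_tokens / tps


OUTPUT_TOKENS = 200  # hypothetical insight length, not a figure from the benchmark

for (device, model), (tps, ttft) in benchmarks.items():
    seconds = estimated_latency(tps, ttft, OUTPUT_TOKENS)
    print(f"{device:14} {model:11} ~{seconds:.1f}s")
```

Under these assumptions the home server finishes a 200-token insight in a few seconds, while the same length on the M2 Pro takes several times longer, which is consistent with running panoramic inference on Tier 2.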

System Architecture

Three-Tier Device Topology

Vertical apps collect data naturally. The home server federates summaries, never raw data, across devices to produce panoramic insights.

Tier	Role	Device	Runs	Port
Tier 2	Home server	M2 Ultra 192G	Qwen3.5-35B-A3B (panoramic inference)	9210
Tier 1	Simulates iPhone	M1 Max 32G	Mealens, Ururu	9211
Tier 1	Simulates iPad	M2 Pro 32G	Narrus, Dailyn	9212
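The topology above can be written down as a small data structure, which makes basic invariants easy to check. This is a sketch only: the `Node` class and its field names are hypothetical and not part of Prism; the devices, apps, model, and ports are the ones listed above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Node:
    tier: int
    role: str
    device: str
    port: int
    apps: Tuple[str, ...] = ()    # vertical apps hosted on this node
    model: Optional[str] = None   # model served on this node, if any


TOPOLOGY = [
    Node(2, "home server", "M2 Ultra 192G", 9210, model="Qwen3.5-35B-A3B"),
    Node(1, "simulates iPhone", "M1 Max 32G", 9211, apps=("Mealens", "Ururu")),
    Node(1, "simulates iPad", "M2 Pro 32G", 9212, apps=("Narrus", "Dailyn")),
]

# Map each app to the Tier 1 device that hosts it.
app_hosts = {app: node.device for node in TOPOLOGY if node.tier == 1 for app in node.apps}

# Invariants: every app lives on exactly one device, and ports do not collide.
assert len(app_hosts) == sum(len(node.apps) for node in TOPOLOGY)
assert len({node.port for node in TOPOLOGY}) == len(TOPOLOGY)
print(app_hosts)
```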

Experiment D: Privacy & Federation

Zero Data Leakage by Design

Raw data never leaves the originating device. Only compressed summaries traverse the local network. The federation protocol achieves 125.5x compression while preserving insight quality.

Raw data: 108,850 bytes → Compression: 125.5x → Transmitted: 867 bytes

Zero raw data left any device
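The compression figure follows from the two byte counts. The sketch below checks that arithmetic and then shows, on made-up finance records, what sending an aggregate summary instead of raw rows could look like. The `summarize` function and the record fields are hypothetical; the source does not describe how Prism actually builds its summaries.

```python
import json
from typing import Dict, List

# Reported figures from Experiment D.
RAW_BYTES = 108_850
TRANSMITTED_BYTES = 867
print(f"reported compression: {RAW_BYTES / TRANSMITTED_BYTES:.1f}x")


def summarize(records: List[Dict]) -> Dict:
    """Hypothetical on-device summary: aggregates only, no individual records."""
    amounts = [record["amount"] for record in records]
    return {
        "domain": "finance",
        "days": len({record["day"] for record in records}),
        "total": round(sum(amounts), 2),
        "max": max(amounts),
    }


# Made-up raw records that stay on the originating device.
raw = [{"day": d, "amount": 20.0 + d % 7, "merchant": f"shop_{d % 5}"} for d in range(90)]

payload = json.dumps(summarize(raw)).encode("utf-8")  # the only bytes sent to the home server
raw_size = len(json.dumps(raw).encode("utf-8"))
print(f"toy example: {raw_size} -> {len(payload)} bytes")
```

The design point is that the payload carries no per-record fields (such as the merchant), so the receiving tier cannot reconstruct individual entries from it.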

Evaluation Methodology

LLM-as-Judge Scoring

Each model-generated insight is scored by Claude Opus 4.6 (Anthropic) on four dimensions, each 0–25 points, for a total of 0–100. The judge receives the user's full data context and the generated insight, and produces scores with detailed justifications.

Dimension	Score	What It Measures
Relevance	0–25	Does the insight address something meaningful in the user's life?
Specificity	0–25	Does it reference concrete data points, or is it generic advice?
Cross-Domain	0–25	Does it connect patterns across multiple life domains?
Actionability	0–25	Can the user take a concrete action based on this insight?
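The rubric above maps naturally onto a small record type that validates each dimension and sums them to the 0–100 total. This is a sketch under that reading of the rubric; the `JudgeScore` class is hypothetical, and the example scores are made up rather than taken from the study.

```python
from dataclasses import dataclass

DIMENSIONS = ("relevance", "specificity", "cross_domain", "actionability")
MAX_PER_DIMENSION = 25


@dataclass(frozen=True)
class JudgeScore:
    relevance: int
    specificity: int
    cross_domain: int
    actionability: int

    def __post_init__(self) -> None:
        # Reject any dimension outside the 0-25 range defined by the rubric.
        for name in DIMENSIONS:
            value = getattr(self, name)
            if not 0 <= value <= MAX_PER_DIMENSION:
                raise ValueError(f"{name} must be in 0..{MAX_PER_DIMENSION}, got {value}")

    @property
    def total(self) -> int:
        """Sum of the four dimensions, on a 0-100 scale."""
        return sum(getattr(self, name) for name in DIMENSIONS)


# Made-up scores for illustration, not results from the study.
example = JudgeScore(relevance=23, specificity=21, cross_domain=24, actionability=20)
print(example.total)  # 88
```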

Methodology reference: Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena (Zheng et al., NeurIPS 2023)

Synthetic User Profiles

10 Users Spanning Income & Life Stages

Each user has 90 days of synthetic data across 4 apps, with an injected life event to simulate temporal drift.

User	Age	Occupation	Monthly Income	Injected Life Event
user_01	22	Factory Worker	¥3,500	Day 40: Layoff
user_02	28	Delivery Rider	¥6,000	Day 46: Serious traffic accident → hospital → debt
user_03	31	Primary Teacher	¥5,500	Day 35: Pregnancy
user_04	26	Grad Student	¥2,000	Day 45: Thesis rejected
user_05	27	E-commerce Ops	¥12,000	Day 30: Promoted
user_06	29	Freelance Illustrator	¥8,000–25,000	Day 60: Big client
user_07	33	Hospital Resident	¥25,000	Day 21: Medical error → suspension → breakup
user_08	32	Big Tech P7	¥50,000	Day 55: Laid off → divorce → depression
user_09	45	Business Owner	¥150,000	Day 38: ¥800K default → health crisis
user_10	38	Fund Manager	¥830,000	Day 25: 4% drawdown
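To illustrate what "an injected life event to simulate temporal drift" can mean in practice, the sketch below generates a 90-day daily-spend series whose mean shifts on the event day. It is an assumption-laden toy: the generator, its parameters, and the spending values are hypothetical and not the method used to build the Prism profiles.

```python
import random
from statistics import mean
from typing import List

DAYS = 90


def daily_spend(event_day: int, before: float, after: float, noise: float, seed: int) -> List[float]:
    """Daily spending whose mean shifts from `before` to `after` on `event_day` (1-indexed)."""
    rng = random.Random(seed)  # seeded so the series is reproducible
    return [
        max(0.0, rng.gauss(before if day < event_day else after, noise))
        for day in range(1, DAYS + 1)
    ]


# Hypothetical parameters loosely modeled on user_01 (layoff on day 40); values are made up.
series = daily_spend(event_day=40, before=100.0, after=60.0, noise=10.0, seed=0)
pre, post = series[:39], series[39:]  # days 1-39 vs. days 40-90
print(f"mean before event: {mean(pre):.1f}, after: {mean(post):.1f}")
```

A model that only sees the first 39 days would describe a different person than one that sees all 90, which is the kind of drift the injected events are meant to exercise.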
