Research Preview

Prism

Cross-Domain Personal Data Integration on Consumer Hardware Produces Emergent Insights

1.48x IIR: cross-domain insight emergence
125.5x compression: data privacy protection
49.9 TPS: 35B model, real-time inference
Zero data leakage: federation protocol

Cloud AI

Large Model + Small Data
Smart strangers who don't know you

vs

Prism

Medium Model + Rich Data
Your personal AI that truly knows you

Intelligence comes not just from model scale, but from data depth and authenticity.

Cross-Domain Insight Emergence

Same model, same users; only the data input changes. Full panoramic integration (Config H) consistently outperforms every single-domain and dual-domain configuration tested, across all users.

Config  Data Sources            Avg Score
A       Finance only (Dailyn)   66.3
B       Diet only (Mealens)     65.1
C       Mood only (Ururu)       63.2
D       Reading only (Narrus)   55.4
E       Finance + Diet          76.6
F       Finance + Mood          76.3
G       Diet + Mood             74.1
H       Full Panoramic (all 4)  92.6

IIR per User (Full Panoramic / Single-Domain Average)

An IIR above 1.0 indicates that cross-domain integration adds value. Target: ≥ 1.5x

User     IIR
user_01  1.47
user_02  1.53
user_03  1.45
user_04  1.48
user_05  1.44
user_06  1.46
user_07  1.51
user_08  1.52
user_09  1.51
user_10  1.45

Average IIR: 1.48x, just below the 1.5x target; 4 of 10 users exceed it.
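
The headline ratio can be reproduced from the averages in the configuration table. The sketch below is only a worked check of the arithmetic, assuming IIR is the Config H score divided by the mean of the four single-domain scores (A to D), as the heading above states; the function and variable names are illustrative.

```python
from statistics import mean

# Average scores copied from the configuration table above.
single_domain = {"A": 66.3, "B": 65.1, "C": 63.2, "D": 55.4}
full_panoramic = 92.6  # Config H


def iir(full_score: float, single_scores: list[float]) -> float:
    """Full-panoramic score divided by the single-domain average."""
    return full_score / mean(single_scores)


average_iir = iir(full_panoramic, list(single_domain.values()))

# Per-user IIR values as reported above.
per_user = {
    "user_01": 1.47, "user_02": 1.53, "user_03": 1.45, "user_04": 1.48,
    "user_05": 1.44, "user_06": 1.46, "user_07": 1.51, "user_08": 1.52,
    "user_09": 1.51, "user_10": 1.45,
}
users_over_target = [user for user, value in per_user.items() if value >= 1.5]

print(f"IIR from table averages: {average_iir:.2f}")
print(f"Mean of per-user IIR:    {mean(per_user.values()):.2f}")
print(f"Users at or above 1.5x:  {users_over_target}")
```

Both routes give the same rounded figure, which suggests the per-user values and the table averages are consistent with each other.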

Diminishing Returns Beyond 9B

The jump from 2B to 9B (+14.9) is about 2.7x the gain from 9B to 35B (+5.6). A 9B model reaches 79.0 of the 35B model's 84.6 points (about 93%), supporting the "medium model" thesis.

Model  Score  Gain vs. previous size
0.8B   48.9
2B     64.1   +15.2
9B     79.0   +14.9
35B    84.6   +5.6
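
As a quick check of the scaling figures, the following sketch recomputes the step-to-step gains and the ratio between the 2B to 9B and 9B to 35B steps. It uses only the four scores reported above; the variable names are illustrative.

```python
# Scores by model size, copied from the figures above.
scores = [("0.8B", 48.9), ("2B", 64.1), ("9B", 79.0), ("35B", 84.6)]

# Gain of each size over the next smaller one, rounded to one decimal.
gains = {
    f"{small}->{large}": round(large_score - small_score, 1)
    for (small, small_score), (large, large_score) in zip(scores, scores[1:])
}

# How much larger the 2B->9B step is than the 9B->35B step.
step_ratio = gains["2B->9B"] / gains["9B->35B"]

# Share of the 35B score already reached at 9B.
share_at_9b = 79.0 / 84.6

print(gains)
print(f"step ratio: {step_ratio:.2f}x, 9B share of 35B score: {share_at_9b:.1%}")
```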

Real-Time Inference on Consumer Hardware

All models run locally on Apple Silicon via MLX. Inference involves no cloud service and no external API calls, and no raw data leaves the device.

Device         Model        TPS    TTFT
M2 Ultra 192G  0.8B Q8      137.2  0.088s
M2 Ultra 192G  2B Q8        105.1  0.139s
M2 Ultra 192G  35B-A3B Q8   49.9   0.365s
M2 Ultra 192G  9B Q8        41.3   0.471s
M1 Max 32G     9B Q8        21.9   1.138s
M2 Pro 32G     9B Q8        12.7   1.762s
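
To put the TPS and TTFT columns in user-facing terms, the sketch below estimates end-to-end response time as TTFT plus output length divided by TPS. This is a simplification that assumes a constant decode rate, and the 200-token output length is an arbitrary assumption for illustration, not a figure from the benchmarks.

```python
# (device, model, tokens per second, time to first token in seconds),
# copied from the benchmark table above.
benchmarks = [
    ("M2 Ultra 192G", "0.8B Q8", 137.2, 0.088),
    ("M2 Ultra 192G", "2B Q8", 105.1, 0.139),
    ("M2 Ultra 192G", "35B-A3B Q8", 49.9, 0.365),
    ("M2 Ultra 192G", "9B Q8", 41.3, 0.471),
    ("M1 Max 32G", "9B Q8", 21.9, 1.138),
    ("M2 Pro 32G", "9B Q8", 12.7, 1.762),
]


def response_time(tps: float, ttft: float, n_tokens: int) -> float:
    """Rough latency estimate: wait for the first token, then decode at a steady rate."""
    return ttft + n_tokens / tps


ASSUMED_OUTPUT_TOKENS = 200  # hypothetical insight length

for device, model, tps, ttft in benchmarks:
    seconds = response_time(tps, ttft, ASSUMED_OUTPUT_TOKENS)
    print(f"{device:14s} {model:11s} ~{seconds:5.1f}s for {ASSUMED_OUTPUT_TOKENS} tokens")
```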

Three-Tier Device Topology

Each vertical app collects data in its own domain as a by-product of normal use. The home server federates summaries, never raw data, across devices to produce panoramic insights.

Tier    Role              Device         Runs                                    Port
Tier 2  Home Server       M2 Ultra 192G  Qwen3.5-35B-A3B (panoramic inference)   9210
Tier 1  Simulates iPhone  M1 Max 32G     Mealens, Ururu                          9211
Tier 1  Simulates iPad    M2 Pro 32G     Narrus, Dailyn                          9212
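
One way to picture the topology is as a small static configuration. The sketch below encodes the devices, apps, and ports listed above; the `Node` type and the `app_to_port` helper are hypothetical names introduced for illustration, not part of Prism.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """One machine in the test topology (hypothetical structure)."""
    tier: int
    role: str
    device: str
    port: int
    apps: tuple[str, ...] = ()


# Values taken from the topology listing above.
TOPOLOGY = [
    Node(tier=2, role="home server", device="M2 Ultra 192G", port=9210),
    Node(tier=1, role="simulated iPhone", device="M1 Max 32G", port=9211,
         apps=("Mealens", "Ururu")),
    Node(tier=1, role="simulated iPad", device="M2 Pro 32G", port=9212,
         apps=("Narrus", "Dailyn")),
]


def app_to_port(topology: list[Node]) -> dict[str, int]:
    """Map each vertical app to the port of the device that hosts it."""
    return {app: node.port for node in topology for app in node.apps}


print(app_to_port(TOPOLOGY))
```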

Zero Data Leakage by Design

Raw data never leaves the originating device. Only compressed summaries traverse the local network. The federation protocol achieves 125.5x compression while preserving insight quality.

Raw data: 108,850 bytes
Transmitted: 867 bytes
Compression: 125.5x
Zero raw data left any device
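
The compression figure follows directly from the two byte counts. The sketch below checks that arithmetic and then illustrates the general idea of sending an aggregate summary instead of raw records. The `summarize` function and the toy records are assumptions made for this example; Prism's actual summarization step is not described here and presumably differs.

```python
import json

# Byte counts reported above.
RAW_BYTES = 108_850
TRANSMITTED_BYTES = 867
compression_ratio = RAW_BYTES / TRANSMITTED_BYTES
print(f"compression: {compression_ratio:.1f}x")


def summarize(records: list[dict]) -> dict:
    """Hypothetical on-device summarizer: keeps aggregates, drops individual records."""
    amounts = [record["amount"] for record in records]
    return {"n": len(amounts), "total": round(sum(amounts), 2), "max": max(amounts)}


# Toy raw data standing in for 90 days of one app's records (illustrative only).
raw_records = [
    {"day": day, "amount": 20.0 + day, "note": "lunch near the office"}
    for day in range(90)
]

raw_size = len(json.dumps(raw_records).encode("utf-8"))
payload = json.dumps(summarize(raw_records)).encode("utf-8")

# Only `payload` would cross the local network; the free-text notes stay on the device.
print(f"toy example: {raw_size} raw bytes -> {len(payload)} summary bytes")
```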

LLM-as-Judge Scoring

Each model-generated insight is scored by Claude Opus 4.6 (Anthropic) on four dimensions, each worth 0–25 points, for a total of 0–100. The judge receives the user's full data context and the generated insight, and produces scores with detailed justifications.

Dimension      Score  What It Measures
Relevance      0–25   Does the insight address something meaningful in the user's life?
Specificity    0–25   Does it reference concrete data points, or is it generic advice?
Cross-Domain   0–25   Does it connect patterns across multiple life domains?
Actionability  0–25   Can the user take a concrete action based on this insight?

Methodology reference: Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena (Zheng et al., NeurIPS 2023)
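
The rubric above maps naturally onto a small validated record. This is a minimal sketch of how a judge's output might be held and totalled, assuming one numeric score per dimension; the `JudgeScore` class and the example values are hypothetical and are not taken from the evaluation code or results.

```python
from dataclasses import dataclass

DIMENSIONS = ("relevance", "specificity", "cross_domain", "actionability")
MAX_PER_DIMENSION = 25


@dataclass(frozen=True)
class JudgeScore:
    """One judged insight: four dimensions, each scored 0-25 (hypothetical structure)."""
    relevance: float
    specificity: float
    cross_domain: float
    actionability: float

    def __post_init__(self) -> None:
        # Reject scores outside the rubric's 0-25 range.
        for name in DIMENSIONS:
            value = getattr(self, name)
            if not 0 <= value <= MAX_PER_DIMENSION:
                raise ValueError(f"{name} must be in 0..{MAX_PER_DIMENSION}, got {value}")

    @property
    def total(self) -> float:
        """Sum of the four dimensions, on a 0-100 scale."""
        return sum(getattr(self, name) for name in DIMENSIONS)


# Made-up scores for illustration only.
example = JudgeScore(relevance=22, specificity=24, cross_domain=23, actionability=21)
print(example.total)
```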

Early Warning for Life Crises

Cross-domain data integration enables detection of cascading life crises that single-domain analysis cannot see, from financial stress spiraling into health emergencies to burnout escalating into medical errors.

Crisis Detection: synchronized collapse across domains
user_02: traffic accident → income drops to zero + calories crash + mood collapses + reading shifts to legal aid. Only visible with all 4 data streams.

Early Warning: foreshadowing signals before a crisis
user_07: 3 nights of sleep deprivation + stress escalation across mood, diet, and reading data, detectable days before the medication error.

Recovery Tracking: multi-domain recovery monitoring
user_08: reading shifts from divorce law → depression self-help → freelancing guides. Calories stabilize. Mood slowly rises. This gives therapists objective recovery data.
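
The "synchronized collapse" idea can be illustrated with a simple baseline comparison. The sketch below is not Prism's method (the insights in this study come from a language model); it is a toy stand-in that flags a day when several numeric signals deviate sharply from their own early baseline at once. The data, thresholds, and function names are all assumptions, and a topical signal such as reading shifting to legal aid would need a different representation.

```python
from statistics import mean, stdev


def deviating_domains(series, day, baseline_days=30, z_threshold=3.0):
    """Domains whose value on `day` lies more than z_threshold standard
    deviations from the mean of their first `baseline_days` values."""
    flagged = []
    for domain, values in series.items():
        baseline = values[:baseline_days]
        mu, sigma = mean(baseline), stdev(baseline)
        if sigma > 0 and abs(values[day] - mu) / sigma > z_threshold:
            flagged.append(domain)
    return flagged


def synchronized_collapse(series, day, min_domains=3, **kwargs):
    """True when at least `min_domains` domains deviate on the same day."""
    return len(deviating_domains(series, day, **kwargs)) >= min_domains


def toy_series(normal_a, normal_b, crisis, crisis_day=46, days=60):
    """Alternate between two normal values, then hold a crisis value (synthetic)."""
    return [
        crisis if d >= crisis_day else (normal_a if d % 2 == 0 else normal_b)
        for d in range(days)
    ]


# Synthetic signals loosely modelled on the user_02 description above.
series = {
    "income": toy_series(200, 220, 0),
    "calories": toy_series(2100, 2300, 1200),
    "mood": toy_series(6, 7, 2),
    "reading_minutes": toy_series(30, 32, 31),
}

print(deviating_domains(series, day=46))
print(synchronized_collapse(series, day=46), synchronized_collapse(series, day=40))
```

Looking at any one series alone would show a single anomaly; requiring several domains to move together is what distinguishes a cascading event from ordinary noise in this toy setup.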

Privacy is essential: users are more likely to maintain honest records when they trust that no external party will see their data.

10 Users Spanning Income & Life Stages

Each user has 90 days of synthetic data across 4 apps, with an injected life event to simulate temporal drift.

User     Age  Occupation             Income              Injected Event
user_01  22   Factory Worker         ¥3,500/mo           Day 40: Layoff
user_02  28   Delivery Rider         ¥6,000/mo           Day 46: Traffic accident → hospital → debt
user_03  31   Primary Teacher        ¥5,500/mo           Day 35: Pregnancy
user_04  26   Grad Student           ¥2,000/mo           Day 45: Thesis rejected
user_05  27   E-commerce Ops         ¥12,000/mo          Day 30: Promoted
user_06  29   Freelance Illustrator  ¥8,000–25,000/mo    Day 60: Big client
user_07  33   Hospital Resident      ¥25,000/mo          Day 21: Med error → suspension → breakup
user_08  32   Tech P7                ¥50,000/mo          Day 55: Laid off → divorce → depression
user_09  45   Business Owner         ¥150,000/mo         Day 38: ¥800K default → health crisis
user_10  38   Fund Manager           ¥830,000/mo         Day 25: 4% drawdown