AI Research · April 18, 2026 · 9 min read

The Guidance Gap: Building AI That Mentors Instead of Just Answers

The difference between an AI that answers questions and one that actually improves your life is the difference between a search engine and a mentor. Most AI products are stuck at search engine.

In 2024, Gemini — Google's flagship AI assistant — told a user who was frustrated about their situation to "please die." The response was generated by a model optimized to be helpful and harmless. It failed both.

This incident is an extreme example of a broader failure mode: AI systems that are designed to respond to what users say, rather than what users need. An AI that only answers the question asked is not a mentor. It's a more fluent search engine.

The guidance gap — the distance between what current AI products do and what a genuine AI mentor could do — is the problem we're trying to close.

What reactive AI misses

Almost all consumer AI today is reactive. A user has a thought, forms it into a query, submits it, receives a response. The model's job is to produce a good response to that specific query. Nothing more.

This is enormously useful. But it misses the most important thing a mentor does: noticing what you haven't thought to ask about yet.

A good financial advisor doesn't wait for you to ask about tax-loss harvesting. They look at your portfolio in October, notice unrealized losses, and bring it up proactively. A good doctor doesn't wait for you to mention the interaction between two medications you're taking — they notice it themselves and flag it. A good estate attorney doesn't wait for you to ask about your healthcare directive — they ask you about it when they learn you have minor children.

The pattern is the same: the expert sees something you don't see, because they're holding your full context in their head and actively looking for gaps, risks, and opportunities — not just answering the question in front of them.

Current AI products don't do this. They respond; they don't anticipate. The reason is structural: they're built around a query-response loop, with no persistent model of the user's situation and no mechanism for generating unprompted observations.

What makes proactive AI hard

Building AI that mentors rather than just answers requires solving three hard problems that reactive AI sidesteps entirely.

Problem 1: Persistent, structured context. A mentor maintains a mental model of you that persists across every conversation. They remember what you told them six months ago and update their model as your situation changes. Building this in AI requires not just a conversation history, but a structured model of the user's situation that is continuously maintained — health records, financial state, family structure, goals, legal status. oue.ai's product suite is, in large part, the infrastructure for building this model.

Problem 2: Knowing what to proactively surface. Given a complete model of the user's situation, which things are worth surfacing unprompted, and when? This is a hard judgment call. Surfacing too much becomes noise — an assistant that constantly interrupts with observations the user doesn't find valuable will be dismissed and ignored. Surfacing too little provides no value over a reactive assistant. The right answer depends on the urgency of the observation, the user's current context, and their history of responding to similar nudges.
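To make the tradeoff concrete, here is a minimal sketch of a surfacing decision as a weighted score over the three factors named above. Everything here (the `Nudge` structure, the weights, the threshold) is an illustrative assumption, not Guide's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class Nudge:
    urgency: float          # 0.0-1.0: how time-sensitive the observation is
    context_fit: float      # 0.0-1.0: relevance to what the user is doing now
    engagement_rate: float  # historical rate of acting on similar nudges

# Hypothetical cutoff; a real system would tune this per user.
SURFACE_THRESHOLD = 0.5

def should_surface(nudge: Nudge) -> bool:
    """Weight urgency most heavily, and damp by past engagement so that
    repeatedly ignored nudge types fade out instead of becoming noise."""
    score = (0.5 * nudge.urgency
             + 0.3 * nudge.context_fit
             + 0.2 * nudge.engagement_rate)
    return score >= SURFACE_THRESHOLD
```

The damping term is the important design choice: it encodes "surfacing too much becomes noise" directly in the score, so an ignored category of nudge quiets itself down.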

Problem 3: Calibrating tone and framing. Proactive observations can feel intrusive or presumptuous if not framed carefully. A financial advisor who says "I noticed you haven't updated your beneficiaries since your divorce" is providing valuable guidance. A stranger who says the same thing is alarming. The difference is trust, established through prior relationship. AI systems need to earn the right to be proactive — which means they need to get the calibration right in early interactions before expanding to more sensitive observations.

How Guide works

Guide is our implementation of proactive AI mentorship. It runs as a background process that has access to all of the user's oue.ai data and is responsible for generating nudges — short, actionable observations that are surfaced to the user at the right moment.

The nudge generation process works in layers:

Layer 1: Rule-based triggers. These are the easy cases — situations where a specific event or state should always generate a nudge. Insurance policy renewal within 30 days. A will that hasn't been updated in more than 5 years. A medication interaction in the user's health records. A retirement account that hasn't received a contribution in 90 days. These rules are deterministic and produce high-precision nudges. They're the equivalent of the calendar reminder a financial advisor sets for themselves.

Layer 2: Cross-domain inference. This is where LLMs become genuinely useful. These are situations where a fact in one domain has implications in another domain that a rule-based system wouldn't capture. A new baby in Bloom (family tracker) implies: check insurance coverage in Anchor, check will in Accord, check emergency fund target in Atlas, check beneficiary designations in Compass. A job change in Scout implies: check health insurance gap, check 401k rollover timeline, check non-compete language in uploaded documents. The model is looking for cross-domain implications that a non-expert user wouldn't recognize.

Layer 3: Behavioral pattern analysis. The subtlest layer. A user who is consistently not completing their daily Tempo time blocks is exhibiting a pattern that might indicate stress, overwhelm, or goal misalignment. A user whose Atlas net worth has been flat for 6 months, despite the income reflected in their Scout applications, is possibly overspending. These observations require integrating behavioral signals over time, not just looking at current state.

What we learned from Gemini's failure

The Gemini incident illustrates what happens when an AI system is trained to generate helpful-sounding responses without a model of what helpful actually means in context. The user's statement was interpreted as a query requiring a response. The model generated a response. The response was catastrophically wrong.

For Guide, we've made a deliberate architectural choice: nudges are not generated by asking "what would be helpful to say?" They're generated by asking "what is objectively true about this user's situation that they may not be aware of, and is it worth surfacing right now?"

The distinction matters. The first question is open-ended and can produce arbitrary outputs. The second question is grounded — it requires the model to make a factual claim about the user's situation, which can be verified against the data we hold.

We also have explicit controls on the emotional register of nudges. Guide does not comment on users' emotional states. It does not offer unsolicited life advice. It does not make moral judgments. It surfaces factual observations about practical situations: your policy is expiring, your will is outdated, you haven't logged a health record for your parent in 90 days. The framing is always informational, never evaluative.

The accountability structure

Proactive AI creates accountability obligations that reactive AI avoids. If a user asks an AI "what should I do about my insurance?" and receives bad advice, the user bears some responsibility for acting on AI advice without verification. If an AI proactively tells a user "your insurance coverage is sufficient for your current situation" — and it's wrong — the moral calculus is different. The user didn't ask. The AI offered a judgment. The user relied on it.

We take this seriously. Guide's nudges are calibrated to avoid making authoritative claims in domains where we can't verify them. We flag observations, not conclusions. "Your will was last updated in 2021 — you may want to review it given your changed family situation" is an observation. "Your will is fine" is a conclusion we can't support. Guide makes the former, never the latter.

We also maintain a complete log of every nudge sent, what data it was based on, and what model generated it. If a user ever wants to understand why they received a nudge, they can trace it back to its source. This is accountability infrastructure that most AI products don't have, because reactive AI doesn't need it. Proactive AI does.

The guidance gap is also a market gap

High-quality, proactive financial, legal, and health guidance is currently only accessible to people who can afford professional advisors. The median financial advisor client has north of $250,000 in investable assets. The median estate attorney charges $1,500+ for a will review. A good primary care doctor who actually maintains a longitudinal picture of your health and proactively flags concerns takes years to establish a relationship with — and is increasingly rare as medicine becomes more transactional.

Guide is not a replacement for professional advisors. But it's the layer that ensures a broader population has access to the kind of proactive attention to their situation that professional advisors provide. Noticing that a user's beneficiary designations don't match their will, or that a medication they're taking interacts with a new prescription, or that their estimated retirement date has slipped by three years given their current savings rate — these observations are high value, determinable from available data, and currently only accessible if you have an expert in your corner who's paying attention.

Most people don't. Guide is how we fix that.


oue.ai Research
