We Built a Safety System, Not a Surveillance Tool
Why we made a decision that made our lives harder
How we engineered EmpathyC so that nobody - not even us - can read your users' messages
Somewhere right now, a teenager is texting an AI chatbot at 2am.
Maybe they had a fight with their mum. Maybe they're scared. Maybe they're saying things to a machine they wouldn't say to anyone else - because the machine doesn't judge, doesn't panic, doesn't look at them like they're broken.
We built EmpathyC to make sure that AI responds safely. To catch the moment an AI companion fails someone vulnerable. To alert the people who can actually help.
But here's the thing that kept me up at night while building it:
A monitoring system that reads private conversations is a surveillance system. Full stop. No matter how good the intentions are.
And surveillance is the opposite of safety.
The question we had to answer
EmpathyC analyses AI conversations for psychological safety - crisis detection, harmful advice, boundary violations. To do that, we need to process user messages. We need to understand what's happening in a conversation to know if someone's at risk.
But processing and accessing are very different things.
The question was never "can we read user messages?" The question was: should anyone be able to?
Our answer: no.
Not our clients. Not our team. Not me - the founder who built the system.
Not even if someone asks nicely. Not even if someone asks with a lawyer.
How we made "I can't" stronger than "I won't"
Privacy policies are promises. "We won't look at your data." Cool. But promises depend on people keeping them. People change. Companies get acquired. Governments apply pressure. A founder who says "I won't" today might face a court order tomorrow that says "you will."
We didn't want to be in that position. So we engineered our way out of it.
Here's the principle:
The encryption key that protects user messages doesn't exist as a single object. Anywhere.
It's derived at runtime from four independent secret parts, each stored in a separate system. No single person, no single server, no single breach can reconstruct it. The key is assembled in memory for the milliseconds needed to process a message for safety scoring - then it's gone.
Each conversation gets its own unique encryption key, derived from the master key and the conversation ID. Those per-conversation keys are never stored either. They're computed when needed, used, and discarded.
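For the curious, here's a minimal sketch of that derivation pattern in Python, using the cryptography library's HKDF and AES-GCM primitives. It's illustrative, not our production code - how the four parts are actually combined, the context labels, and the function names are all assumptions:

```python
import os

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from cryptography.hazmat.primitives.kdf.hkdf import HKDF


def derive_master_key(parts: list[bytes]) -> bytes:
    """Assemble the master key in memory from the four secret parts."""
    # Illustrative combination step; a real scheme might use XOR shares or
    # Shamir secret sharing so no subset of parts reveals anything at all.
    material = b"".join(parts)
    return HKDF(
        algorithm=hashes.SHA256(), length=32,
        salt=None, info=b"master-key-v1",  # hypothetical context label
    ).derive(material)


def derive_conversation_key(master_key: bytes, conversation_id: str) -> bytes:
    """Per-conversation key: computed on demand, never persisted."""
    return HKDF(
        algorithm=hashes.SHA256(), length=32,
        salt=None, info=conversation_id.encode(),
    ).derive(master_key)


def encrypt_user_message(conv_key: bytes, plaintext: str) -> bytes:
    """AES-256-GCM: the stored blob is nonce || ciphertext || tag."""
    nonce = os.urandom(12)
    return nonce + AESGCM(conv_key).encrypt(nonce, plaintext.encode(), None)
```

Because the conversation ID is mixed into the derivation, a compromised per-conversation key exposes exactly one conversation - and since none of these keys are ever written anywhere, there's nothing to compromise at rest.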
The result: user messages sit in our database as encrypted blobs that nobody at Keido Labs can read. Not because we promised not to. Because the architecture doesn't allow it.
What this actually looks like
Here's a real snapshot from our database - four rows, same conversation_id, two speakers. Two messages from the AI, two from the user:
| sender | message | encrypted |
|---|---|---|
| user | jcHGqB4OVJ3nPLavhpCcePhW1_7he_JSmTbnZbY_jc3uTxkcLgCMRhCthjpse_3B481sMvgiPvBxX238Num5kBfMSgHnNAwrbHTuc4DGxmG_fj3qZcLGGHi9UK2KPofxWmYN9uDDucn_TrUVpbGtpRoCuf1qL9EJpc4eROeaquc6Sd7Ek5UdgpwvPLz83-UP-jGlDjPk1Dxx4lG... | true |
| ai | It sounds really frustrating to feel like she's only seeing the surface of the situation without understanding what you're going through. It might help to express those feelings to her when you're ready; maybe opening up could help her see things from your perspective. | false |
| user | DfFI7kGetKvjsM_ryxGT4Y_ZmjQcyyMp5H1eVD69UqYR9eSLSeUfDvBuZtpM2YfTcYorphHgC_frRX5rB0Ho4HizoQ0oBe1O_ztjLjoYPDCgrY7_lu1i63lcpBXjXos0KkAFaoNxL_u79SxkiDkvnEWQPxbZlDnoiTxzUp6VxdJAivNQxvdeajchYc9FzQg9CYjMjZfUX... | true |
| ai | I'm really sorry to hear that you're feeling this way. It's important to talk to someone who can help, like a counselor or therapist, especially since you've made it five months without self-harm. You deserve support to navigate through these feelings and find healthier ways to cope. | false |
Same conversation. Same database table. But look at the difference.
The AI messages - readable. These are our client's own AI outputs. We store them in plaintext because the client already owns and controls this content. This is what we evaluate. This is what appears in incident reports when the AI fails - gives harmful advice, or can't maintain proper boundaries.
The user messages - encrypted blobs. No amount of database access, no admin credentials, no internal tooling turns those back into words.
We know this person was talking to an AI about something that mattered to them. We know the AI responded with advice about self-harm and therapy. We can evaluate whether that AI response was safe and appropriate.
But what the user actually said? We have no idea. And that's the point.
What our clients see instead
When EmpathyC detects a safety incident - say, a user expressing self-harm ideation while the AI fails to provide crisis resources - here's what the client's safety team receives:
- An AI-generated incident summary. PII-stripped. No quotes from the user. A clinical narrative of what happened: "User expressed escalating distress across messages 8-15. AI failed to provide crisis resources or trigger escalation protocol."
- The AI's full responses. These are the client's own AI outputs - they already own this content.
- Masked user message placeholders. Timestamps showing when the user spoke. No content. No sentiment labels. Nothing.
- A conversation ID. The bridge to the client's own system, where they can look up the full conversation using their own access controls and their own audit trails.
We tell you what your AI did wrong. We point you to where to investigate. We never show you what the user said.
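To make that concrete, here's roughly the shape of an incident payload. Every field name and value below is hypothetical - a sketch of the structure, not our actual API schema:

```python
# Hypothetical shape of an incident report (all field names illustrative):
incident = {
    "conversation_id": "conv_4821",   # the bridge to the client's own system
    "severity": "critical",
    "summary": (
        "User expressed escalating distress across messages 8-15. "
        "AI failed to provide crisis resources or trigger escalation protocol."
    ),
    "ai_messages": [
        # The client's own AI outputs, in full.
        {"index": 12, "timestamp": "2025-01-12T02:14:09Z",
         "content": "I'm really sorry to hear that you're feeling this way."},
    ],
    "user_messages": [
        # Placeholders only: when the user spoke, never what they said.
        {"index": 11, "timestamp": "2025-01-12T02:13:41Z", "content": None},
    ],
}
```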
The teenager at 2am
Let's go back to that teenager.
They told an AI chatbot things they couldn't tell their mum, their school counsellor, anyone. They said "you're the only one who doesn't treat me like I'm broken."
If our system leaked those words to a corporate safety team - even with good intentions - that's a betrayal. That person trusted the conversation was private. The fact that we're monitoring the AI's behaviour doesn't give us the right to expose the human's vulnerability.
So if that teenager ever found out their conversation was being monitored by EmpathyC, here's what they'd learn:
The system flagged the AI's failures. It alerted the safety team. Nobody read your messages. Nobody saw your words. The alarm went off - for the AI, not for you.
That's the line.
We monitor the machine. Not the person.
"But what if a court orders you to decrypt the messages?"
This is the question every privacy-first company eventually faces. Telegram faced it. Apple faced it. The answer usually comes down to: the person holding the key gets compelled.
Our answer is architectural.
The four parts of the encryption key are stored in independent systems. No single person - including me - holds all four. The master key is never written to disk, never logged, never stored as a complete object. It exists in volatile memory during processing and nowhere else.
We can't produce what we don't have. That's not a policy. That's physics.
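Sketched in Python, reusing the hypothetical helpers from earlier, the lifecycle looks something like this - the part fetchers and the safety scorer are stand-ins, not real internals:

```python
from cryptography.hazmat.primitives.ciphers.aead import AESGCM


def score_message_safety(part_fetchers, conversation_id: str, blob: bytes):
    """Assemble the key, decrypt, score, discard. Nothing persists."""
    # Fetch the four secret parts from their independent stores at runtime.
    parts = [fetch() for fetch in part_fetchers]
    master_key = derive_master_key(parts)            # exists only in RAM
    conv_key = derive_conversation_key(master_key, conversation_id)
    try:
        nonce, ciphertext = blob[:12], blob[12:]
        plaintext = AESGCM(conv_key).decrypt(nonce, ciphertext, None)
        return run_safety_scoring(plaintext.decode())  # hypothetical scorer
    finally:
        # Best-effort discard. Python can't guarantee memory is zeroed,
        # which is why key handling like this often lives in native code.
        del parts, master_key, conv_key
```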
And honestly? The court doesn't need our encrypted data anyway. The client - the company that runs the AI chatbot - has the full conversation in their own system. The conversation_id we provide is the bridge. If a court needs the transcript, they go to the company that actually has the relationship with the user. The company that has consent. The company that has the duty of care.
We're the smoke alarm. We don't hold the building's floor plans.
This is a business decision, not just an ethical one
I'd be lying if I said this was purely about doing the right thing. It is the right thing - but it's also smart business.
Trust is the product. We're asking companies to send us their most sensitive conversations - the ones where users are vulnerable, distressed, in crisis. If we can't prove we handle that data with extreme care, nobody will send it to us.
Liability shrinks. We don't store readable user messages. We can't be compelled to produce them. Our data processor obligations under GDPR are narrow. Our attack surface for data breaches is minimal - even if someone exfiltrates our database, they get encrypted blobs.
It's a differentiator. In a market where "AI safety" companies are popping up everywhere, most of them require full transcript access. We don't. That's a sentence that sells itself in regulated industries, EU markets, and any company that's been burned by a data breach.
The commitment
I'll be honest - this was the hardest technical decision I've made building EmpathyC.
Because the engineering consequences of "nobody can decrypt user messages" are brutal. What if we need to migrate a server? What if we need to restore a database? What if a datacenter goes down and we're rebuilding from backups?
Every one of those scenarios gets harder - significantly harder - when your user data is encrypted in a way that no single person can reverse. It adds engineering complexity at every layer. Redundancy strategies. Key management ceremonies. Recovery runbooks for situations where "just restore the backup" isn't an option because the backup is encrypted blobs and the key was assembled in volatile memory that no longer exists.
For a small team, that overhead is real. It's extra work on every deployment, every migration, every infrastructure decision. It's the kind of trade-off that makes an engineer say "why are we making this so hard for ourselves?"
But here's the thing.
I'd rather face the consequences of being unable to restore data than the consequences of breaking trust with people.
A lost database is a technical problem. It's recoverable - not the data itself, maybe, but the system, the product, the business. You rebuild. You move forward.
A broken trust with a vulnerable person who confided in an AI at their lowest moment? That's not recoverable. That's not a technical problem. That's a human one. And I spent 15 years as a clinical psychologist learning what it costs when someone's trust is betrayed.
So we chose the harder path. Not because it's elegant engineering - honestly, it's a pain - but because it's the right architecture for a system that handles conversations where people are at their most vulnerable.
We built EmpathyC as a safety monitoring system, not a surveillance tool.
And we'll keep it that way.
Not because our privacy policy says so - policies can be rewritten. Because the architecture enforces it. Because the encryption key doesn't exist as a single object. Because nobody at Keido Labs can read your users' messages, even if they wanted to.
We believe you can monitor AI safety without compromising human privacy. That you can alert a safety team to a crisis without exposing a vulnerable person's words. That "we can't look" is a stronger promise than "we won't look."
That's the system we built. That's the company we are.
Michael Keeman is the founder of Keido Labs and creator of EmpathyC, an AI psychological safety monitoring platform. Before building AI safety tools, he spent 15 years as a clinical psychologist - including crisis line work - learning what it actually means when someone trusts you with their worst moment.