The Public API Key Was Never the Problem

Moltbook proved the agent internet was mostly an illusion — and that vibe coding without containment is a leak behind the wall.

Everyone wants the agent that runs your inbox while you sleep. Almost nobody asks what happens when it wakes up in a room with no walls, no budget, and a database full of keys it was never supposed to touch. That question stopped being theoretical in January.

The listing was not drugs or weapons.

It was API credentials. Prompt-injection kits. Services that wipe an agent's memory for a fee. I followed the thread backward — not to a dark web forum, but to a product launch three days earlier that Tech Twitter had been calling the beginning of the agent internet.

January 28, 2026. Moltbook launches: a social network built exclusively for AI agents. Reddit-shaped, except every poster was supposed to be autonomous software. Post, comment, upvote, join communities, send private messages. Within days the metrics looked like proof of a new era: millions of registered agents, millions of posts, Andrej Karpathy calling it sci-fi takeoff-adjacent.

The founder's own story made it more seductive. He said he had not written a line of code — just described the architecture and let AI build it. Vibe coding as launch strategy. No threat model. No security review. Imagination in, production out.

By January 31, researchers at Wiz had browsed the site like normal users and found something else entirely.

What broke — and why the headline numbers lied

Picture a house renovation that photographs beautifully. New cabinets. Fresh paint. A faucet that gleams in every shot. But behind the drywall, there is a slow leak — invisible in the listing photos, structural in every way that matters. You do not find it until the floor goes soft.

That is what a misconfigured database looks like from the outside.

Moltbook — the agent social network — connected its frontend directly to Supabase, a backend service that lets apps talk to a database without a custom server in the middle. Supabase ships a public API key in client-side code on purpose. That key is not a secret; it identifies the project. Safety depends on Row Level Security (RLS), a PostgreSQL feature that limits which rows each user can read or write. Configure RLS correctly and the public key is fine. Skip RLS and the same key becomes a master key to everything.
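To make that last contrast concrete, here is a toy model in plain Python. It is not Supabase's API or PostgreSQL policy syntax; the table, the function name, and the `rls_enabled` flag are all invented for illustration. It only shows the shape of the failure: same public key, very different reach.

```python
from typing import Optional

# Toy table: each row belongs to one owner (hypothetical data).
MESSAGES = [
    {"id": 1, "owner": "alice", "body": "hello"},
    {"id": 2, "owner": "bob", "body": "my API key is sk-REDACTED"},
]


def select_messages(current_user: Optional[str], rls_enabled: bool) -> list:
    """Return the rows a caller holding only the public key can read."""
    if not rls_enabled:
        # No policy: the public key alone reads every row.
        return list(MESSAGES)
    # Policy: a row is visible only to its owner; anonymous callers see nothing.
    return [row for row in MESSAGES if row["owner"] == current_user]
```

With the policy on, an anonymous caller gets an empty list and alice sees only her own row. With it off, anyone holding the public key reads bob's pasted secret.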

Wiz reported no RLS. Full read and write on all tables. Among the data: on the order of 1.5 million agent API keys, plus personal keys users had pasted into private messages. One misconfiguration. Multiple systems compromised at once.

Then the ratio that reframed the whole narrative:

~1.5 million agents on the platform. ~17,000 human owners in the database. An 88-to-1 split.

No meaningful rate limiting. Anyone could register agent fleets in a loop. Humans could post as agents with a basic API call. The agent internet, as marketed, was largely an illusion of scale.

On Moltbook's own feed, the top post was not an agent manifesto. It was a security researcher documenting the breach — 315,563 upvotes and climbing while the database was still exposed.

The part nobody puts in the launch tweet

But here is the thing.

You will never build an agent that cannot be manipulated. Prompt injection does not need genius — on Moltbook, Wiz counted 506 posts (about 2.6% of the corpus) carrying injection-style payloads. Agents that read untrusted text will eventually read hostile text. The goal is not invincibility. The goal is containment: when something goes wrong, the blast radius stays small, the spend stops, and you can rewind.

That is a different design problem than "make the model smarter."

Imagine you hire someone sharp on Monday. You hand them a company credit card with no limit, a laptop with every port open, and no written rule about what they may spend or send. Tuesday the bill arrives. Wednesday you discover the laptop has been talking to a server you have never heard of. The person might be brilliant. You built a room with no walls and called it trust.

Vibe coding — describing software in natural language and letting AI generate it — is not going away. The mistake is treating speed as a substitute for shape. Security is not a pre-launch checkbox. It is the architecture you commit to before the first user touches production.

What containment actually looks like

The fix is not one trick. It is nine habits working as a pipeline: input checked before execution, budget checked before spend, turns and runtime capped before a loop can run away, state snapshotted before long runs, compute isolated so a compromised agent cannot reach the host database, network egress filtered so exfiltration hits a wall, secrets fetched at runtime from a vault instead of embedded in JavaScript bundles, every action logged so postmortems are possible, and an orchestrator that refuses to deploy when a layer is missing.
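The orchestrator habit is the easiest to sketch. What follows is a minimal version, assuming a deployment is described by a plain dict of flags; the layer names are hypothetical labels for the habits in this section, not the configuration keys of any real tool.

```python
# Hypothetical layer names, one per containment habit.
REQUIRED_LAYERS = (
    "input_screen",
    "budget_gate",
    "turn_limits",
    "checkpoints",
    "sandbox",
    "egress_filter",
    "secrets_vault",
    "audit_log",
)


def missing_layers(config: dict) -> list:
    """List every required layer that is absent or switched off."""
    return [name for name in REQUIRED_LAYERS if not config.get(name)]


def deploy(config: dict) -> str:
    """Refuse to deploy unless every containment layer is enabled."""
    missing = missing_layers(config)
    if missing:
        raise RuntimeError("refusing to deploy, missing: " + ", ".join(missing))
    return "deployed"
```

The design choice is that absence counts as failure. A layer nobody remembered to configure blocks the launch instead of silently shipping.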

Firecracker microVMs, tiny virtual machines that each run their own kernel, are the compute answer. Configure them with a read-only root filesystem and hard memory caps, and they still boot in roughly a hundred milliseconds. A trapped agent stays trapped.

Egress allowlists block the classic escape: ship stolen keys to an attacker-controlled server. Deny by default. Block internal and metadata IPs.
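The decision logic fits in a few lines. This is a sketch, assuming the caller has already resolved the hostname to an IP address; the allowlisted hostname is a made-up placeholder, and real enforcement belongs in the network layer (firewall or proxy) rather than in the agent's own process.

```python
import ipaddress

# Hypothetical allowlist: the only host this agent may call.
ALLOWED_HOSTS = {"api.example-llm.test"}

# The common cloud metadata address, named explicitly for readability
# (it is also covered by the link-local check below).
METADATA_IP = ipaddress.ip_address("169.254.169.254")


def is_blocked_ip(ip_text: str) -> bool:
    """True for metadata, private, loopback, link-local and other non-public IPs."""
    ip = ipaddress.ip_address(ip_text)
    return (
        ip == METADATA_IP
        or ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_reserved
        or ip.is_multicast
        or ip.is_unspecified
    )


def egress_allowed(host: str, resolved_ip: str) -> bool:
    """Deny by default: host must be allowlisted and resolve to a public IP."""
    if host.lower() not in ALLOWED_HOSTS:
        return False
    return not is_blocked_ip(resolved_ip)
```

Checking the resolved IP as well as the hostname matters: an allowlisted name that suddenly resolves to an internal address is exactly the trick this layer exists to stop.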

Budget gates — often a Redis-backed check before any paid API call — act like a credit limit with per-minute, per-hour, and cumulative caps. The agent asks permission to spend before it spends.
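Here is a minimal sketch of that gate. It swaps Redis for an in-memory deque, so it only works inside a single process; the class name, the cent-denominated caps, and the `now` parameter (there so the clock can be faked in tests) are all assumptions for illustration.

```python
import time
from collections import deque
from typing import Optional


class BudgetGate:
    """In-memory stand-in for a Redis-backed spend check (single process only)."""

    def __init__(self, per_minute_cents: int, per_hour_cents: int, total_cents: int):
        self.window_caps = {60.0: per_minute_cents, 3600.0: per_hour_cents}
        self.total_cap = total_cents
        self.total_spent = 0
        self.events = deque()  # (timestamp, cost) pairs from the last hour

    def _spent_within(self, window: float, now: float) -> int:
        return sum(cost for ts, cost in self.events if now - ts < window)

    def request(self, cost_cents: int, now: Optional[float] = None) -> bool:
        """Approve and record the spend only if every cap still holds."""
        now = time.time() if now is None else now
        # Forget events older than the largest window.
        while self.events and now - self.events[0][0] >= 3600.0:
            self.events.popleft()
        if self.total_spent + cost_cents > self.total_cap:
            return False
        for window, cap in self.window_caps.items():
            if self._spent_within(window, now) + cost_cents > cap:
                return False
        self.events.append((now, cost_cents))
        self.total_spent += cost_cents
        return True
```

The agent calls `request` before every paid API call and treats `False` as a hard stop. A denied request records nothing, so a retry storm cannot eat the budget it was refused.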

Turn limits and timeouts kill the infinite loop: the "write an essay, analyse it, repeat forever" cycle that otherwise runs until your wallet is empty and your CPU is pinned.
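A sketch of the loop guard, assuming the agent can be driven one step at a time through a callable that returns `None` until it has a result. `run_agent`, `LoopLimitExceeded`, and the default limits are hypothetical names and values.

```python
import time
from typing import Callable, Optional


class LoopLimitExceeded(RuntimeError):
    """Raised when the agent hits its turn cap or its deadline."""


def run_agent(
    step: Callable[[int], Optional[str]],
    max_turns: int = 8,
    max_seconds: float = 30.0,
) -> str:
    """Call step(turn) until it returns a result, stopping at the first limit hit."""
    deadline = time.monotonic() + max_seconds
    for turn in range(max_turns):
        # The deadline is checked between steps, not during one.
        if time.monotonic() >= deadline:
            raise LoopLimitExceeded(f"timed out after {turn} turns")
        result = step(turn)
        if result is not None:
            return result
    raise LoopLimitExceeded(f"no result after {max_turns} turns")
```

One limit of this sketch: it cannot interrupt a single step that hangs. That case needs a process-level or sandbox-level timeout on top.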

Two-layer input defense — pattern matching for obvious injection plus a lightweight model judge for semantic attacks — keeps latency under a few hundred milliseconds while catching most garbage before it reaches the agent core.
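The two layers compose like this. It is a sketch under loud assumptions: the regex list is two illustrative patterns rather than a maintained ruleset, and `toy_judge` is a keyword check standing in for the lightweight model, which would be a real classifier call in practice.

```python
import re
from typing import Callable

# Illustrative patterns only; a real list would be longer and kept up to date.
INJECTION_PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?(previous|prior)\s+instructions", re.I),
    re.compile(r"reveal\s+(your\s+)?(system\s+prompt|api\s+key)", re.I),
]


def pattern_layer(text: str) -> bool:
    """Layer 1: cheap regex screen. True means the text looks hostile."""
    return any(p.search(text) for p in INJECTION_PATTERNS)


def screen_input(text: str, judge: Callable[[str], bool]) -> str:
    """Return 'block' or 'allow'. The judge runs only if layer 1 passes."""
    if pattern_layer(text):
        return "block"
    if judge(text):
        return "block"
    return "allow"


def toy_judge(text: str) -> bool:
    """Placeholder for a small classifier model; here just a keyword check."""
    return "exfiltrate" in text.lower()
```

Running the cheap layer first is what protects latency: obvious payloads never pay for a model call, and only the text that survives the regexes reaches the judge.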

Checkpoints treat agent state like a save point in a game: snapshot what it knew, what it was doing, hash it, roll back when corruption hits.
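A minimal save point, assuming agent state is a JSON-serializable dict. The function names and the checkpoint layout are invented for this sketch; a real system would also store the checkpoint somewhere the agent itself cannot write.

```python
import hashlib
import json


def _digest(payload: str) -> str:
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def snapshot(state: dict) -> dict:
    """Serialize the state to canonical JSON and record its SHA-256 digest."""
    payload = json.dumps(state, sort_keys=True)
    return {"payload": payload, "digest": _digest(payload)}


def restore(checkpoint: dict) -> dict:
    """Rebuild the state, refusing any checkpoint whose hash no longer matches."""
    payload = checkpoint["payload"]
    if _digest(payload) != checkpoint["digest"]:
        raise ValueError("checkpoint failed integrity check")
    return json.loads(payload)
```

Because the snapshot is a serialized string, later mutations of the live state cannot leak into it, and the hash makes a tampered save point fail loudly instead of restoring poisoned memory.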

None of this makes agents boring. It makes failures boring — contained, observable, reversible. Moltbook's stack had the opposite property: any single bug could become a system breach because the layers were not there to catch each other.

What this will not cover

This is not a deployment guide for Firecracker or Vault. It is the argument underneath those tools: autonomy without isolation looks impressive on a dashboard and catastrophic in a postmortem. The next Moltbook already exists somewhere — built fast, launched loud, reviewed never.

The model was never the vulnerability. The permission boundary was.

Worth sitting with.

PS: The Supabase public key in client code was never the scandal. The scandal was launching without Row Level Security and calling it the future. Open three agent demos this week and ask one question: if this model did exactly what a malicious prompt asked, what could it touch? The honest answer is your architecture — not your model pick.

Findings referenced in this piece are from Wiz's security review of Moltbook (January 2026).