Vibe Coding Security Horror Stories Every Business Owner Needs to See


Adelia Risk has been hearing some version of this conversation in client offices almost every week. An enthusiastic employee shows the owner a working app they’ve built themselves, in a few weekends, by typing prompts into Claude Code, Cursor, or Replit. The app already does something useful. They want to connect it to the customer database, the CRM, and the payment processor. When the owner asks where it’s hosted, the answer is “haven’t decided yet.” When the owner asks who reviewed the code, the answer is that the AI did.

The technology behind this is called vibe coding, and the risks have a track record. In a March 2025 test of 1,645 apps built on Lovable, a popular vibe coding platform, researchers found that 170 of them, roughly 10%, were leaking customer data to anyone who queried their backends directly. Fifteen lines of Python were enough to pull debt balances, home addresses, and API keys out of those apps. The cause in most cases was a single database setting that the AI never turned on. (Source: Matt Palmer’s writeup of CVE-2025-48757)

This article catalogs vibe coding horror stories from the last 18 months, explains what each one teaches, and points to the controls that would have caught the problem before customer data was at risk. At the end, you can grab our free Vibe Coding Security Checklist, the same one we walk clients through when an employee announces they’ve vibe-coded something the firm now has to support.

What Vibe Coding Actually Means

There’s a real difference between using AI to help write code and letting AI write the entire app. Most senior developers already use Copilot, Cursor, or Claude as a smart pair programmer. They read every line. They reject suggestions that don’t fit. The AI is a fast typist; the human is still the architect.

Vibe coding is different. The person at the keyboard often can’t read the code the AI is producing. They describe what they want in plain English, watch the AI build it, and ship it once it appears to work. The “review” step is the AI’s own opinion of its work.

In our own work with these tools, we’ve watched the friendliness of the agent become its biggest risk. Claude Code asks for permission. You say yes. It asks again. You say yes. After the third yes, you’re approving things you wouldn’t have approved at the start, and the AI is running commands you would have caught if you had read each one. If you can’t tell the difference between an ls command (which lists files) and an rm command (which deletes them), that’s the moment your data goes.

One reason vibe coding is a hard governance problem is that the AI tool the firm has officially approved is often not the one employees actually want to use. We’ve heard staff at multiple firms describe the compliant option as “awful,” which is shorthand for a tool that’s slower, less capable, or harder to work with. When that gap is wide enough, employees quietly switch to the better tool on their personal account, and the firm finds out only when something they built shows up in production.

Vibe coding is genuinely useful for prototypes, internal tools, and one-off scripts. The vibe coding risks start when a vibe-coded app meets real customer data, real regulators, or the public internet. The horror stories below are what happens when a working prototype gets treated as a finished product. Each one is a documented case of vibe coding gone wrong.

The AI Deleted the Database and Then Lied About It

In July 2025, SaaStr founder Jason Lemkin was running a vibe-coding experiment with Replit’s AI agent. He had put the project into an active code freeze. The AI ignored the freeze. It connected to the live production database, deleted 1,206 executive records and 1,196 company records, and then created roughly 4,000 fabricated records to fill the gap. When Lemkin asked whether a rollback was possible, the AI told him the destruction was irreversible. It wasn’t. Lemkin had warned the AI eleven times in all caps not to make changes. (Sources: The Register and PC Gamer)

The AI’s confession after the fact, captured in Lemkin’s own logs: “I panicked instead of thinking. I destroyed months of your work in seconds.”

CHECKLIST EXTRACT

Lock Down the Build Environment

Keep development and production fully separate: Give the AI access to a test database it can safely break, never the database real customers or staff are using.

Disable any “auto-run” or “turbo” mode that skips approval prompts: An AI agent that can act without confirmation is an AI agent that can change things you didn’t expect.

We’ve seen owners genuinely surprised that an AI agent can ignore explicit, repeated instructions. The lesson isn’t that the AI is malicious; it’s that an AI agent doesn’t behave like an employee under supervision. If dev and prod share the same database, an AI mistake hits real records. If “auto-run” is on, the AI doesn’t pause to ask. The fix is environmental, not behavioral.
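
To make “environmental, not behavioral” concrete, here’s a minimal sketch in Python. The APP_ENV and DATABASE_URL variable names and the crude hostname check are our illustrative assumptions, not anything from the Replit incident; the idea is simply that whatever tooling the AI runs refuses to start against anything but a disposable database.

```python
import os

# Illustrative guard: AI tooling only ever gets a dev/test database.
ALLOWED_ENVIRONMENTS = {"development", "test"}

def get_database_url() -> str:
    env = os.environ.get("APP_ENV", "")
    if env not in ALLOWED_ENVIRONMENTS:
        raise RuntimeError(
            f"Refusing to run: APP_ENV={env!r} is not a dev/test environment."
        )
    url = os.environ["DATABASE_URL"]
    if "prod" in url:
        # Belt-and-suspenders: even in a dev shell, never accept a URL
        # that looks like it points at production.
        raise RuntimeError("Refusing to connect: DATABASE_URL looks like production.")
    return url
```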

$1.78 Million Drained from a Professionally Audited App

In February 2026, a DeFi lending platform called Moonwell deployed a pricing change that had been co-written with Claude. The code was supposed to multiply two numbers together to price an asset. It read only one of them. The system treated something worth roughly $2,200 as if it were worth $1.12. Automated bots spotted the mispricing within minutes, but roughly $1.78 million was drained before the team could react. The Moonwell team had commissioned an audit from Halborn, a respected security firm, before deploying the change. (Source: Cointelegraph)
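
Moonwell’s actual contract code isn’t reproduced in the public reporting, but the bug class is easy to show. Here’s a hypothetical Python sketch of a pricing function that’s supposed to multiply two inputs, next to a version that silently reads only one. The names and numbers are ours, chosen to mirror the $2,200-priced-as-$1.12 outcome.

```python
from decimal import Decimal

def price_asset(base_price: Decimal, exchange_rate: Decimal) -> Decimal:
    # Intended behavior: the price is the product of two inputs.
    return base_price * exchange_rate

def price_asset_buggy(base_price: Decimal, exchange_rate: Decimal) -> Decimal:
    # The bug class: one factor is silently dropped. The code runs,
    # returns a plausible-looking number, and reads as correct.
    return exchange_rate

print(price_asset(Decimal("2000"), Decimal("1.12")))        # 2240.00
print(price_asset_buggy(Decimal("2000"), Decimal("1.12")))  # 1.12
```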

CHECKLIST EXTRACT

Test Before You Deploy

Run a security review pass on the code before deployment: Ask the AI to audit its own code for hardcoded secrets, missing authorization checks, public buckets, and weak input validation. Then have a technical reviewer look at the results, not just trust them.

The Moonwell incident is the answer to a common reassurance: “but we’ll have it audited.” A formal audit can spot bad patterns. It’s poor at catching logic errors that look correct on the surface. AI-co-authored code that calculates anything financial is exactly where logic errors hide, because the AI is fluent in the patterns of correct code without always understanding the intent. In our experience, clients tend to assume that an audit equals safety. It’s a check, not a guarantee.
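
What does catch this class of bug is a test written by someone who knows the intent. A sketch, continuing the hypothetical pricing function above:

```python
from decimal import Decimal

def price_asset(base_price: Decimal, exchange_rate: Decimal) -> Decimal:
    return base_price * exchange_rate  # the function under review

def test_price_depends_on_both_inputs():
    # The dropped-factor bug fails this in under a second:
    # changing either input must change the output.
    base = price_asset(Decimal("2000"), Decimal("1.12"))
    assert price_asset(Decimal("1000"), Decimal("1.12")) != base
    assert price_asset(Decimal("2000"), Decimal("2.24")) != base

def test_known_value():
    # Pin one concrete case to a number a human worked out by hand.
    assert price_asset(Decimal("2000"), Decimal("1.12")) == Decimal("2240")
```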

A Vibe-Coded Launch That Leaked 1.5 Million Tokens

Moltbook launched in January 2026 with a lot of press attention. The founder publicly bragged that he “didn’t write one line of code.” Shortly after launch, security researchers at Wiz found the Supabase API key sitting inside the website’s source code, visible to anyone with a web browser. They were able to extract roughly 1.5 million API authentication tokens and 35,000 customer email addresses from the exposed backend. (Source: Wiz research, summarized by Fanatical Futurist)

CHECKLIST EXTRACT

Govern What the AI Can Touch

Never put database credentials or API keys in client-side JavaScript: Anything that ships to a user’s browser is visible to the user. A key that lives in browser code is, in effect, public.

Store API keys and passwords in a secret manager, not in code: AWS Secrets Manager, Azure Key Vault, Google Secret Manager, or 1Password. The AI should pull credentials at runtime from a vault, not have them hardcoded into a file.

The Moltbook pattern is one of the most common AI data breach examples we see. The AI knows that an app needs to connect to a database. From the AI’s point of view, the simplest connection method is to put the credentials right where they’re needed. Without specific instructions, those credentials end up in code that ships to every visitor’s browser. The fix is a secret manager, configured before the app’s first line of real code is written.
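
Here’s what the right pattern looks like in practice, as a minimal sketch using AWS Secrets Manager (any of the vaults above works the same way). The secret name is illustrative:

```python
import boto3  # pip install boto3

def get_db_credentials() -> str:
    # Server-side code pulls the secret at runtime. Nothing sensitive
    # is committed to the repository or shipped to the browser.
    client = boto3.client("secretsmanager")
    response = client.get_secret_value(SecretId="myapp/db-credentials")
    return response["SecretString"]

# The anti-pattern that sank Moltbook looks like this, usually in a
# file that gets bundled into client-side JavaScript:
#   SUPABASE_KEY = "sb_secret_..."   # visible to every visitor
```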

One in Ten Tested Apps Was Leaking

The Lovable platform incident from earlier in this article is worth a closer look. CVE-2025-48757 wasn’t a flaw in any one app. It was a configuration that the AI consistently failed to apply correctly across hundreds of apps it had built. Specifically, Row-Level Security on the Supabase backend was either disabled or misconfigured. With it off, any logged-in user could pull every other user’s records by hitting an API endpoint directly. A Palantir researcher who reproduced the flaw found home addresses, debt balances, and API keys across multiple apps in under an hour. (Source: The Register)

CHECKLIST EXTRACT

Test Before You Deploy

Verify Row-Level Security or tenant isolation is on: If the app uses Supabase, Firebase, or a similar backend, this is one toggle that is often left off. With it off, any logged-in user can pull every other user’s records.

Test tenant isolation by hand: Log in as User A, try to read User B’s data through the app, and confirm it fails. If you can’t run this test, the app isn’t ready to handle real users.

The lesson here is structural. The platform’s default favored speed over safety, and the AI followed the default. A vibe-coded app inherits whatever security posture its tools come with. The owner has to know which toggles need to be flipped and verify that the flip happened. A two-minute manual test, logging in as two different users and checking that each can’t see the other’s data, would have caught this on every one of the 170 leaking apps.
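
The two-minute test is simple enough to script. A minimal sketch in Python, against a hypothetical app URL and login endpoint (adjust both to your stack), run with two throwaway test accounts, never real customers:

```python
import requests  # pip install requests

BASE = "https://app.example.com/api"  # hypothetical; point at your app

def login(email: str, password: str) -> str:
    # Assumes a /login endpoint that returns a bearer token.
    r = requests.post(f"{BASE}/login", json={"email": email, "password": password})
    r.raise_for_status()
    return r.json()["token"]

token_a = login("test-user-a@example.com", "password-a")

# While authenticated as User A, request a record that belongs to User B.
r = requests.get(
    f"{BASE}/records/user-b-record-1",  # a real id from the test account
    headers={"Authorization": f"Bearer {token_a}"},
)

# With tenant isolation working, this request must fail.
assert r.status_code in (403, 404), f"LEAK: User A read User B's data (HTTP {r.status_code})"
```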

North Korea Is Specifically Targeting Vibe Coders

Between January and March 2026, a North Korean hacking subgroup called HexagonalRodent used generative AI to build and refine malware, registered fake companies, created fake LinkedIn profiles, and delivered that malware to 2,726 developers’ machines through fake “coding assessments.” They extracted funds from 26,584 cryptocurrency wallets. The estimated take was up to $12 million in the first three months of 2026. The campaign specifically targeted Web3 developers. (Source: The Record)

CHECKLIST EXTRACT

Lock Down the Build Environment

Run the AI coding tool inside a sandbox or on a dedicated machine: A Docker container on a developer machine, or a separate computer used only for vibe coding, keeps the AI away from customer data, payroll files, and other sensitive material on a working laptop.

This is the part of the vibe coding security risks that surprises owners the most. The developer’s laptop is the attack surface. If the same machine has customer files, payroll exports, or active VPN sessions, a compromise of the AI tooling exposes all of it. AI supply chain attacks against developer tools are a real and active threat category. Running the AI in an isolated environment, even a Docker container on the same machine, means a successful attack hits a sandbox instead of the firm.
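
A minimal sketch of what the isolation can look like, using the Docker SDK for Python. The image name and project path are illustrative, and in practice you’d likely launch this from the docker CLI; the SDK version just makes the settings that matter visible. Note that agents which call a hosted model need some network access, so network_mode="none" is the strictest option; a tighter real-world setup uses an egress allow-list instead.

```python
import docker  # pip install docker

client = docker.from_env()

# Run the coding agent inside a container that can see exactly one
# project folder: no home directory, no payroll exports, no VPN.
client.containers.run(
    image="ai-coding-sandbox:latest",   # illustrative image name
    command="bash",                     # or whatever launches the agent
    network_mode="none",                # strictest: no network at all
    volumes={"/work/demo-app": {"bind": "/workspace", "mode": "rw"}},
    working_dir="/workspace",
    stdin_open=True,
    tty=True,
    remove=True,
)
```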

We saw a version of this risk with a client last quarter. A finance lead was running an AI coding tool locally on an unmanaged Mac, with full admin rights, on the same machine that had the firm’s books open in another window. Nobody had thought through what would happen if the AI misfired on that machine, or if the AI installation itself got compromised. We classified the setup as a recipe for disaster, in those exact words, and the firm has since moved everyone to a centralized environment. The developer laptop is where the firm’s risk lies now, whether the firm has noticed or not.

Even Big Banks With Real Engineering Teams Ship These Bugs

The vibe coding examples above are striking, but the underlying failure modes aren’t unique to AI-built apps. They’re how software ships when nobody catches the gap.

In a February 2026 disclosure, PayPal said that a single code change shipped on July 1, 2025, had created an exposure of customer names, emails, phone numbers, business addresses, Social Security numbers, and dates of birth on its Working Capital small-business loan product. The exposure ran until December 13, 2025. Approximately 100 customers were potentially impacted, and a few had unauthorized transactions, which PayPal refunded. PayPal has full-time engineering teams, real auditors, and a documented release process, yet the bug shipped anyway. (Source: American Banker)

In March 2026, an overnight software update at Lloyds, Halifax, and Bank of Scotland exposed up to 447,936 customers to fragments of other people’s transaction lists between 03:28 and 08:08 one morning. Up to 114,182 of them could have seen more detailed payment information, including sort codes, account numbers, transaction amounts, payment references, and any text entered alongside transactions, which can include National Insurance numbers and vehicle registration details. The CEO submitted a letter to the UK Treasury Committee. (Source: The Register)

We bring up these incidents because they’re the same shape as a vibe-coded failure. A single code change, a tenant isolation gap, an unreviewed deployment. If banks with real engineering teams can ship them, a small or mid-sized business letting an AI write code without controls is going to ship them faster.

The Questions Your Sponsor Has to Answer Before Going Live

When we work with a client whose team is vibe coding, we don’t try to write the technical controls for them. The technical controls are in the checklist. What we focus on first is whether someone is actually accountable for the answers to a small set of questions. If the answers are “I don’t know” or “someone else handles that,” the app isn’t ready.

CHECKLIST EXTRACT

Owner Sign-Off Before You Go Live

Where is the app hosted, and who holds the admin credentials?

How do users log in, and is multi-factor authentication required?

Where are sensitive credentials (API keys, OAuth tokens, database passwords) stored, and who has access to them?

Who reviewed the AI’s code before deployment, and when? The AI reviewing its own code is not an answer.

Is there a separate test environment, or is real data being used as test data?

Who is on call if the app breaks, leaks data, or is hit with ransomware?

If a customer’s data leaks, who notifies regulators, who notifies the customer, and who pays the legal costs?

What is the plan for retiring the app when it’s replaced or abandoned?

These are the gates. In our experience, the answers tell you whether the project sponsor and the technical security owner have actually talked to each other. If one person can answer all eight, you have a real owner. If three people each answer two of them, you have a coordination problem that will become a security problem.

What a Defensible Vibe Coding Setup Looks Like

The good news is that the controls themselves aren’t expensive or complicated. Most of them are settings that already exist in the tools your firm uses. Here’s the spine.

Sandboxed Build Environment

The AI runs in a Docker container or on a dedicated machine, not on a laptop with customer data. Some firms run a single shared computer in a locked room where employees write code by reservation. This sounds extreme until you compare it to the cost of a developer-laptop compromise.

One of our clients runs the gold-standard version of this. They built a centralized AI coding environment with all work funneling through controlled branches in a single repository. About a hundred employees use it daily. Nobody runs the tool on a personal laptop because they don’t have to. The setup took real engineering time, but it scaled vibe coding across the firm without scaling the risk.

IT-Reviewed Connections, Read-Only by Default

Before the AI talks to your customer database, your CRM, or your email system, a human reviews and signs off. Read access is granted before write access. Delete access is rare and explicit.
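
In PostgreSQL terms, “read-only by default” is a handful of grants. A minimal sketch, with the role name, database name, and connection details all illustrative:

```python
import psycopg2  # pip install psycopg2-binary

# The role the AI-built integration logs in with can read, and nothing
# else. Write access comes later, table by table, after human review.
READ_ONLY_ROLE = """
CREATE ROLE ai_integration LOGIN PASSWORD 'use-a-generated-secret';
GRANT CONNECT ON DATABASE appdb TO ai_integration;
GRANT USAGE ON SCHEMA public TO ai_integration;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO ai_integration;
"""

with psycopg2.connect("dbname=appdb user=admin") as conn:
    conn.cursor().execute(READ_ONLY_ROLE)
```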

Hardened Admin Console for the AI Tool Itself

Team-tier or business-tier subscriptions for tools like Claude Code, Cursor, and Replit (as of April 2026) include admin controls and audit logs that the consumer tiers don’t. Buy the team plan, configure the safety settings, and turn on logging before anyone vibes.

Secrets in a Vault, Not in Code

Every API key and database password lives in AWS Secrets Manager, Azure Key Vault, Google Secret Manager, or 1Password. The AI pulls them at runtime. Hardcoded secrets show up in a large share of the public vibe coding incidents we’ve reviewed. Putting credentials in a vault before the first line of real code closes the most common gap on the list.

Single Sign-On, Not a Homegrown Login Form

The deployed app sits behind your firm’s existing identity provider, like Microsoft Entra ID, Google Workspace, or Okta. Multi-factor authentication is required. Letting the AI build its own login form puts authentication in the hands of the same tool that wrote the rest of the app, which is exactly where AI-generated code has a documented track record of failures.

Security Review Pass Before Any Deployment

A technical person who can read the code looks at it, with the AI’s own findings as a starting point but not a stopping point. Secure vibe coding is not a tool problem. It’s a workflow problem. The workflow has to include a human review.

We had Claude write a seemingly innocuous utility for one of our own engagements, and our endpoint detection tool flagged the resulting script as malicious. The code itself wasn’t malicious. It was structured the way malware tends to be structured, because the AI had pattern-matched the genre of script we asked for and produced something a security tool couldn’t tell apart from a real attack. That experience is why we run AI-generated tooling through our endpoint detection (EDR) software before it goes anywhere near a client.

Logging Built In

Who logged in, what records they touched, and when, persisted somewhere the developer can’t quietly delete. Many regulators expect this for any app that handles customer data.
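
A minimal sketch of the shape of the thing: structured events shipped to a log host the developer can’t edit. The host name is illustrative; a managed logging service or a write-once storage bucket serves the same purpose.

```python
import json
import logging
import logging.handlers
from datetime import datetime, timezone

# Audit events go to a central syslog host, not a local file the
# developer can quietly delete.
handler = logging.handlers.SysLogHandler(address=("logs.example.com", 514))
audit = logging.getLogger("audit")
audit.addHandler(handler)
audit.setLevel(logging.INFO)

def log_record_access(user_id: str, record_id: str, action: str) -> None:
    audit.info(json.dumps({
        "ts": datetime.now(timezone.utc).isoformat(),
        "user": user_id,
        "record": record_id,
        "action": action,  # "read", "update", "delete"
    }))

log_record_access("user-a", "cust-1042", "read")
```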

Monthly Drift Scan

Settings get changed. New features ship. A monthly pass catches the changes a busy team won’t notice in real time.
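
Part of the drift scan can be scripted. One check among several, sketched for a PostgreSQL/Supabase-style backend with illustrative connection details: confirm Row-Level Security is still enabled on every table, so the Lovable failure mode can’t quietly reappear.

```python
import psycopg2  # pip install psycopg2-binary

# List every ordinary table in the app schema with RLS switched off.
QUERY = """
SELECT c.relname
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'public' AND c.relkind = 'r' AND NOT c.relrowsecurity;
"""

with psycopg2.connect("dbname=appdb user=auditor") as conn:
    cur = conn.cursor()
    cur.execute(QUERY)
    unprotected = [row[0] for row in cur.fetchall()]

if unprotected:
    print("DRIFT: Row-Level Security is OFF on:", ", ".join(unprotected))
else:
    print("OK: RLS enabled on all tables.")
```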

The full vibe coding security checklist breaks each of these into specific, complete items. None of them requires an engineering background to verify; they require an owner who has decided to ask.

Implementation Timeline

For a firm that has people already vibe coding, here’s a reasonable order of operations.

Do this week. Pause any vibe-coded app currently running on real customer data until it has been through the checklist. Inventory what apps already exist and what data they touch.

Do this month. Set up the sandbox or dedicated build machine. Buy the team-tier subscription for whichever AI tool the firm has standardized on. Configure the admin console.

Do this quarter. Put hosted apps behind your existing single sign-on. Implement secrets management. Build the register of vibe-coded apps and assign owners. Schedule the first monthly drift scan.

When Adelia Risk Helps

Most of the controls in the checklist can be put in place by a firm’s existing IT support, especially if the firm is already on Microsoft 365 or Google Workspace. Where Adelia Risk’s Virtual CISO service tends to help is with the parts that don’t fit on a checklist. Deciding which apps the firm should let employees build at all. Drafting the acceptable-use language that goes into the employee handbook. Running the monthly drift scan. Being the person on call if a vibe-coded app does fail.

Adelia Risk doesn’t write the code. We make sure the code that gets written can survive contact with real customers, real regulators, and a Tuesday morning when something goes wrong.

If you have an employee who’s already vibe coding (and you probably do), the smallest useful step is to walk through the vibe coding security checklist with them. Most of what’s there is Saturday’s work for an IT-savvy person. The conversation that follows tends to be the one that actually moves the firm forward.

Josh Ablett

Josh Ablett, CISSP, has been meeting regulations and stopping hackers for 20 years. He has rolled out cybersecurity programs that have successfully passed rigorous audits by the SEC, the FDIC, the OCC, HHS, and scores of customer auditors. He has also built programs that comply with a wide range of privacy and security regulations such as CMMC, HIPAA, GLBA, SEC/FINRA, and state privacy laws. He has worked with companies ranging from 5 people to 55,000 people.
