The Vulnerability Window Just Closed

A Roblox cheat script led to the Vercel breach. The compliance provider that certified the breached company faked 99.8% of its SOC 2 reports. An AI model wrote exploits for 27-year-old bugs overnight. The vulnerability window just collapsed — and your institutional context may be the only shield that scales.

By Sean Hsieh
17 min read
Published April 22, 2026


Last Friday, one of the largest hosting platforms on the internet disclosed that attackers had stolen customer data, environment variables, API keys, database credentials, and signing keys. The platform was Vercel — the company behind Next.js, used by OpenAI, Cursor, Pinterest, and millions of developers.

The kill chain is worth reading slowly:

  1. A Context.ai employee downloads Roblox auto-farm scripts on a personal device. Gets infected with Lumma Stealer malware.
  2. Malware harvests the employee’s credentials — Google Workspace, Supabase, Datadog, Authkit.
  3. A Vercel employee had previously signed up for Context.ai’s “AI Office Suite” using their corporate email and granted “Allow All” OAuth permissions to their Google Workspace.
  4. Attackers use the compromised Context.ai OAuth tokens to hijack the Vercel employee’s account.
  5. Lateral movement into Vercel’s internal systems. Every customer environment variable not manually flagged as “sensitive” — stored in plaintext — is now in the attacker’s hands.
  6. ShinyHunters lists the data on BreachForums for $2 million.

One employee. Three companies. Zero zero-days. The attacker didn’t exploit a novel vulnerability in any software. They exploited a boolean default — environment variables in Vercel were “non-sensitive” unless someone manually toggled them — and the fact that one person at one vendor clicked “Allow All” on an OAuth prompt for an AI productivity tool.

A security researcher summarized it as cleanly as anyone could: “The lesson isn’t ‘don’t download Roblox cheats.’ It’s ‘don’t design systems where one compromised employee token gives read access to every credential that wasn’t manually flagged.’”
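That design lesson can be made concrete. Below is a minimal, hypothetical sketch (the `EnvStore` class and its scope string are illustrative inventions, not Vercel's actual implementation) of the inverted default: every value is sensitive unless someone explicitly marks it public, and reading a sensitive value requires a narrowly scoped token rather than whatever a compromised session happens to hold.

```python
from dataclasses import dataclass, field

# Hypothetical sketch: a store where values are sensitive BY DEFAULT.
# This inverts the boolean default the breach exploited, where values
# were plaintext-readable unless someone manually flagged them.

@dataclass
class EnvStore:
    _values: dict = field(default_factory=dict)
    _public: set = field(default_factory=set)

    def set(self, key: str, value: str, public: bool = False) -> None:
        # public=False is the default: you opt OUT of protection,
        # you never opt in to it.
        self._values[key] = value
        if public:
            self._public.add(key)

    def read(self, key: str, caller_scope: str) -> str:
        # Sensitive values require an explicitly narrow scope --
        # a broad "allow all" token is not enough.
        if key not in self._public and caller_scope != "secrets:read":
            raise PermissionError(f"{key} is sensitive; caller scope too broad")
        return self._values[key]

store = EnvStore()
store.set("DATABASE_URL", "postgres://internal/db")   # sensitive by default
store.set("PUBLIC_APP_NAME", "demo", public=True)     # exposure is deliberate
```

The design choice is the whole point: a stolen session token that lacks the narrow scope reads nothing, instead of reading everything that nobody remembered to flag.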

If you run a credit union, you might think this is a Silicon Valley problem. It’s not. Every institution in your network has employees using AI productivity tools. Every one of those tools asked for OAuth permissions. Every one of those permissions is a potential bridge from a compromised vendor into your systems. The question isn’t whether the bridge exists. It’s whether anyone audited it.


The Compliance Layer Was Fake

Here’s where the story gets worse.

Context.ai — the company whose breach led to Vercel being compromised — held a SOC 2 Type II certification. That certification was issued by Delve, a $32 million YC-backed startup that sold “AI-native compliance automation.”

Delve is now accused of fabricating compliance reports at industrial scale. An anonymous whistleblower analyzed 494 SOC 2 reports issued through Delve’s platform. 493 of them — 99.8% — used identical boilerplate text, including the same grammatical errors. Pre-written conclusions existed before clients submitted evidence. The platform auto-generated fake board meeting minutes with placeholders, risk assessments with a default of exactly 10 risks, and fabricated evidence documentation.

Y Combinator removed Delve from its directory and asked the founders to leave the program. The founders called it “a targeted cyberattack” and “a coordinated smear campaign.”

Read the loop: the compliance provider that certified the company whose breach compromised Vercel was itself running fake compliance. The SOC 2 badge that was supposed to signal “this vendor is safe to trust with your data” was auto-generated boilerplate that nobody at Delve had meaningfully reviewed.

I’ve been building and deploying AI agents inside regulated financial services since 2023 — first at Concreit, an SEC-regulated investment platform I founded, and now at Runline, where I build AI infrastructure for credit unions. I sit in vendor evaluation calls with CU executives every week. I understand the regulatory impulse to check boxes — and I also understand that SOC 2 is the primary mechanism credit unions use to evaluate vendor security posture. When a board member asks “is this vendor secure?”, the answer is almost always “they have a SOC 2.” The Delve scandal reveals what that answer is actually worth when the certification itself can be fabricated at scale by a tool designed specifically to generate passing documentation.

This doesn’t mean SOC 2 is worthless. It means SOC 2 is necessary but nowhere near sufficient. If your vendor evaluation process begins and ends with “do they have a SOC 2?”, you’re building your security posture on a foundation you haven’t verified. Ask who conducted the audit. Ask to see the report — not just the badge. Ask whether the auditor evaluated the vendor’s actual controls or reviewed auto-generated documentation.

The credit unions that survive the next five years will be the ones that treat compliance attestations as a starting point for investigation, not a substitute for it.


The AI Escalation: From Bug Reports to Working Exploits

The Vercel breach didn’t require AI. It required a malware-infected laptop and a permissive OAuth scope. That’s the terrifying part — the current threat landscape is already this fragile, and the AI escalation hasn’t fully arrived yet.

But it’s arriving.

Six weeks before the Vercel breach, Anthropic’s Claude Opus 4.6 found 22 vulnerabilities in Firefox over two weeks — 14 high-severity, patched in Firefox 148. It detected a use-after-free bug in the JavaScript engine after just 20 minutes of exploration. The red team then tested whether Claude could go further: not just find bugs, but write exploits. Of roughly 350 attempts costing $4,000 in API credits, Claude produced working exploits in two cases. Both only functioned in a stripped-down test environment.

On April 7, Anthropic announced Claude Mythos Preview — a model built specifically for autonomous vulnerability discovery. The gap between Opus 4.6 and Mythos isn’t incremental. It’s a phase change:

  • 271 Firefox vulnerabilities found and patched in Firefox 150 (vs. 22)
  • 181 working exploits produced (vs. 2)
  • A 27-year-old bug in OpenBSD — an OS that stakes its reputation on security
  • A 17-year-old remote code execution in FreeBSD’s NFS server (CVE-2026-4747) granting full root access
  • Non-security engineers asked Mythos to find RCE vulnerabilities overnight. Working exploits were waiting the next morning.

The UK AI Safety Institute evaluated Mythos independently. It’s the first AI model to complete a 32-step corporate network attack simulation — from reconnaissance through full network takeover — succeeding in 3 of 10 attempts. On expert-level CTF challenges, it succeeds 73% of the time. The AISI noted that the simulations lacked live defenders and endpoint detection — meaning Mythos demonstrated capability against weakly defended systems. That caveat describes most credit union networks today.

Anthropic locked Mythos inside Project Glasswing, a consortium of roughly 40 organizations — AWS, Apple, Cisco, CrowdStrike, Google, JPMorgan Chase, Microsoft, Palo Alto Networks — committed to applying it defensively. Anthropic pledged $100 million in usage credits and $4 million to open-source security. Mythos will not reach general availability.

Picus Security coined the name for the dynamic: the Glasswing Paradox. The thing that can break everything is also the thing that fixes everything. The capability is out of the bottle. The question is who applies it first — the defenders scanning their own systems, or the attackers scanning yours.

As of April 2026, Claude’s CyberGym benchmark score had doubled in four months. The exploit success rate went from near-zero to 181 working exploits in six weeks. The asymmetry between finding and exploiting vulnerabilities still favors defenders — finding a vulnerability requires less capability than weaponizing it at scale against hardened targets. But the gap shrank from astronomical to single-digit multiples in six weeks. Defenders still have the structural advantage. The question is how fast it’s eroding — and that’s why the next 90 days matter.


The Pattern: AI Tools as Attack Surface

The Vercel breach, the Mercor supply-chain attack (4TB of data stolen from a $10B AI training startup via a poisoned PyPI package), and the Delve compliance scandal share a common thread that matters enormously for credit unions: the AI tools your organization adopts are themselves attack surfaces.

A security researcher mapped the escalation pattern across April:

  • April 14: Individual MCP server compromised (MCPwn, CVSS 9.8)
  • April 15: Platform-level MCP integration compromised (Atlassian, unauthenticated RCE — remote code execution without credentials)
  • April 19: AI tool OAuth chain used as lateral movement vector (Context.ai → Vercel)
  • April 21: The MCP protocol SDK itself found vulnerable

Each step up is a larger blast radius. Individual servers. Then platforms. Then the integration protocol itself. The attack surface isn’t just expanding — it’s climbing the abstraction stack.

Google published a taxonomy of attacks against AI agents in April — six categories: prompt injection, context manipulation, tool misuse, identity spoofing, memory poisoning, and orchestration hijacking. This matters for credit unions because you’re not just defending your infrastructure from AI-powered attacks. You’re also defending the AI systems you deploy from being weaponized against you.

The compliance Runner that triages BSA alerts is a potential attack surface. The member service agent that processes inquiries is a potential attack surface. Every AI system you deploy extends your security perimeter. The control architecture — agent identity verification, trust tiers, automated kill switches, immutable audit logs — is the same whether you’re preventing an accidental terraform destroy or a deliberate intrusion. I covered that architecture — the zero-trust control plane for AI agents — in “The terraform destroy Heard Round the World.”
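The control-plane checks named above can be sketched in a few lines. Everything here is a hypothetical illustration — the tier names, action names, and `authorize` function are invented for this example, not Runline's implementation — but the shape is the point: every agent action passes a trust-tier policy and a kill switch before it runs, and is logged whether it was allowed or not.

```python
from enum import Enum

# Hypothetical sketch of a zero-trust agent control plane:
# trust tiers, a kill switch, and an append-only audit log.

class Tier(Enum):
    READ_ONLY = 1    # may look, never act
    PROPOSE = 2      # may draft actions; a human approves
    AUTONOMOUS = 3   # may act within a bounded, named scope

# Policy maps each action to the MINIMUM tier allowed to perform it.
POLICY = {
    "read_member_record": Tier.READ_ONLY,
    "draft_bsa_narrative": Tier.PROPOSE,
    "file_ctr": Tier.AUTONOMOUS,
}

KILL_SWITCH = {"active": False}  # flipping this halts every agent at once
AUDIT_LOG: list = []             # stand-in for an immutable log service

def authorize(agent_id: str, agent_tier: Tier, action: str) -> bool:
    """Check identity, tier, and kill switch; log the decision either way."""
    required = POLICY.get(action)
    allowed = (
        required is not None                      # unknown actions: deny
        and not KILL_SWITCH["active"]             # kill switch: deny all
        and agent_tier.value >= required.value    # tier must cover action
    )
    AUDIT_LOG.append((agent_id, action, allowed))
    return allowed
```

Note the defaults: an action not in the policy is denied, not permitted — the same secure-by-default posture the Vercel breach showed the cost of skipping.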


Why This Hits Credit Unions Differently

What I’ve seen inside credit union technology stacks keeps me up at night — not because credit unions are negligent, but because the threat model just changed and the infrastructure hasn’t.

Your employees are using AI tools with broad OAuth permissions right now. The Vercel breach happened because one employee at one vendor granted “Allow All” to an AI productivity tool. How many of your employees have done the same? How many AI tools have access to your Google Workspace, your Microsoft 365, your Slack? If you don’t know, you’re in the same position Vercel was in before the breach — except you don’t have Mandiant on retainer.

Your core processor was designed for a different threat. Symitar, Corelation, DNA, Sharetec — these platforms were architected 15 to 30 years ago. Their security models assumed attackers were humans with limited time and specific targets. An AI that can reason about exploit chains doesn’t care that the vulnerability is in code paths nobody has reviewed in 15 years — that’s the best place to look. None of the major core processors have publicly announced AI-powered security audits of their platforms. That doesn’t mean they’re not happening — but if they are, the results aren’t reaching the institutions that depend on them.

Your compliance framework assumes a window that no longer exists. NCUA’s Information Security program expectations reference patch management cycles and vulnerability scanning intervals calibrated for the old world — the one where you had 30 to 90 days between disclosure and weaponization. SANS, CSA, and OWASP convened an emergency strategy briefing over a single weekend in April with 60+ contributors and 250+ CISO reviewers. Their central finding: that window has collapsed from weeks to hours. When the window is hours, quarterly vulnerability scans are like checking your smoke detectors once a season.

Your vendor’s SOC 2 might be worthless. The Delve scandal showed that compliance attestations can be fabricated at scale — 493 of 494 reports using identical boilerplate. If your vendor evaluation process begins and ends with “they have a SOC 2,” you’re building security on a foundation you haven’t verified.

Security staffing is a structural crisis. (ISC)² estimates a global cybersecurity workforce gap exceeding 4 million professionals. The average credit union under $1 billion in assets has one dedicated IT security person — maybe. One person cannot defend against AI-speed attacks using manual-speed tools. It’s not a performance issue. It’s arithmetic.


Your Context Is Your Shield

Here’s the argument nobody else is making, and it’s the reason I’m writing this article instead of pointing you to a SANS briefing.

Your institutional knowledge — the same context layer I described as the moat that actually matters — is a security asset.

An AI attacker has broad capability but zero institutional context. It doesn’t know that your credit union processes merchant deposits differently on Fridays. It doesn’t know that your ACH batch runs at 3:47 AM, not midnight. It doesn’t know that your branch in Tacoma has a dedicated line to a commercial member that generates high-volume legitimate traffic that looks anomalous to a generic monitoring system.

A defender AI with deep institutional context can distinguish between a legitimate anomaly and an attack. Consider the scenario: an attacker AI probes your ACH batch window at 3:47 AM and generates 200 test transactions to map your processing logic. A generic intrusion detection system flags the volume 45 minutes later — after the reconnaissance is complete. A context-aware security agent recognizes that the timing matches your batch window but the source IPs don’t match your batch processor, the transaction patterns don’t match your known commercial members, and the probing signature is consistent with automated reconnaissance.

It isolates the connection in under a second.
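That 3:47 AM scenario reduces to a few lines of context-aware logic. This is a deliberately simplified, hypothetical sketch — the `CONTEXT` dictionary and `assess` function are inventions for illustration, and a real system would learn these patterns rather than hardcode them — but it shows why institutional context changes the verdict: the same volume spike that a generic monitor can only flag gets an immediate, specific decision here.

```python
from datetime import time

# Hypothetical sketch: institutional context (batch window, known
# processor IPs, known member traffic patterns) turns a generic
# volume anomaly into an immediate allow/isolate decision.

CONTEXT = {
    "ach_batch_window": (time(3, 40), time(3, 55)),   # batch runs ~3:47 AM
    "batch_processor_ips": {"10.8.1.21"},             # the only valid source
    "known_member_patterns": {"weekly_merchant_deposit"},
}

def assess(event: dict) -> str:
    lo, hi = CONTEXT["ach_batch_window"]
    in_window = lo <= event["time"] <= hi
    known_source = event["src_ip"] in CONTEXT["batch_processor_ips"]
    known_pattern = event["pattern"] in CONTEXT["known_member_patterns"]

    if in_window and known_source:
        return "allow"       # normal batch traffic: right time, right source
    if in_window and not known_source and not known_pattern:
        return "isolate"     # right timing, wrong everything else
    return "flag_for_review"

# The reconnaissance probe from the scenario: correct timing,
# unknown source IP (203.0.113.50 is a documentation address),
# and a signature matching no known member.
probe = {"time": time(3, 47), "src_ip": "203.0.113.50",
         "pattern": "automated_probe"}
```

A generic IDS sees only the volume; this sketch sees that the timing matches the batch window but nothing else does — which is exactly the distinction the paragraph above describes.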

That’s the same context layer that makes your compliance Runners effective at BSA triage. The institutional knowledge that lets an agent say “Maria’s flower shop always deposits $8,000 in cash on Monday mornings — that’s not structuring” is the same knowledge that lets a security agent say “this traffic pattern at 3:47 AM doesn’t match any known batch process — isolate it.” Understanding your members, your patterns, your operational rhythms is as valuable for threat detection as it is for compliance.

Every cycle through the self-healing loop — every false positive correctly identified, every real threat caught early, every correction that refines the model — makes the system smarter. The institutions that have been building context layers for operational AI are accidentally building the foundation for AI-native security. The ones that haven’t started have to build both simultaneously.

This is why the compounding curve I described in “Your Vendor Still Writes Code by Hand” applies to security as directly as it applies to engineering velocity. The credit union that starts building an institutional context layer today has 1,500 learning cycles by next year. The one that waits has zero. The gap between them isn’t linear — it’s exponential. And in security, the gap is measured in breaches prevented, not just productivity gained.


Five Questions for This Quarter

Your next NCUA exam may or may not include questions about AI-era threats. Don’t wait to find out.

1. How many third-party AI tools have OAuth access to our Google Workspace or Microsoft 365 — and what scopes did we grant?

The Vercel breach happened through a single “Allow All” OAuth grant to an AI productivity tool. Audit every third-party integration. Revoke any scope broader than what the tool strictly requires. If your IT team doesn’t have a current inventory of OAuth grants, that absence is the finding.
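The audit step can start as something this simple. The sketch below is hypothetical — `flag_broad_grants` and the allowlist are inventions for illustration — but the input shape mirrors what you can export from the Google Workspace Admin console or the Directory API's `tokens.list` endpoint: per-app lists of granted scopes, filtered against the scopes you have deliberately approved.

```python
# Hypothetical sketch of the OAuth inventory: flag every third-party
# grant holding any scope outside an explicit allowlist. The scope
# URLs are real Google OAuth scopes; the allowlist is illustrative.

ALLOWED_SCOPES = {
    "https://www.googleapis.com/auth/calendar.readonly",
    "https://www.googleapis.com/auth/userinfo.email",
}

def flag_broad_grants(grants: list) -> list:
    """Return every grant with at least one scope beyond the allowlist."""
    return [g for g in grants if set(g["scopes"]) - ALLOWED_SCOPES]

grants = [
    {"app": "CalendarSync",
     "scopes": ["https://www.googleapis.com/auth/calendar.readonly"]},
    {"app": "AI Office Suite",   # full mail + full Drive access
     "scopes": ["https://mail.google.com/",
                "https://www.googleapis.com/auth/drive"]},
]
```

Run against a real export, everything the function returns is a revocation candidate — and an empty allowlist file is itself the finding the paragraph above describes.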

2. When was the last time our infrastructure was assessed by an AI-powered security tool — not a signature-based scanner?

Traditional scanners check for known CVEs using signature databases. AI-powered assessment reasons about code and configuration the way Mythos reasons about source code — looking for novel vulnerability patterns, not just known ones. Ask your managed security provider whether their tool can identify novel vulnerability patterns in code it has never seen, or only match against known signatures. If the answer is signatures only, your assessment is already outdated.

3. How fast can we isolate a compromised system from the rest of our network?

Time the answer. If it involves a human making a phone call, logging into a console, and manually reconfiguring a firewall rule — that’s minutes at best, hours at worst. The answer should be automated, policy-driven, and measurable in seconds. Every system in your institution that interacts with external networks needs a control layer — whether you build it, buy it, or contract it — that can monitor in real time, respond automatically, isolate instantly, and log immutably.
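"Automated, policy-driven, and measurable in seconds" looks roughly like this. The sketch is hypothetical — `respond`, the policy table, and `quarantine` are inventions, and the real quarantine call would hit your firewall or EDR vendor's API — but it makes the timing argument concrete: the decision and the action are code, so the response time is the runtime of a function, not the duration of a phone call.

```python
import time as clock

# Hypothetical sketch of policy-driven isolation: the alert-to-action
# path is code, so response time is measurable in milliseconds.

ISOLATION_POLICY = {
    "recon_probe": "isolate",
    "credential_stuffing": "isolate",
    "failed_login": "monitor",   # not every alert warrants isolation
}

QUARANTINED: set = set()

def quarantine(host: str) -> None:
    # Real version: push a deny rule via your firewall or EDR API.
    QUARANTINED.add(host)

def respond(alert: dict) -> float:
    """Apply the policy to one alert; return elapsed seconds."""
    start = clock.monotonic()
    if ISOLATION_POLICY.get(alert["type"]) == "isolate":
        quarantine(alert["host"])
    return clock.monotonic() - start
```

The measurable part matters for question 3: if you cannot produce a number for alert-to-isolation time, the honest answer is "however long it takes to reach a human."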

4. Do we have network segmentation between our core processor, online banking, card management, and other critical systems?

If the answer is “they’re on the same VLAN,” you have a lateral movement highway. The 32-step network attack that Mythos completed worked because the simulation lacked segmentation. Your core is the crown jewel. Segment it from everything else. Authenticate, authorize, and log every system-to-system connection — even internal ones.
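Segmentation, reduced to its essence, is a default-deny allowlist of system-to-system flows. The sketch below is hypothetical — the segment names and `is_allowed` function are illustrative, and production enforcement lives in firewalls or microsegmentation tooling, not application code — but it captures the principle: a flow is permitted only if someone deliberately wrote it down.

```python
# Hypothetical sketch of default-deny segmentation: every permitted
# system-to-system flow is named explicitly; anything else is denied.

ALLOWED_FLOWS = {
    # (source segment, destination segment, port)
    ("online_banking", "core_processor", 443),
    ("card_management", "core_processor", 443),
}

def is_allowed(src: str, dst: str, port: int) -> bool:
    """Default deny: only flows on the explicit allowlist pass."""
    return (src, dst, port) in ALLOWED_FLOWS
```

Notice what is absent: there is no rule letting guest Wi-Fi, the branch LAN, or a compromised workstation reach the core — and under default deny, absence means blocked, which is the opposite of "they're on the same VLAN."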

5. Who conducted our vendors’ SOC 2 audits, and have we read the actual reports?

After Delve, “they have a SOC 2” is no longer sufficient due diligence. Ask to see the report — not just the badge. Ask whether the auditor evaluated actual controls or reviewed auto-generated documentation. If your board has non-technical members — and most do — frame this as an examination risk: the question is whether your vendor evaluation process will satisfy the examiners who will be asking these questions in 18 months.


What to Do in the Next 90 Days

The Mythos moment and the Vercel breach aren’t reasons to panic. They’re reasons to act — with the same urgency that SANS, CSA, and 250 CISOs felt when they produced an emergency briefing over a weekend.

  1. Audit every AI tool integration. Inventory every OAuth grant, every API connection, every third-party tool that touches your systems. Revoke anything with “Allow All” scope. Move secrets out of platform environment variables and into dedicated vaults — Doppler, AWS Secrets Manager, whatever your MSP supports. Inject at runtime, not at rest.

  2. Start building your institutional context layer — or recognize that you already are. Every correction your BSA team makes to an AI triage recommendation, every exception your lending team documents, every pattern your operations team validates is training data for a security model that can distinguish “normal for us” from “attack.” This is the compounding defensive advantage that late starters cannot purchase — and the one asset an attacker can never replicate.

  3. Demand an AI-powered security assessment from your managed security provider. Not a checkbox scan. A real assessment that reasons about your infrastructure. Ask them to name the product they use and whether it’s signature-based or model-based. If your provider can’t deliver this, start evaluating providers who can.

  4. Run a tabletop exercise that simulates a multi-stage attack. Not a phishing drill — a scenario where an AI-speed attacker chains three vulnerabilities across three systems in 90 seconds. When your team realizes the attack moves faster than their playbook, the conversation about automated response becomes urgent instead of theoretical.

  5. Brief your board with facts, not scare tactics. The Vercel kill chain is a one-slide story any board member can understand: Roblox cheat → malware → stolen OAuth token → customer data. The Delve scandal is a one-question diagnostic: do we know who actually audited our vendors’ SOC 2 reports? Frame the investment ask around examination exposure, not architectural jargon.

The window has collapsed. The compliance layer can be faked. The tools your employees adopted last quarter are the entry point.

The credit unions that survive this will be the ones whose AI knows them better than any attacker’s AI ever could.


Sean Hsieh is the Founder and CEO of Runline, an AI operations platform purpose-built for credit unions. He previously founded Concreit, an SEC-regulated real estate investment platform, and served in leadership roles at Flowroute (acquired by Intrado). He can be reached at sean@runlineai.com.
