
In 2024, University of Illinois researchers found that GPT-4 could autonomously exploit 87% of a dataset of 15 selected one-day vulnerabilities when given the common vulnerabilities and exposures (CVE) description. Without the description, it could exploit only 7%. This gave the industry a “margin of safety”: AI could exploit known vulnerabilities, but it could not discover them.
But on April 7, Anthropic announced that Claude Mythos Preview had closed that margin: the model autonomously discovered thousands of zero-day vulnerabilities across major operating systems and browsers. Separately, Mythos scored 83.1% on the CyberGym vulnerability reproduction benchmark. In a campaign of 1,000 scaffolded runs targeting OpenBSD, the total compute cost was less than $20,000.
Exploitation windows are collapsing. Langflow’s CVE-2026-33017 (CVSS 9.8) was exploited within 20 hours of disclosure, without any public proof of concept. Marimo’s CVE-2026-39987 (CVSS 9.3) was hit in 9 hours and 41 minutes.
The defense infrastructure that most organizations rely on was not designed for this. Rapid7’s 2026 threat landscape report puts the average time from CVE publication to inclusion in CISA’s Known Exploited Vulnerabilities (KEV) catalog at five days. Google’s M-Trends 2026 report found exploitation happening before a patch was released. When the Langflow advisory was published, the first exploit came within 20 hours. When the Marimo advisory was published, it took less than 10 hours.
The assumption that your patch window is safe because exploits take time to appear no longer holds. Here is what to build instead.
Replace CVSS-only prioritization with a three-layer filter
Most vulnerability management programs still prioritize by CVSS score alone. CVSS quantifies the “theoretical” severity of a vulnerability without considering whether it is being exploited in the wild or how quickly someone can weaponize it. Under that scheme, a CVSS 8.8 vulnerability with a history of active exploitation (Docker’s CVE-2026-34040) gets a lower priority than a CVSS 9.8 vulnerability that may never be exploited in the wild.
Recent research offers a concrete replacement, validated against 28,377 real-world vulnerabilities: a three-layer decision tree that combines CISA KEV status, Exploit Prediction Scoring System (EPSS) scores, and CVSS into a single prioritization filter.
A Three-Layer Vulnerability Prioritization Filter
| Layer | Data source | Threshold | Action | SLA |
|---|---|---|---|---|
| 1. Active exploitation | CISA KEV catalog | Listed | Patch immediately | Hours |
| 2. Predicted exploitation | EPSS via FIRST.org | Score ≥ 0.088 | Escalate to Tier 0 pipeline | 24 hours |
| 3. Severity | CVSS via NVD | Score ≥ 7.0 | Standard remediation | Per policy |
Validated result: 18x efficiency gain, 85.6% coverage of exploited vulnerabilities, ~95% reduction in urgent remediation workload. All three data sources are open and free.
This integration can be fully automated. Build a script that queries the CISA KEV API, the EPSS API from FIRST.org, and the NVD, and run it against your asset inventory for each published CVE. The human in this process should stay in the loop as an approver, not as the trigger.
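The decision tree in the table can be sketched as a small pure function. This is a minimal illustration, not the researchers' implementation: the names `CveSignals` and `prioritize` are hypothetical, the thresholds are copied from the table, and the final "backlog" branch for CVEs that match no layer is an assumption the table does not cover. Fetching the KEV, EPSS, and CVSS values is deliberately left out.

```python
from dataclasses import dataclass

# Thresholds copied from the table above.
EPSS_THRESHOLD = 0.088
CVSS_THRESHOLD = 7.0


@dataclass(frozen=True)
class CveSignals:
    """Signals for one CVE. How they are fetched is out of scope here."""
    cve_id: str
    in_kev: bool   # listed in the CISA KEV catalog
    epss: float    # EPSS probability, 0.0 to 1.0
    cvss: float    # CVSS base score, 0.0 to 10.0


def prioritize(s: CveSignals) -> tuple[str, str]:
    """Return (action, sla), checking the three layers in order."""
    if s.in_kev:
        return ("patch immediately", "hours")
    if s.epss >= EPSS_THRESHOLD:
        return ("escalate to Tier 0 pipeline", "24 hours")
    if s.cvss >= CVSS_THRESHOLD:
        return ("standard remediation", "per policy")
    # Assumption: the source table does not say what happens below all thresholds.
    return ("backlog", "none")


if __name__ == "__main__":
    # Placeholder CVE IDs and made-up signal values, for illustration only.
    for signals in [
        CveSignals("CVE-0000-0001", in_kev=True, epss=0.02, cvss=8.8),
        CveSignals("CVE-0000-0002", in_kev=False, epss=0.01, cvss=9.8),
    ]:
        print(signals.cve_id, prioritize(signals))
```

Because the layers are checked in order, a KEV-listed CVSS 8.8 issue outranks an unexploited CVSS 9.8 issue, which is the inversion that CVSS-only ranking gets wrong.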
Close the agent authorization gap
Rapid exploit creation changes more than patch prioritization. It also changes how controls must be configured for every system managed by an agent that now holds privileged credentials. Your authorization policies have not been evaluated against the behavior of AI agents, and that is now a quantifiable risk. CVE-2026-34040 revealed that Docker’s authorization plugin architecture silently bypasses every plugin when the request body exceeds 1 MB. Common AuthZ plugins (OPA, Casbin, Prisma Cloud) cannot see this bypass, because it occurs in Docker’s middleware before the request reaches the plugin.
When Cyera demonstrated this weakness, they showed that an AI agent debugging infrastructure can find the bypass path while performing a legitimate task, without being instructed to exploit anything.
The Internet Engineering Task Force (IETF) is working on authorization models for agents. The draft draft-klrc-aiagent-auth-01, published in March by contributors from AWS, Zscaler, Ping Identity, and OpenAI, proposes using the existing Secure Production Identity Framework for Everyone (SPIFFE) and OAuth 2.0 so that AI agents obtain dynamically provisioned, short-lived credentials.
Separately, the IETF draft for the Agent Identity Protocol (draft-prakash-aip-00) reports that none of the nearly 2,000 Model Context Protocol (MCP) servers it examined had authentication.
However, implementation of these standards is months or years away. For now, security teams must proactively add agent-scale test scenarios for every authorization boundary, including oversized requests, burst rates, and multi-step escalation of privileged requests.
Map your credential blast radius
In a survey conducted by CSA and Zenity and published on April 16, 53% of organizations said they had already seen AI agents exceed their intended permissions, and 47% had experienced a security incident involving an agent.
When an AI builder such as Flowise (CVE-2025-59528, CVSS 10.0), Langflow, or n8n is compromised, the blast radius extends far beyond the host. These tools hold API keys for frontier models, database credentials, vector store tokens, and OAuth tokens for business systems. A compromised AI builder is not just a host breach. It is a credential store that unlocks authenticated access to every connected service.
Without a credential dependency map for each AI tool host, incident response to an agent compromise is guesswork. For each instance, document every credential, what it grants access to, and the corresponding rotation procedure. Also start migrating static API keys to short-lived tokens wherever downstream services allow.
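One way to keep such a map is as structured data rather than a wiki page, so that it can be queried during an incident. The sketch below is an assumption about shape, not a prescribed schema: `Credential`, `BuilderInstance`, `blast_radius`, and `migration_candidates` are hypothetical names, and the host, credential, and runbook values are invented placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Credential:
    name: str
    grants_access_to: str   # downstream service this credential unlocks
    static: bool            # True for a long-lived key, False for a short-lived token
    rotation_runbook: str   # where the rotation procedure is documented


@dataclass
class BuilderInstance:
    host: str
    tool: str
    credentials: list[Credential] = field(default_factory=list)


def blast_radius(instance: BuilderInstance) -> set[str]:
    """Services an attacker can reach with credentials stored on this instance."""
    return {c.grants_access_to for c in instance.credentials}


def migration_candidates(instance: BuilderInstance) -> list[str]:
    """Static keys that should move to short-lived tokens where the service allows."""
    return [c.name for c in instance.credentials if c.static]


if __name__ == "__main__":
    # Placeholder inventory entry, for illustration only.
    inst = BuilderInstance(
        host="builder-01.example.internal",
        tool="Langflow",
        credentials=[
            Credential("llm-api-key", "model provider API", True, "runbooks/llm-key.md"),
            Credential("crm-oauth", "CRM", False, "runbooks/crm-oauth.md"),
        ],
    )
    print(sorted(blast_radius(inst)))
    print(migration_candidates(inst))
```

During an incident, `blast_radius` answers "what do we have to assume is exposed?" and each credential's `rotation_runbook` points at the procedure to run.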
Five actions for this quarter
1. Deploy the three-layer KEV-EPSS-CVSS filter
Replace CVSS-only prioritization with the filter in the table above. Automate data collection from all three APIs in a scheduled script that runs against your asset inventory. Target result: 18x efficiency gain, 85.6% coverage of exploited vulnerabilities, ~95% reduction in urgent remediation workload.
2. Implement event-driven patching for Tier 0 services.
Determine which services fall under critical exposure: internet-facing services, AI builder hosts, and services with direct exposure to the container orchestration control plane. Instead of waiting for the next maintenance window for this tier, trigger an event-driven patch when the CVE is published.
Goal: deploy a patch to a canary within four hours of a CVE being flagged as critical. Use the CISA KEV and EPSS feeds to trigger event-driven patching. Where the four-hour goal cannot be met because of legacy dependencies, change-freeze windows, or rollback risk, immediately apply compensating controls: remove internet exposure for the vulnerable service, rotate its credentials, and, where possible, disable the affected functionality until a patch can be deployed.
Leaving a service exposed for an extended period while waiting for a maintenance window is not acceptable.
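The four-hour rule and its fallback can be written down as an explicit decision so that a pipeline, not a person, decides which path applies. This is a sketch under stated assumptions: `CANARY_SLA`, `COMPENSATING_CONTROLS`, and `plan` are hypothetical names, and `earliest_patch_at` stands in for whatever estimate your change process produces.

```python
from datetime import datetime, timedelta, timezone

# Four-hour canary goal from the text above.
CANARY_SLA = timedelta(hours=4)

# Compensating controls named in the text above.
COMPENSATING_CONTROLS = [
    "remove internet exposure",
    "rotate credentials",
    "disable affected functionality",
]


def plan(flagged_at: datetime, earliest_patch_at: datetime) -> list[str]:
    """Return ordered steps for a Tier 0 service once a CVE is flagged critical."""
    deadline = flagged_at + CANARY_SLA
    if earliest_patch_at <= deadline:
        return ["deploy patch to canary"]
    # The SLA cannot be met: apply compensating controls now, patch when unblocked.
    return COMPENSATING_CONTROLS + ["deploy patch to canary when unblocked"]


if __name__ == "__main__":
    flagged = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)  # placeholder timestamp
    print(plan(flagged, flagged + timedelta(hours=3)))
    print(plan(flagged, flagged + timedelta(days=2)))
```

The point of encoding it is that the fallback path is never "wait": a missed deadline always produces a list of controls to apply immediately.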
3. Test authorization boundaries at agent scale.
Create test cases for every API that AI agents can reach through AuthZ policies. In particular, include requests with body sizes greater than 1 MB, 5 MB, and 10 MB. Also include burst rates above 100 requests and unusual parameter combinations (privileged flags, host mounts, capability additions). In addition, patch Docker Engine to 29.3.1 to fix CVE-2026-34040.
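The oversized-body cases are easy to generate programmatically. The sketch below only builds payloads; sending them and asserting that the policy still denies the request is left to your own test harness. The helper names `padded_body` and `authz_test_cases` are hypothetical, 1 MB is assumed to mean 1024 × 1024 bytes, and the base request (a privileged container, using the `HostConfig.Privileged` field from Docker's container-create body) is just one illustrative request that a policy would be expected to reject.

```python
import json

MB = 1024 * 1024  # assumption: the source does not say whether 1 MB is 10^6 or 2^20 bytes


def padded_body(base: dict, min_bytes: int) -> bytes:
    """Return a JSON body of at least min_bytes by adding a filler field."""
    raw = json.dumps(base).encode()
    if len(raw) >= min_bytes:
        return raw
    padded = dict(base)
    # The filler alone covers the shortfall, so the result always exceeds min_bytes.
    padded["padding"] = "A" * (min_bytes - len(raw))
    return json.dumps(padded).encode()


def authz_test_cases(base: dict) -> list[tuple[str, bytes]]:
    """Build one request body just over each size boundary listed above."""
    sizes = {"over_1mb": 1 * MB + 1, "over_5mb": 5 * MB + 1, "over_10mb": 10 * MB + 1}
    return [(name, padded_body(base, n)) for name, n in sizes.items()]


if __name__ == "__main__":
    # Illustrative request a policy should deny regardless of body size.
    privileged_request = {"Image": "alpine", "HostConfig": {"Privileged": True}}
    for name, body in authz_test_cases(privileged_request):
        print(name, len(body))
```

Each generated body still carries the privileged flag, so a correct authorization layer should return the same denial for the padded request as for the unpadded one.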
4. Map the credential blast radius for every AI builder host.
Document every credential for each Langflow, Flowise, n8n, and custom AI pipeline instance. Classify each credential by its lifetime (static key or short-lived token). Determine what each credential can access. Set alerts on credential use from an anomalous IP address or identity.
5. Run a shadow AI discovery scan this week.
According to the CSA data, there is a better than 50% chance that your agents are already exceeding their intended permissions. Check your security information and event management (SIEM) and network monitoring tools for connections to the default ports of AI builders: Langflow 7860, Flowise 3000, and n8n 5678. Any unauthorized instance is an unmanaged attack surface.
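A first pass at this scan can be a filter over exported connection records. The sketch assumes you can export `(destination_host, destination_port)` pairs from your SIEM or flow logs and that you maintain a list of approved builder hosts; `find_shadow_instances` and the sample addresses are hypothetical. The ports are the defaults listed above.

```python
# Default ports from the text above.
DEFAULT_PORTS = {7860: "Langflow", 3000: "Flowise", 5678: "n8n"}


def find_shadow_instances(connections, approved_hosts):
    """Return sorted (host, port, suspected_tool) for unapproved hosts on builder ports.

    connections: iterable of (dest_host, dest_port) pairs exported from logs.
    approved_hosts: set of hosts where an AI builder is sanctioned.
    """
    findings = set()
    for host, port in connections:
        tool = DEFAULT_PORTS.get(port)
        if tool is not None and host not in approved_hosts:
            findings.add((host, port, tool))
    return sorted(findings)


if __name__ == "__main__":
    # Placeholder records, for illustration only.
    sample = [
        ("10.0.0.5", 7860),
        ("10.0.0.9", 443),
        ("10.0.0.7", 5678),
    ]
    for finding in find_shadow_instances(sample, approved_hosts={"10.0.0.7"}):
        print(finding)
```

Treat hits as candidates rather than confirmed instances: port 3000 in particular is a common default for many unrelated web applications, and a builder moved off its default port will not show up at all.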
Wrap-up
AI agents have arrived, and the standards bodies are responding. The IETF has multiple drafts on agent authentication and authorization. The Coalition for Secure AI has published an MCP security taxonomy and secure-by-design principles.
But these standards move at standards-body speed, and the exploitation window is now measured in hours. Organizations that implement the three-layer filter and event-driven patching this quarter will see a measurable reduction in exposure. Those that wait will be running calendar-based patch cycles against an adversary that operates in under 20 hours.
Nick Kale is a senior engineer specializing in enterprise AI platforms and security.





