AI Agents Demonstrate Functional Exploit Development Capabilities

Researchers from UC Berkeley, the Max Planck Institute, and AI labs have released ExploitGym, a benchmark that tests whether frontier AI models can turn known vulnerabilities into working exploits. Across 898 real-world CVEs, Anthropic's Mythos Preview exploited 157 instances (about 17%) and OpenAI's GPT-5.5 exploited 120 (about 13%) within two-hour windows. Both models frequently weaponized vulnerabilities other than the ones they were given, with Mythos deviating from the intended bug in 69 of 226 CTF scenarios.

The agents bypassed both ASLR (address space layout randomization) and V8 sandbox protections. GPT-5.5's safety filters blocked 88% of requests, but the researchers note that such guardrails can still be bypassed through prompt engineering.

Open sources - closed narratives

@sitreports

Source: Telegram "sitreports"
