AI Agents Approaching Real DeFi Exploitation Risk, Research by Sam Winkler and Anthropic Team Shows

The latest findings from the Anthropic Fellows program present a sobering reality: artificial intelligence models have crossed a critical capability threshold and can now autonomously identify and exploit vulnerabilities in smart contracts with measurable success. Research involving contributors such as Sam Winkler demonstrates that frontier AI models are no longer just theoretical threats: they are already capable of mounting attacks that rival the sophistication of human-directed exploits in decentralized finance.

The implications of this shift extend far beyond academic concern. As these AI systems become cheaper to deploy and more capable in their reasoning, the economics fundamentally reshape the threat landscape for every blockchain and software system whose vulnerabilities are publicly visible and monetizable.

Frontier Models Successfully Execute Full Attack Scenarios

A collaborative study by the ML Alignment & Theory Scholars Program (MATS) and Anthropic Fellows tested cutting-edge models (GPT-5, Claude Opus 4.5, and Claude Sonnet 4.5) against SCONE-bench, a comprehensive dataset of 405 previously exploited smart contracts. The results were stark: these models didn't simply flag problematic code. They synthesized complete, executable exploit scripts, sequenced transactions strategically, and drained simulated liquidity pools in patterns that mirror actual attacks on Ethereum and BNB Chain.
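
The benchmark's actual harness is not reproduced in this article, but the evaluation pattern it describes can be sketched at a high level. In the Python sketch below, every name (SimResult, evaluate_benchmark, the propose_exploit and simulate callables) is a hypothetical stand-in rather than SCONE-bench's real API; the point is only the shape of the loop: ask a model for a candidate exploit, run it in a sandboxed fork, and record the simulated balance delta.

    from dataclasses import dataclass
    from typing import Callable

    @dataclass
    class SimResult:
        profit_usd: float  # agent balance delta measured inside the sandbox

    def evaluate_benchmark(
        contracts: list[str],
        propose_exploit: Callable[[str], str],
        simulate: Callable[[str, str], SimResult],
    ) -> float:
        """Sum simulated profit across a contract set, in the style of SCONE-bench."""
        total = 0.0
        for source in contracts:
            script = propose_exploit(source)   # model drafts a candidate exploit script
            result = simulate(source, script)  # executed on a forked chain, never live
            total += max(0.0, result.profit_usd)
        return total

    # Toy usage with stub callables so the sketch runs end to end.
    if __name__ == "__main__":
        total = evaluate_benchmark(
            contracts=["contract A source", "contract B source"],
            propose_exploit=lambda src: f"exploit sketch for: {src[:12]}",
            simulate=lambda src, script: SimResult(profit_usd=0.0),
        )
        print(f"total simulated profit: ${total:,.2f}")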

The collective output was particularly striking: $4.6 million in simulated exploits against contracts deployed after the models' knowledge cutoffs. This figure matters because it suggests a lower bound on what current-generation AI could theoretically steal if turned against live systems today.

Zero-Day Discovery Proves Autonomous Vulnerability Detection Works

The real breakthrough came when researchers tested whether AI agents could identify previously unknown vulnerabilities. GPT-5 and Sonnet 4.5 scanned 2,849 newly deployed BNB Chain contracts with no history of prior compromise. The models discovered two previously unknown flaws, which yielded $3,694 in simulated profit.

The first vulnerability stemmed from a public function that was meant to be read-only but lacked Solidity's view modifier, a subtle oversight that let the function mutate state and allowed the agent to artificially inflate its token balance. The second flaw accepted an arbitrary, caller-supplied beneficiary address, creating a vector for fee redirection. In both instances, the AI models generated functional exploit code that converted these design weaknesses into immediate simulated monetary gain.
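
Neither contract's source is public in the research summary, but both flaw patterns are simple enough to mirror in a hypothetical sketch. The Token class below is illustrative Python, not the affected Solidity code: the first method mutates state despite looking like a read-only query, and the second pays accrued fees to whatever address the caller supplies.

    # Hypothetical token sketch mirroring the two reported flaw patterns.
    class Token:
        def __init__(self):
            self.balances: dict[str, int] = {}
            self.accrued_fees = 1_000

        # Flaw 1: a "getter" that mutates state. In Solidity terms, a public
        # function missing the view modifier; here, a query that credits the
        # caller every time it is called.
        def balance_of(self, account: str) -> int:
            self.balances[account] = self.balances.get(account, 0) + 1  # bug: should be read-only
            return self.balances[account]

        # Flaw 2: fee withdrawal that trusts a caller-supplied beneficiary
        # instead of a fixed, access-controlled treasury address.
        def withdraw_fees(self, beneficiary: str) -> None:
            self.balances[beneficiary] = self.balances.get(beneficiary, 0) + self.accrued_fees
            self.accrued_fees = 0

    attacker = "0xattacker"
    t = Token()
    for _ in range(5):
        t.balance_of(attacker)     # repeated "reads" inflate the balance
    t.withdraw_fees(attacker)      # fees redirected to an arbitrary address
    print(t.balances[attacker])    # 1005

In Solidity, the corresponding fixes would be declaring the query function view (so the compiler rejects state writes) and hardcoding or access-controlling the fee recipient.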

What makes this discovery particularly significant is the economics: running the autonomous agent across the entire contract set cost only $3,476, with an average cost of $1.22 per execution. This efficiency benchmark becomes critical when evaluating future threat scenarios.
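
Those figures are internally consistent, and they carry a detail worth spelling out: even this small pilot scan was, in simulation, already net profitable. A quick check using only the numbers quoted above:

    # Sanity-check of the reported scan economics, using only figures quoted above.
    total_cost_usd = 3_476        # reported cost of scanning the full contract set
    contracts_scanned = 2_849     # newly deployed BNB Chain contracts
    simulated_profit_usd = 3_694  # simulated profit from the two discovered flaws

    print(f"cost per execution: ${total_cost_usd / contracts_scanned:.2f}")   # $1.22
    print(f"net simulated margin: ${simulated_profit_usd - total_cost_usd}")  # $218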

The Economics of Automated Attacks Keep Improving

The real story isn’t about today’s attack costs—it’s about tomorrow’s trajectory. As AI model expenses decline and tool-use capabilities mature, the cost-benefit calculus tilts decisively toward full automation. Research suggests this shift will compress the window between smart contract deployment and potential exploitation, especially in DeFi environments where capital sits openly on-chain and profitable bugs can be monetized in seconds.

The current economics already tip the scales. When running an autonomous exploit agent costs just over $1 per contract, and potential gains can reach thousands of dollars, the incentive structure attracts bad actors at scale. As model costs approach zero, even exploits yielding modest returns become attractive targets for automation.
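
That tipping point can be stated as a simple expected-value inequality: scanning is rational whenever hit rate times average profit exceeds the cost per scan. In the sketch below, only the roughly $1.22 cost figure comes from the study; the hit rates and profit figures are illustrative assumptions.

    # Illustrative break-even model for automated exploit scanning.
    # Only the ~$1.22 per-scan cost comes from the study; the other
    # parameter values are hypothetical.
    def scanning_is_rational(hit_rate: float, avg_profit_usd: float,
                             cost_per_scan_usd: float) -> bool:
        """True when expected profit per contract exceeds the scan cost."""
        return hit_rate * avg_profit_usd > cost_per_scan_usd

    print(scanning_is_rational(0.0005, 2_000, 1.22))  # False: EV $1.00 < $1.22
    print(scanning_is_rational(0.0005, 2_000, 0.50))  # True once scan costs fall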

Beyond DeFi: Vulnerabilities in Broader Infrastructure

While the research focuses specifically on decentralized finance, the underlying capabilities aren’t domain-specific. The reasoning patterns that allow an agent to manipulate token balances or redirect fees can translate directly to conventional software, closed-source codebases, and the infrastructure supporting crypto markets more broadly.

As model costs continue falling and tool integration improves, automated vulnerability scanning will inevitably expand beyond public blockchains. The same AI capabilities could target any service or system offering a path to valuable assets—from centralized exchange infrastructure to bridge protocols to institutional custody solutions.

The Defense-Offense Timeline Becomes Critical

The research frames these findings as a warning rather than a forecast. AI models can now perform tasks that historically required elite human attackers. Autonomous exploitation in DeFi is no longer hypothetical; it is a demonstrated capability. The urgent question facing crypto builders and infrastructure teams is whether defensive technologies can advance at the same pace as offensive AI capabilities.

The window for moving from reactive to proactive security is narrowing. Developers must prioritize automated security scanning, formal verification, and resilient contract design patterns before AI agents make the economics of exploitation irresistible at scale.
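
To make "automated security scanning" concrete, the sketch below shows a deliberately naive lint for the first flaw pattern discussed earlier: a function named like a read-only getter that nonetheless mutates state. It operates on Python rather than Solidity and is far shallower than real static analyzers or formal verification tools; it only illustrates the pattern-matching idea.

    # Naive illustrative lint: flag functions whose names suggest a read-only
    # getter but whose bodies assign to attributes or subscripts.
    import ast

    GETTER_PREFIXES = ("get_", "balance_of", "view_")

    def flag_mutating_getters(source: str) -> list[str]:
        """Report functions that look like read-only getters but assign to state."""
        findings = []
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, ast.FunctionDef) and node.name.startswith(GETTER_PREFIXES):
                for sub in ast.walk(node):
                    if isinstance(sub, ast.Assign) and any(
                        isinstance(t, (ast.Attribute, ast.Subscript)) for t in sub.targets
                    ):
                        findings.append(f"{node.name}: getter-style function mutates state")
        return findings

    sample = """
    class Token:
        def balance_of(self, account):
            self.balances[account] = self.balances.get(account, 0) + 1
            return self.balances[account]
    """
    print(flag_mutating_getters(sample))  # ['balance_of: getter-style function mutates state']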
