OpenAI, in collaboration with crypto investment firm Paradigm and security firm OtterSec, has unveiled a benchmark designed to evaluate how effectively AI models detect and mitigate vulnerabilities in crypto smart contracts. The paper, “EVMbench: Evaluating AI Agents on Smart Contract Security,” released on Wednesday, highlights the role AI can play in both attacking and defending these digital agreements.
AI Agents Go Head-to-Head in Security Challenges
The benchmark tested 120 curated vulnerabilities from 40 smart contract audits, pitting some of the most advanced AI models against each other. Anthropic’s Claude Opus 4.6 emerged as the top performer, achieving an average detect award of $37,824, followed by OpenAI’s OC-GPT-5.2 and Google’s Gemini 3 Pro, with $31,623 and $25,112, respectively.
Why It Matters
Smart contracts, which are self-executing agreements with the terms directly written into code, secure billions of dollars in assets. As these contracts become more prevalent, the risk of security breaches grows. OpenAI emphasizes that AI agents are likely to play a transformative role in both attacking and defending these contracts, making it crucial to assess their capabilities in economically meaningful environments.
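To make the idea of a "self-executing agreement" concrete, here is a minimal sketch in Python rather than an actual on-chain language. The `ToyEscrow` class, its fields, and the party names are hypothetical and invented for illustration only; they are not drawn from EVMbench or from any real contract. The point is simply that the terms (who may pay, how much, and when funds are released) are enforced by the code itself, so a mistake in any of those checks becomes a security flaw.

```python
from dataclasses import dataclass, field


@dataclass
class ToyEscrow:
    """Hypothetical, heavily simplified model of an escrow-style contract."""
    buyer: str
    seller: str
    price: int
    deposited: int = 0
    settled: bool = False
    balances: dict = field(default_factory=dict)  # payouts credited by the contract

    def deposit(self, sender: str, amount: int) -> None:
        # Term 1: only the buyer may fund the escrow, once, for the agreed price.
        if sender != self.buyer:
            raise PermissionError("only the buyer may deposit")
        if amount != self.price:
            raise ValueError("deposit must equal the agreed price")
        if self.deposited or self.settled:
            raise RuntimeError("escrow already funded or settled")
        self.deposited = amount

    def confirm_delivery(self, sender: str) -> None:
        # Term 2: only the buyer can confirm delivery, and only after funding.
        if sender != self.buyer:
            raise PermissionError("only the buyer may confirm delivery")
        if self.deposited == 0:
            raise RuntimeError("escrow is not funded")
        # Term 3: confirmation releases the funds automatically, with no
        # intermediary deciding whether to pay.
        self.balances[self.seller] = self.balances.get(self.seller, 0) + self.deposited
        self.deposited = 0
        self.settled = True


escrow = ToyEscrow(buyer="alice", seller="bob", price=100)
escrow.deposit("alice", 100)
escrow.confirm_delivery("alice")
print(escrow.balances)
```

If one of the permission checks above were omitted, anyone could trigger the payout, which is the kind of flaw a security audit looks for.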
The Growing Threat Landscape
The need for robust security measures is underscored by the scale of crypto-related thefts. In 2025, attackers stole $3.4 billion worth of crypto funds, a slight increase from the previous year. This highlights the urgency of developing and deploying AI-driven security solutions to protect against such threats.
Future Implications
Industry figures expect AI agents to become major participants in crypto transactions. Circle CEO Jeremy Allaire predicts that within five years, billions of AI agents will be handling stablecoin transactions on behalf of users, making everyday payments more efficient and secure. Similarly, former Binance CEO Changpeng ‘CZ’ Zhao believes that crypto will become the native currency for AI agents.
Expert Insights
In a post on X, Dragonfly Capital’s managing partner Haseeb Qureshi highlighted the inherent risks and complexities of crypto transactions. He noted that while smart contracts were not designed for human intuition, AI-intermediated, self-driving wallets could revolutionize the way we interact with these digital assets. Qureshi believes that AI agents will help mitigate the risks associated with large transactions and manage complex operations seamlessly.
Conclusion
The launch of EVMbench gives developers and security experts a way to measure how well AI models handle smart contract security. As the crypto landscape continues to evolve, the integration of advanced AI models will be crucial in ensuring the integrity and security of these digital agreements, and the future of crypto transactions is likely to be shaped by the capabilities of AI agents.
