TDPel Media News Agency

OpenAI launches new benchmark to test AI agents on crypto smart contract security in collaboration with Paradigm and OtterSec

By Temitope Oke

OpenAI has unveiled a new benchmark designed to test how well AI models can detect, patch, and even exploit vulnerabilities in crypto smart contracts.

The initiative, called EVMbench: Evaluating AI Agents on Smart Contract Security, was released in partnership with crypto investment firm Paradigm and security specialists OtterSec.

Its goal is to understand how AI could operate in real-world environments where billions of dollars are at stake.


AI Agents Face 120 Curated Smart Contract Vulnerabilities

The benchmark draws on 120 vulnerabilities identified from 40 smart contract audits, many sourced from open-source audit competitions.

AI agents were tasked with spotting and exploiting these vulnerabilities in a controlled, economically meaningful setting.

Anthropic’s Claude Opus 4.6 led the field with an average “detect award” of $37,824, followed by OpenAI’s OC-GPT-5.2 at $31,623 and Google’s Gemini 3 Pro at $25,112.

OpenAI noted that smart contracts are increasingly significant in the financial world, securing billions in crypto assets.

Evaluating AI in this context is essential because these models could be transformative for both attackers and defenders.


The Future of AI-Driven Crypto Transactions

Tech leaders are already imagining a future where AI agents handle crypto payments on behalf of users.

Circle CEO Jeremy Allaire predicted that within five years, billions of AI agents could be transacting in stablecoins for everyday payments.

Former Binance CEO Changpeng “CZ” Zhao also suggested that crypto could become the “native currency” for AI agents, streamlining financial operations without human intervention.

Haseeb Qureshi, managing partner at Dragonfly, highlighted why this development might be necessary.

He explained that smart contracts, while revolutionary, were never designed for human intuition, making large transactions nerve-wracking due to risks like drainer wallets.

