
Can AI Agents Boost Ethereum Security? OpenAI and Paradigm Created a Testing Ground

by admin
February 18, 2026
in Guide

In brief

  • EVMbench tests AI agents on 120 real-world Ethereum smart contract vulnerabilities.
  • Tool evaluates detection, patching, and exploitation across three distinct modes.
  • GPT-5.3-Codex achieved a 72.2% success rate in exploit mode testing.

ChatGPT maker OpenAI and crypto-focused investment firm Paradigm have introduced EVMbench, a benchmark intended to help improve the security of Ethereum Virtual Machine (EVM) smart contracts.

It is designed to evaluate AI agents’ ability to detect, patch, and exploit high-severity vulnerabilities in those contracts.

Smart contracts are the heart of the Ethereum network, holding the code that powers everything from decentralized finance protocols to token launches. The weekly number of smart contracts deployed on Ethereum reached an all-time high of 1.7 million in November 2025, with 669,500 deployed last week alone, according to Token Terminal.

EVMbench draws on 120 curated vulnerabilities from 40 audits, most sourced from open audit competitions such as Code4rena, according to an OpenAI blog post. It also includes scenarios from the security auditing process for Tempo, Stripe’s purpose-built layer-1 blockchain focused on high-throughput, low-cost stablecoin payments.

Payments giant Stripe launched the public testnet for Tempo in December, saying at the time that it was being built with input from Visa, Shopify, and OpenAI, among others.

The goal is to ground testing in economically meaningful, real-world code, particularly as AI-driven stablecoin payments expand, OpenAI added.

Introducing EVMbench—a new benchmark that measures how well AI agents can detect, exploit, and patch high-severity smart contract vulnerabilities. https://t.co/op5zufgAGH

— OpenAI (@OpenAI) February 18, 2026

EVMbench evaluates AI models across three modes: detect, patch, and exploit. In “detect,” agents audit repositories and are scored on their recall of ground-truth vulnerabilities. In “patch,” agents must eliminate vulnerabilities without breaking intended functionality. Finally, in “exploit,” agents attempt end-to-end fund-draining attacks in a sandboxed blockchain environment, with grading performed via deterministic transaction replay.
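
As a rough illustration of how the detect and patch criteria described above could be expressed, the Python sketch below computes recall against a set of ground-truth findings and checks the two conditions a patch would need to meet. It is not EVMbench's actual grading code: the function names, the `PatchResult` fields, and the finding IDs are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


def detect_recall(reported: set[str], ground_truth: set[str]) -> float:
    """Fraction of ground-truth vulnerabilities that the agent reported."""
    if not ground_truth:
        raise ValueError("ground_truth must not be empty")
    return len(reported & ground_truth) / len(ground_truth)


@dataclass
class PatchResult:
    # Hypothetical fields; the real benchmark may record different signals.
    exploit_still_works: bool      # does the known exploit still succeed?
    functional_tests_passed: bool  # do intended-behavior tests still pass?


def patch_succeeds(result: PatchResult) -> bool:
    """A patch counts only if it removes the bug and keeps functionality."""
    return (not result.exploit_still_works) and result.functional_tests_passed


# Hypothetical finding IDs, for illustration only.
truth = {"H-01", "H-02", "H-03", "H-04"}
found = {"H-01", "H-03", "M-07"}  # M-07 is not in the ground truth

print(detect_recall(found, truth))  # 0.5
```

Because recall only counts how many known vulnerabilities were found, the extra `M-07` report does not change the score in this sketch. An agent that audits a repository only partially would score lower, which matches the weakness in detect mode that the researchers describe.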

In exploit mode, GPT-5.3-Codex running via OpenAI’s Codex CLI achieved a score of 72.2%, compared to 31.9% for GPT-5, which was released six months earlier. Performance was weaker in the detect and patch tasks, where agents sometimes failed to audit exhaustively or struggled to preserve full contract functionality.

The ChatGPT maker’s researchers cautioned that EVMbench does not fully capture real-world security complexity. Still, they added that measuring AI performance in economically relevant environments is critical as models become powerful tools for both attackers and defenders.

Sam Altman’s OpenAI and Ethereum co-founder Vitalik Buterin have previously been at odds over the pace of AI development.

In January 2025, Altman said that his firm was “confident we know how to build AGI as we have traditionally understood it.” Buterin, by contrast, has argued that AI systems should include a “soft pause” capability that could temporarily restrict industrial-scale AI operations if warning signs emerge.




© Copyright 2025 All Rights Reserved.
