The fundamental flaw in the current AI agent boom is a mistaken trust model, according to Gu.

Charles Hoskinson, founder and CEO of Cardano’s Input Output, said that by 2035 they will become more relevant than humans on the internet. Coinbase CEO Brian Armstrong, recently said “very soon there are going to be more AI agents than humans making transactions” and Binance Founder Changpeng Zhao, predicted they “will make one million times more payments than humans.”

Ultimate inside threat

Gu said many popular, open-source AI applications are built under the assumption that because they run locally on a user’s computer or connect via standard chat apps like WhatsApp, they are safe from external threats.

The reality is entirely the opposite, he noted. The moment a user grants an AI agent permission to read local system storage, view execution histories or manage personal email and business database credentials, that agent becomes the ultimate inside threat.

CertiK’s recent analysis of early-state, rapidly growing agent structures uncovered a staggering accumulation of security vulnerabilities, including hundreds of critical security advisories, unpatched common vulnerabilities and exposures (CVEs) and other massive exposures of local credentials and session memories resulting from completely inconsistent boundary checks.

More alarming yet is how easily these autonomous systems can be completely redirected at the reasoning layer without a single line of malicious code ever being written, Gu emphasized.

Through basic “prompt injection” attacks, a bad actor can embed hidden natural language instructions inside a benign webpage, a PDF document, or an incoming email, he added.

When the unisolated AI agent reads that file to process a task for the user, it fails to separate trusted system commands from the untrusted external data, Gu explained. The agent then silently overwrites its original rules, obeys the malicious instruction, and can be forced to exfiltrate data or trigger unauthorized fund transfers.

Hyperfast exploits

Gu revealed that CertiK discovered hundreds of malicious skills, fake installers, and lookalike dependency packages sitting directly on open agent utility hubs. Because these malicious plug-ins use standard natural language to subtly influence the agent’s behavior and change its goals, they completely bypass legacy, signature-based antivirus software.

“The scam apps use natural language to influence behavior, making them totally resistant to traditional antivirus scans,” Gu explained. “And right now, it is even easier to scam the machine than it is to scam a human.”

In what Gu describes as a bizarre evolution of financial crime, CertiK’s telemetry has observed an explosion of onchain, automated scams that run for only 10 minutes or a few hours before completely vanishing.

These hyperfast, ephemeral exploits are specifically designed by hackers to target and scam other autonomous AI trading bots and automated agent systems, executing machine-on-machine financial drainage before any human even realizes a compromise has occurred.

Gu states that the software engineering industry must completely abandon its reliance on trust-based interactions and move immediately toward an isolated, “Zero Trust” architecture where every command and dependency is continuously verified.

More For You

Hacker facing screens with lines of code (Boitumelo/Unsplash)

The campaign targets crypto, DeFi, AI and security developers with fake tooling packages to steal wallets, SSH keys, GitHub tokens, cloud credentials and browser data.

What to know:

  • A newly discovered supply-chain campaign called TrapDoor has planted more than 34 malicious packages across npm, PyPI and Crates.io to target crypto and cloud developers.
  • The packages, disguised as mundane developer utilities and security tools, were designed to steal SSH keys, wallet files, AWS credentials, GitHub tokens, browser data and…

About the Author

Leave a Reply

Your email address will not be published. Required fields are marked *

Related Stories