The Two Faces of AI Agents on the Blockchain

Key Takeaways

As AI services rapidly evolve, they are becoming deeply embedded in everyday life and are increasingly developing into automated tools connected to external data. The Model Context Protocol (MCP), which enables this, enhances efficiency but also introduces security vulnerabilities such as tool poisoning and remote code execution.
These issues are not limited to MCP alone but are commonly found across most AI agent environments, including ChatGPT plugins and LangChain. The threat is particularly significant when AI is combined with blockchain, as it directly handles assets—yet most projects remain ill-prepared for such risks.
To secure AI agents, countermeasures such as tool integrity verification, the principle of least privilege, and user approval procedures are essential. Since AI is not merely a bot but a system that makes autonomous decisions, blockchain projects utilizing AI agents must incorporate appropriate security designs moving forward.

1. Background - AI Tools Can Be Attacked

These days, when people have questions, they no longer go straight to Google. Instead, they increasingly turn to AI services like ChatGPT, Gemini, or Perplexity. This indicates that AI has already deeply permeated our daily lives, changing how we live and think.

These widely-used AI services are rapidly evolving to directly access more information and data. However, the more accessible they become, the more attempts there are to exploit them. The existence of open pathways to information and data means someone will try to enter, and the more valuable the information inside, the stronger the incentive for attacks.

As AI systems advance, these malicious attempts are becoming more sophisticated. When combined with other industries involving money, such as blockchain or financial systems, such attacks can no longer be dismissed as minor threats, as they could potentially lead to real financial loss or social disruption.

1.1 The Rise of MCP and Attack Methods Targeting It

1.1.1 Model Context Protocol (MCP): The USB-C Port of the AI World

The Model Context Protocol (MCP) serves as a universal connector for AI systems. Much like how USB-C allows various devices to connect through a single port, MCP enables AI to access a wide range of external data and development tools using a standardized method. In the past, integrating AI with specific tools or systems required custom development, which increased both maintenance and development costs. MCP eliminates this complexity, offering a standardized way for AI to seamlessly connect with multiple systems. This allows AI models to interact with external data, access real-time information, and call external system functions—providing practical capabilities.

AI that can communicate via MCP evolves beyond a simple question-and-answer engine into an automated tool linked to real-world tasks.

In particular, Zapier uses MCP to implement its workflow automation service, handling millions of requests. Similarly, Cursor integrates powerful AI features into existing code editors (especially those based on VS Code) using MCP, boosting developer productivity. In this way, the combination of AI and MCP is rapidly enhancing productivity across various fields.

1.1.2 Attack Methods Threatening MCP

However, AI has a critical weakness—it executes instructions entered into the prompt exactly as they are, regardless of the user’s true intent. With MCP providing a channel for accessing external data and system functions, a variety of attack methods that exploit AI have emerged.

One prominent attack method is the Tool Poisoning Attack (TPA). TPA exploits legitimate user requests to trick an AI agent into performing unintended or malicious actions. The attack typically unfolds as follows:

The attacker inserts maliciously altered information (e.g., a compromised MCP service or malicious tool) into the context area that the AI agent references.
When a user sends a request, the AI agent consults both the user’s request and the tainted context.
The AI agent then calls the malicious tool or service within the context, ultimately carrying out the attacker’s desired action.

For example, consider a malicious tool like the one shown below:

Source: Invariantlabs

This tool appears to simply perform addition. However, it contains a hidden malicious command embedded in a comment unrelated to its functionality. Before executing the addition, the AI interprets and runs the instruction inside <IMPORTANT>, which reads sensitive files from the user’s home directory (e.g., “~/.cursor/mcp.json” and “~/.ssh/id_rsa”, such as configuration files or SSH private keys). It then includes the content in a parameter named sidenote when calling the function.

From the user’s perspective, they believe they are just using a basic addition function. In reality, they are unknowingly exposing sensitive data within the AI system or logs. To make detection even harder, the attacker instructs the AI not to mention that it accessed any files, keeping the user unaware of the breach.

Other major attack methods include:

Tool Rug Pull: This is an attack in which an AI tool that initially operates normally is later replaced with a malicious one. Because MCP does not verify tool integrity or perform signature checks, any changes made to the tool's contents go undetected. As a result, users may unknowingly use the altered tool and inadvertently hand over information such as API keys to attackers.
Cross-Server Tool Shadowing: When MCP utilizes multiple servers, a malicious server can intercept or alter the functionality of tools hosted on other servers. Since the AI aggregates tool descriptions from all servers, it cannot distinguish which ones are trustworthy. Consequently, a malicious command may be executed while the user log still reflects only the use of legitimate tools.

Other attacks include Remote Code Execution (RCE) via shell commands, OAuth token theft, and excessive API privilege escalation, among others.

The core reason these issues arise is that MCP was designed from the outset to prioritize flexibility and integration over security. It lacks modules responsible for authentication and trust, and offers no features for context encryption or tool verification. In particular, the structure that delegates all decision-making to the AI is inherently vulnerable.

The most fundamental defense is for users to manually review the context and tool definitions. However, due to MCP’s code-based architecture, user visibility is limited—making this difficult to put into practice.

1.2 These Vulnerabilities Aren’t Unique to MCP

The problem is that these security vulnerabilities aren’t limited to MCP alone. Most AI agent environments also collect and manipulate data through external APIs or plugins, and even if not using MCP, they typically rely on their own APIs or SDKs.

In fact, security flaws stemming from plugins or external API calls have been repeatedly identified across various AI platforms. For example, in March 2024, Salt Security discovered multiple vulnerabilities in ChatGPT plugins that could lead to account takeovers and exposure of sensitive user data. These included authentication bypasses during plugin installation and code injection flaws, allowing attackers to steal session tokens or access external resources without user interaction.

Open-source agent frameworks like LangChain are no exception. According to a report by Unit 42, certain modules in LangChain were found to have vulnerabilities that enabled arbitrary code execution and sensitive data leakage. Its SQL integration features, in particular, were susceptible to SQL injection attacks. These vulnerabilities could allow attackers to gain system privileges or manipulate internal data through software development kits (SDKs) or chain components.

In the end, it’s not just MCP—any AI agent environment that communicates with external systems via plugins, SDKs, or extensions is subject to similar vulnerabilities.

2. Takeaway - Blockchain: Where AI Agents and Finance Most Closely Converge

2.1 Blockchain Projects Actively Adopting AI Agents

AI agents are emerging as a core technology within the blockchain ecosystem. They are evolving beyond simple bots into systems that can autonomously perform on-chain tasks, interact with users via natural language, and even carry out economic actions on their behalf.

Here are some notable examples:

Fetch.ai: Fetch.ai aims to build a decentralized ecosystem for autonomous AI agents. Using uAgents, Agentverse, and the AI Engine, it allows users to register and run diverse AI agents on-chain. These agents can control asset trading, analyze on-chain data, and interact with smart wallets through natural language. They also support real-time collaboration and data processing via the DeltaV interface and the SQD.ai oracle.
Eliza OS: Eliza OS is an open-source AI agent framework that helps anyone build AI systems interacting with various blockchains. It integrates with social platforms and offers a flexible plugin-based structure for trading, data oracles, and chatbot development. Notably, its plugins enable cross-chain agents compatible with Ethereum, Solana, Injective, and more—offering high scalability.
Virtuals Protocol: This protocol envisions an environment where AI agents autonomously build products, trade services, and interact freely. Each agent has its own token, and their interactions and value creation are recorded on-chain. Through its GAME (Generative Autonomous Modular Environment) Framework, agents make autonomous decisions based on context and AI tools, while connecting with platforms like Roblox and Telegram.
Injective's iAgent: iAgent is an AI tool specialized for DeFi, enabling users to generate on-chain transactions, manage portfolios, and perform data analysis via prompt commands. After integrating with Eliza OS, it now supports multi-agent systems, custom integrations, and real-time event responses—positioning itself as a powerful automation solution in Web3.

Some blockchain projects are already integrating AI agents that directly utilize MCP:

Thirdweb: Thirdweb offers Insight, Engine, and Nebula—three core services that connect AI agents to EVM-compatible blockchains via MCP, enabling streamlined development and interaction.
SkyAI: SkyAI is building a blockchain-native AI infrastructure for Web3 applications. By leveraging MCP, it supports multi-chain data access and AI agent deployment, aiming to simplify development and expand AI’s practical usage across blockchains.
ArcBlock: ArcBlock integrates MCP into its AIGNE platform to address identity verification, security, and financial challenges—MCP’s current limitations—through decentralized identifiers (DIDs) and cryptocurrency payment systems.

In summary, AI agents are not merely an add-on to blockchain—they are reshaping the way blockchain systems interface, automate, and interact. As more projects adopt AI and harness on-chain data, they are likely to build increasingly autonomous and user-friendly ecosystems for both developers and end users.

2.2 Have You Considered the Security of AI Agents?

Blockchain is a technology closely tied to finance—and AI agents are rapidly integrating into this space. However, despite the fact that a single misjudgment by an AI could directly lead to financial losses, most blockchain projects currently lack adequate security measures to address this risk. In fact, in the official documentation of Fetch.ai, Eliza OS, Virtuals Protocol, and Injective mentioned earlier, there are few—if any—clear discussions about attack vulnerabilities or defense strategies for AI tools.

The table above summarizes how major blockchain-based AI agent platforms are responding (or not responding) to key security threats they may face. Most platforms lack direct countermeasures against attacks like tool poisoning, tool redefinition, cross-server shadowing, and remote code execution (RCE), or attempt to mitigate risks only through indirect limitations.

Virtuals Protocol, for example, tries to partially address these threats via economic restrictions built into its ACP (Agent Commerce Protocol) and escrow mechanisms. Other platforms impose constraints like import restrictions or message validation, which offer some risk reduction—but still fall short of comprehensive security strategies.

This kind of negligence is dangerous. When AI agents are designed to fully trust external APIs or oracle data and make automated decisions based on that input, attackers can exploit the system to trigger catastrophic consequences. Tool Poisoning Attacks (TPA) in particular enable the following type of scenario:

A user manages assets via an AI-powered smart wallet.
An attacker manipulates the external oracle or API response that the AI agent references, making a malicious address appear as a legitimate one.
The AI agent accepts the manipulated data without validation and transfers the user's assets to the attacker's address.

Blockchain-specific structural risks also exacerbate the threat:

Manipulated Price Feeds: If an AI agent executes trades based on oracle-provided prices, a compromised oracle could mislead it into executing unfavorable transactions for the user.
Malicious Smart Contract Interactions: An attacker may disguise a malicious smart contract as a trustworthy tool, leading the AI agent to call it. This can result in asset theft or contract abuse.
Tampered Transaction Signing Logic: If the library or API used for signing is exposed to TPA, the agent could be tricked into signing malicious transactions.
Log and Error Message Manipulation: Attackers might trick the agent into sending logs or debug data externally, potentially leaking private keys or other sensitive information.
Data Query API Tampering: When querying balances or transaction history, attackers can alter the response to mislead users or exfiltrate data during the process.

These scenarios highlight the critical need for stronger safeguards in blockchain-integrated AI agents—especially those entrusted with financial decision-making.

2.3 Preventing Disasters Before They Happen

So, how can we prepare for these threats?

First, there needs to be a robust system for verifying the integrity of AI tools and APIs. This includes implementing signature-based deployment methods and validation processes within Trusted Execution Environments (TEEs). Additionally, tool definitions and metadata should be recorded on an on-chain registry, allowing for authenticity verification. This minimizes the risk of tools being tampered with or altered maliciously.

Next, strong guardrails must be put in place at the prompt level, where the AI’s input and output are handled. Prompt structures should be modular and designed to prevent the injection of malicious commands. A secondary validation system should also be introduced to filter out unexpected or harmful instructions. These safeguards can prevent AI agents from unintentionally executing dangerous actions.

Furthermore, the principle of least privilege must be strictly enforced. Sensitive operations should always require explicit user approval. For example, when an AI agent attempts to access critical on-chain data or interact with external systems related to assets, it should not proceed without the user's clear and confirmed consent.

Lastly, accountability must extend to AI tool providers. This can be achieved by defining clear Service Level Agreements (SLAs) and establishing a responsibility framework tied to tokenomics. Specifically, providers could be required to stake a certain amount of tokens, which may be slashed if their tools cause harm. DAO-based governance could also impose penalties, creating a system that balances trust with accountability.

Ultimately, AI agents must be treated not as mere software, but as “automated authority systems”—and security models must be applied accordingly. AI is no longer just a chatbot. It has evolved into a decision-making entity, capable of making autonomous judgments and taking action. Once connected to blockchain data or sensitive information, this automated decision-making power becomes both a strength and a critical vulnerability.

As data integration via MCP becomes more widespread, the equation “automated decision-making + external data integration = attack target” is becoming a reality. In this context, traditional API-centric security models are no longer sufficient.

Blockchain projects that operate AI agents must acknowledge these risks as real and imminent, and embed appropriate security mechanisms directly into their system design. If they don’t, the question is no longer “if” a major breach targeting user assets will happen—but “when”.

3. Resource

Related Articles, News, Tweets etc. :