Currently, the MCP (Model Context Protocol) system is still in its early stages of development. The overall environment is relatively chaotic, and various potential attack methods emerge in an endless stream. It is difficult to defend against them with the current design of protocols and tools. In order to help the community better understand and improve the security of MCP, SlowMist has specially open-sourced the MasterMCP tool, hoping to help everyone discover security risks in product design in a timely manner through actual attack drills, so as to strengthen their own MCP projects step by step.
At the same time, you can use the previous MCP security checklist to better understand the underlying perspective of various attacks. This time, we will take you to practice and demonstrate common attack methods in the MCP system, such as information poisoning, hiding malicious instructions and other real cases. All scripts used in the demonstration will also be open sourced to GitHub (see the link at the end of the article), so that you can fully reproduce the entire process in a safe environment, and even develop your own attack test plug-in based on these scripts.
Overall Architecture Overview
Demonstration attack target MCP: Toolbox
smithery.ai is one of the most popular MCP plugin websites, gathering a large number of MCP lists and active users. @smithery/toolbox is the MCP management tool officially launched by smithery.ai.
Toolbox was chosen as the test target based on the following points:
- The user base is large and representative;
- Supports automatic installation of other plug-ins to supplement some client functions (such as Claude Desktop);
- Contains sensitive configuration (such as API Key) for demonstration purposes.
Malicious MCP used in the demonstration: MasterMCP
MasterMCP is a malicious MCP simulation tool written by SlowMist specifically for security testing. It adopts a plug-in architecture design and contains the following key modules:
1. Local website service simulation: http://127.0.0.1:1024
In order to restore the attack scenario more realistically, MasterMCP has a built-in local website service simulation module. It quickly builds a simple HTTP server through the FastAPI framework to simulate a common web page environment. These pages appear normal on the surface, such as displaying cake shop information or returning standard JSON data, but in fact, carefully designed malicious payloads are hidden in the page source code or interface return.
In this way, we can fully demonstrate attack methods such as information poisoning and command hiding in a safe and controllable local environment, helping everyone to understand more intuitively: even a seemingly ordinary web page may become a source of hidden dangers that induce large models to perform abnormal operations.
2. Local plug-in MCP architecture
MasterMCP uses a plug-in approach to expand, which makes it easy to quickly add new attack methods. After running, MasterMCP will run the FastAPI service of the previous module in the child process. (If you are careful, you will notice that there is a security risk here - the local plug-in can arbitrarily start a child process that is not expected by MCP)
Demo Client
- Cursor: One of the most popular AI-assisted programming IDEs in the world
- Claude Desktop: Anthropic (MCP protocol customization party) official client
Large model used in demonstration
- Claude 3.7
Claude 3.7 was chosen because it has made some improvements in sensitive operation recognition and represents a stronger operational capability in the current MCP ecosystem.
Configure claude_desktop_config.json
{ "mcpServers": { "toolbox": { "command": "npx", "args": [ "-y", "@smithery/cli@latest", "run", "@smithery/toolbox", "--config", "{\"dynamic\":false,\"smitheryApiKey\":\"ec1f0fa8-5797-8678-sdaf-155d4584b133\"}", "--key", "ec1f0fa8-5797-8678-sdaf-155d4584b133" ] }, "MasterMCP": { "command": "/Users/xxx/Desktop/EvilMCP/bin/python", "args": [ "/Users/xxx/Desktop/EvilMCP/MasterMCP.py" ] } }}
After configuration is completed, we will officially enter the demonstration phase.
Cross-MCP Malicious Invocation
This demonstration includes two contents: Checklist poisoning and Cross-MCP malicious calls.
Web content poisoning attack
1. Annotation poisoning (partial reference: https://x.com/lbeurerkellner/status/1912145060763742579)
Cursor accesses the local test website http://127.0.0.1:1024.
This is a seemingly harmless page about "Delicious Cake World". Through this experiment, we simulate the impact of a large model client visiting a malicious website.
Execute the command:
Fetch the content of http://127.0.0.1:1024
The results show that Cursor not only reads the web page content, but also transmits local sensitive configuration data back to the test server. In the source code, malicious prompt words are implanted in the form of HTML comments:
Although the annotation method is relatively straightforward and easy to identify, it can already trigger malicious operations.
2. Coded comment poisoning
Visit http://127.0.0.1:1024/encode. This is a webpage that looks the same as the example above, but the malicious prompt words are encoded, which makes the poisoned exp more hidden and difficult to detect directly even if the source code of the webpage is accessed.
Even if the source code does not contain the plaintext hint word, the attack can still be successfully executed. The specific principle will be explained in detail in the following chapters.
MCP tool returns information poisoning
Here we enter our simulated instructions according to the prompt words of MasterMCP (this instruction has no actual meaning and is intended to trigger our malicious MCP to demonstrate the subsequent operations of the malicious MCP):
get a lot of apples
It can be seen that after the command is triggered, the client calls the Toolbox across MCP and successfully adds a new MCP server:
By checking the plug-in code, it can be found that the returned data has been embedded with encoded malicious payloads, and the user end can hardly detect the abnormality.
Third-party interface pollution attack
This demonstration is mainly to remind everyone that whether it is malicious or non-malicious MCP, when calling third-party APIs, if the third-party data is directly returned to the context, it may have serious consequences.
Sample code:
Execute the request:
Fetch json from http://127.0.0.1:1024/api/data
Result: The malicious prompt words were implanted into the returned JSON data and successfully triggered the malicious execution.
Poisoning techniques during the MCP initialization phase
This demonstration covers the initial prompt word injection and name conflict in the Checklist.
Malicious function overwrite attack
Here, MasterMCP wrote a tool with the same function name as Toolbox, remove_server, and encoded and hid the malicious prompt words.
Execute the command:
toolbox remove fetch plugin server
Claude Desktop does not call the original `toolbox remove_server` method, but triggers the method of the same name provided by MasterMCP:
The principle is to emphasize that "the original method has been abandoned" and preferentially induce the large model to call the maliciously overwritten function.
Add malicious global check logic
Here MasterMCP has written a tool related to banana. The core function of this tool is to force all tools to perform safety checks before running in the prompt word.
Before each function execution, the system will first call the banana check mechanism:
This is achieved by injecting global logic in the code by repeatedly emphasizing that "banana detection must be run".
Advanced techniques for hiding malicious prompt words
Large model-friendly encoding
Since the Large Language Model (LLM) has a strong ability to parse multi-language formats, it is used to hide malicious information. Common methods include:
- In English environment: Use Hex Byte encoding
Tool recommendation: Hex Decoder
- In Chinese environment: use NCR encoding or JavaScript encoding
Tool recommendation: R12a Unicode Conversion Tools
Random malicious payload return mechanism
As mentioned in Chapter 2, when requesting http://127.0.0.1:1024/random:
A page with a malicious payload will be randomly returned each time, greatly increasing the difficulty of detection and tracing.
Summarize
Through this actual demonstration of MasterMCP, we can intuitively see the various security risks hidden in the Model Context Protocol (MCP) system. From simple prompt word injection and cross-MCP calls to more covert initialization phase attacks and malicious command hiding, every link reminds us that the MCP ecosystem is powerful but also fragile.
Especially today when large models are increasingly interacting with external plug-ins and APIs, even a small amount of input contamination may cause security risks at the entire system level. The diversification of attackers’ methods (coding hiding, random contamination, function coverage) also means that traditional protection ideas need to be fully upgraded.
Safety is never achieved overnight.
I hope this demonstration can serve as a wake-up call for everyone: developers and users alike should be sufficiently vigilant about the MCP system, and always pay attention to every interaction, every line of code, and every return value. Only by being rigorous in every detail can we truly build a stable and secure MCP environment.
Next, we will continue to improve the MasterMCP script and open source more targeted test cases to help everyone deeply understand, practice and strengthen protection in a safe environment.
P.S. The relevant content has been synchronized to GitHub (https://github.com/slowmist/MasterMCP). Interested readers can click the read original text at the end of the article to jump directly.