MCP Trust Assessment

Part of the Adoption Curve series · February 2026

The Model Context Protocol is becoming the USB standard for AI agents. Like USB, the question is what happens when you plug in something you shouldn't trust.

What Is MCP?

The Model Context Protocol, introduced by Anthropic in November 2024, is an open standard for connecting AI models to external tools and data sources. Instead of building custom integrations for every tool, MCP provides a universal interface: one protocol, any tool. It's been adopted by Claude, Cursor, VS Code, and thousands of community implementations.

It solved a real problem. And in doing so, it created a new attack surface that the security community is still mapping.

What It Does Well

MCP's architecture has genuine security advantages. The spec team shipped three major authorisation revisions in 2025 alone, responding actively to discovered issues. The security research community mobilised quickly with OWASP publishing a dedicated MCP Top 10 framework.

Protocol-Level Boundaries

Security policies enforced at the protocol layer. The client/server/host separation creates natural trust boundaries that enable Zero Trust principles.

Local-First Design

In the stdio transport model, data doesn't leave the device unless explicitly approved. Human-in-the-loop confirmation is built into the spec.

Rapid Spec Iteration

Three auth revisions in 2025 (March, June, November). The team responds to vulnerabilities. OAuth 2.1 with PKCE is now the recommended baseline.

The Core Problem

The spec is sound. The ecosystem isn't.

MCP is not insecure by design—it's insecure by default in practice. The protocol assumes careful implementation, user vigilance, and trusted servers. The real world delivers none of these consistently.

Risk Matrix

Critical— Proven in the wild

Tool Poisoning

Malicious instructions can be hidden inside MCP tool descriptions. They’re invisible to users in most client UIs but fully visible to the AI, which follows them as if they were legitimate. Invariant Labs demonstrated success rates above 70%.

Cross-Server Data Exfiltration

When multiple MCP servers connect to the same client, a malicious server can instruct the AI to read data from your other servers and send it externally. Invariant Labs demonstrated full WhatsApp message history exfiltration this way.

First Malicious MCP Server on npm

In September 2025, the postmark-mcp package was discovered on npm — a deliberately malicious MCP server designed to harvest and exfiltrate emails from AI workflows. The supply chain attack vector is now proven.

High— Documented attack vectors

Rug Pull Attacks

A tool behaves legitimately on Day 1 to earn trust and approval. After approval, the server silently updates its behaviour to include malicious actions. Most MCP clients don’t re-notify users when tool definitions change.

Credential Exposure at Scale

A study of 5,200 MCP servers found only 8.5% use OAuth. Over half rely on static API keys, and 79% pass credentials through environment variables. These keys rarely rotate and have no usage monitoring.

Command Injection in Implementations

Elastic Security Labs found 43% of tested MCP implementations contained command injection flaws. 30% permitted unrestricted URL fetching, opening doors to SSRF attacks against internal networks.

Output Poisoning

Tool responses — not just descriptions — can contain hidden instructions. CyberArk demonstrated that even error messages and return values can manipulate the AI’s subsequent actions.

Medium— Ecosystem and governance risks

Enterprise Auth Spec Immaturity

The MCP authorisation spec has been revised three times in 2025 alone. Enterprise integration remains difficult, and what you build today may need significant revision as the protocol evolves.

Shadow MCP Servers

Developers can connect MCP servers to their AI tools without IT knowledge or approval. OWASP lists this as a Top 10 MCP risk — invisible integrations that bypass organisational security policy.

Anthropic Doesn’t Audit the Ecosystem

Anthropic explicitly states they do not manage or audit any MCP servers. Their guidance is “only connect to trusted MCP servers” — but trust verification is left entirely to the user.

CVE Timeline

Eight months of escalating discoveries. The pattern is clear: vulnerabilities are being found faster than they're being fixed.

Apr 2025High

WhatsApp MCP — Full message history exfiltration via tool poisoning

May 2025Critical

Anthropic MCP Inspector — Unauthenticated RCE; filesystem and API keys exposed (CVE-2025-49596)

Jun 2025Critical

mcp-remote (OAuth proxy) — OS command injection; 437K+ downloads affected; CVSS 9.6 (CVE-2025-6514)

Jul 2025Critical

Anthropic Filesystem MCP — Sandbox escape via directory containment + symlink bypass (CVE-2025-53109/53110)

Aug 2025Critical

Smithery Registry — Path traversal exposing credentials across 3,000+ tenants

Sep 2025Critical

postmark-mcp (npm) — First confirmed malicious MCP server; harvested and exfiltrated emails

Sep 2025High

Anthropic Git MCP Server — Three RCE vulnerabilities via prompt injection and argument injection (CVE-2025-68143/44/45)

Oct 2025High

Framelink Figma MCP — Command injection via unsanitised input (CVE-2025-53967)

The Lethal Trifecta

Simon Willison identified the fundamental architectural risk that makes MCP uniquely dangerous compared to traditional API integrations:

Private Data

MCP servers connect to your files, databases, messages, and credentials. The AI sees everything the server exposes.

Untrusted Instructions

Tool descriptions, responses, and error messages can all contain hidden directives that the AI follows without question.

Exfiltration Pathways

Other MCP servers, HTTP requests, and tool outputs provide multiple channels to send private data externally.

When all three conditions are present, any MCP server can potentially access and exfiltrate data from any other server connected to the same client.

The Practitioner's Dilemma

Here's the uncomfortable truth: if you're already working in an agentic coding environment with shell access, you can do most of what MCP does through the CLI. Direct tool access, battle-tested security models, full command transparency, no extra protocol layer.

MCP's real value is at the ecosystem level—standardising how any AI talks to any tool. At the individual practitioner level, it often adds complexity and attack surface for marginal convenience. The protocol is most valuable for tool discovery (advertising what's available), structured schemas (typed inputs and outputs instead of string parsing), portability across AI clients, and making tool access possible for non-developers who would never open a terminal.

This assessment was written using an AI agent running MCP servers for research and content management. The protocol being critiqued is the one powering the critique. That's the dilemma—MCP is genuinely useful, which is exactly why getting the security right matters.

Hardening Recommendations

The trust model needs to shift from "trust the user to vet everything" to "trust nothing, verify everything, contain blast radius."

Immediate

For All MCP Users

Only install MCP servers from sources you can audit. First-party and well-known open-source only.
Never enable auto-approval for tool calls. Review every action the AI takes through MCP tools.
Minimise the number of servers connected simultaneously. Cross-server exfiltration requires multiple servers.
Pin MCP server versions. Don't allow automatic updates without review.
Use dedicated API keys with minimal scopes and spending limits. Rotate regularly.

Enterprise

For Organisations

Maintain an approved MCP server registry. Vet and sign servers before allowing deployment.
Run MCP servers in containerised environments with network isolation. No direct access to production systems.
Integrate secrets management (Vault, AWS Secrets Manager) instead of environment variables.
Deploy mcp-scan or equivalent tooling in CI/CD pipelines. Automated scanning for tool poisoning and known vulnerabilities.
Monitor for Shadow MCP Servers. Developers connecting unapproved servers is an OWASP Top 10 MCP risk.

Hard No-Go Criteria

Do not deploy MCP in production if any of these apply:

• Auto-approval is enabled for tool calls (no human in the loop)
• MCP servers have direct access to production databases or credentials
• No process exists for vetting and approving new MCP server installations
• Regulatory data (HIPAA, PCI-DSS, SOX) is accessible through connected MCP servers
• Multiple untrusted MCP servers connect to the same client session

Bottom Line

The right standard with the wrong defaults.

MCP solved the right problem—AI needs a universal way to connect to tools. The architecture is sound, the spec is improving, and the security community is engaged. But 315 vulnerabilities in year one, the first malicious server already on npm, and an ecosystem where 53% of servers rely on static credentials tells you where we are. Use MCP. Harden it first.

Sources & Further Reading

This is the protocol layer assessment in the Adoption Curve series—tracking how each layer of the AI stack democratizes faster than it's secured.

The Adoption Curve OpenClaw Assessment