Security | 2026-03-29

How to Audit MCP Server Security Before Installing - 2026 Checklist

Security Team

Why MCP Server Security Matters

MCP (Model Context Protocol) servers give AI agents powerful capabilities — file access, API calls, database queries, browser control. But every capability is also a potential attack surface. A poorly built MCP server can leak your API keys, execute arbitrary code, or exfiltrate data to unknown endpoints.

As the MCP ecosystem grows past 4,000 servers, the quality gap is widening. Some servers are built by experienced teams with proper security reviews. Others are weekend projects with hardcoded secrets and no input validation. This guide helps you tell the difference before you install.

The 7-Point Security Audit Checklist

Before adding any MCP server to your agent stack, run through these checks:

1. Permission Scope: Does it ask for too much?

A good MCP server requests only the permissions it needs. A file search server should not need network access. A weather API server should not need filesystem write permissions.

Red flag: Servers that request broad permissions like "full filesystem access" or "unrestricted network" for simple tasks.

What to check: Read the server's skill.json or manifest file. Compare requested permissions against the stated functionality. If a calculator server wants to read your SSH keys, walk away.
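The comparison described above can be sketched in a few lines. This is a minimal illustration that assumes a hypothetical manifest layout with a top-level "permissions" list; real servers may declare permissions differently or not at all, and the permission names below are made up for the example.

```python
import json


def excess_permissions(manifest_text: str, expected: set[str]) -> set[str]:
    """Return permissions the manifest requests beyond the expected set."""
    manifest = json.loads(manifest_text)
    # Assumption: permissions live in a top-level "permissions" list.
    requested = set(manifest.get("permissions", []))
    return requested - expected


# Hypothetical manifest for a calculator server.
manifest_text = json.dumps({
    "name": "calculator",
    "permissions": ["compute", "filesystem:read", "network:outbound"],
})

extra = excess_permissions(manifest_text, expected={"compute"})
print(sorted(extra))
```

Anything left in the result is a permission you did not expect from the stated functionality and should be able to justify before installing.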

2. Secret Handling: Are API keys stored safely?

MCP servers often need API keys to connect to external services. The question is how they handle those keys.

Red flag: Hardcoded API keys in source code, keys stored in plaintext config files committed to git, or keys logged to stdout during operation.

What to check: Search the repository for patterns like sk-, api_key =, token =. Check .gitignore for .env exclusions. Look for environment variable usage (process.env.API_KEY) instead of hardcoded strings.
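A rough version of that search can be automated. The sketch below uses two illustrative regular expressions (assumptions, not an exhaustive rule set) to flag lines that look like hardcoded credentials while ignoring values read from the environment. Dedicated secret scanners cover far more formats.

```python
import re

# Illustrative patterns only; tune them for the services you use.
SECRET_PATTERNS = {
    "sk- style key": re.compile(r"\bsk-[A-Za-z0-9_-]{16,}"),
    "hardcoded assignment": re.compile(
        r"""(?i)\b(api_key|token|secret)\s*[:=]\s*["'][^"']{8,}["']"""
    ),
}


def find_secret_candidates(source: str) -> list[tuple[int, str]]:
    """Return (line number, pattern label) for each suspicious line."""
    hits = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for label, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((line_no, label))
    return hits


sample = "\n".join([
    'api_key = "abcd1234efgh5678"',                # hardcoded: flagged
    'api_key = os.environ["API_KEY"]',             # from environment: ignored
    'client = Client("sk-abcdefghijklmnop1234")',  # key-shaped literal: flagged
])

for line_no, label in find_secret_candidates(sample):
    print(f"line {line_no}: {label}")
```

Expect false positives (test fixtures, placeholder values) and review each hit by hand.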

3. Dependency Health: Are dependencies maintained?

A server with 200 npm dependencies from unmaintained packages is a supply chain risk. Each dependency is code you are trusting to run on your machine.

Red flag: Dozens of dependencies with no lockfile, dependencies with known CVEs, or packages last updated years ago.

What to check: Run npm audit (Node.js projects) or pip-audit (Python projects) against the project. Check the dependency count: fewer is safer. Confirm that a lockfile is committed and check when it was last updated.
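For Node.js servers, the count and lockfile checks can be done without installing anything. The following sketch assumes you have the text of package.json and a listing of the repository's top-level files; the function name and the "unpinned" heuristic are illustrative choices, and it does not replace a vulnerability audit.

```python
import json

# Lockfile names used by npm, Yarn, and pnpm.
LOCKFILES = {"package-lock.json", "yarn.lock", "pnpm-lock.yaml"}


def dependency_summary(package_json_text: str, repo_files: set[str]) -> dict:
    """Summarize dependency count, lockfile presence, and wildcard ranges."""
    pkg = json.loads(package_json_text)
    runtime = pkg.get("dependencies", {})
    dev = pkg.get("devDependencies", {})
    # Ranges such as "*" or "latest" accept any future release.
    unpinned = sorted(n for n, v in runtime.items() if v in ("*", "latest"))
    return {
        "runtime_count": len(runtime),
        "dev_count": len(dev),
        "has_lockfile": bool(LOCKFILES & repo_files),
        "unpinned": unpinned,
    }


# Hypothetical package.json content and file listing.
package_json_text = json.dumps({
    "dependencies": {"express": "^4.19.2", "left-pad": "*"},
    "devDependencies": {"typescript": "^5.4.0"},
})
summary = dependency_summary(package_json_text, {"package.json", "README.md"})
print(summary)
```

A missing lockfile or wildcard ranges mean the code you audit today may not be the code that gets installed tomorrow.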

4. Network Behavior: Where does data go?

Some MCP servers phone home — sending telemetry, usage data, or even your prompts to external servers. This is especially concerning for servers that process sensitive business data.

Red flag: Outbound HTTP calls to unknown domains, analytics/tracking SDKs, or WebSocket connections not documented in the README.

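What to check: Search the codebase for fetch(, axios, requests.post, and WebSocket. Map every outbound URL to a documented purpose. If a server described as "local" calls an undocumented endpoint such as analytics.example.com, treat it as possible data exfiltration until the maintainers explain it.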
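Mapping outbound URLs is easier when they are grouped by host. This sketch extracts literal http, https, ws, and wss URLs from source text; it is an assumption-laden first pass, since URLs assembled at runtime from variables or config will not appear. The hostnames in the sample are placeholders.

```python
import re
from collections import defaultdict
from urllib.parse import urlparse

# Matches literal URLs up to the next quote, whitespace, or bracket.
URL_PATTERN = re.compile(r"""(?:https?|wss?)://[^\s"'`)<>]+""")


def outbound_hosts(source: str) -> dict[str, list[str]]:
    """Group every literal URL found in the source by hostname."""
    hosts = defaultdict(list)
    for url in URL_PATTERN.findall(source):
        host = urlparse(url).hostname or "(unparsed)"
        hosts[host].append(url)
    return dict(hosts)


sample = "\n".join([
    'fetch("https://api.weather.example/v1/forecast")',
    'requests.post("https://analytics.example.com/collect", json=payload)',
    'new WebSocket("wss://stream.example.net/feed")',
])

for host, urls in sorted(outbound_hosts(sample).items()):
    print(host, len(urls))
```

Compare the resulting host list with the README; any host that is not documented deserves a closer look.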

5. Input Validation: Can prompts trigger dangerous actions?

MCP servers receive input from AI agents, which receive input from users. This creates a prompt injection chain — a user could craft input that makes the agent tell the MCP server to do something dangerous.

Red flag: No input sanitization, direct shell command execution from user input, or SQL queries built from string concatenation.

What to check: Look for exec(, eval(, subprocess.run(..., shell=True), or raw SQL queries built from user-supplied strings. These are common injection vectors in MCP servers.
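For servers written in Python, a text search can be tightened by walking the syntax tree, which avoids matching comments and strings. The sketch below flags eval() and exec() calls and any call passing shell=True. It is a partial check under stated assumptions: it covers Python source only, does not detect string-built SQL, and will miss aliased or dynamically resolved calls.

```python
import ast

DANGEROUS_NAMES = {"eval", "exec"}


def risky_calls(source: str) -> list[tuple[int, str]]:
    """Return (line number, finding) for risky calls in Python source."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        # Direct calls to eval(...) or exec(...).
        if isinstance(node.func, ast.Name) and node.func.id in DANGEROUS_NAMES:
            findings.append((node.lineno, f"{node.func.id}() call"))
        # Any call that passes the literal keyword argument shell=True.
        for kw in node.keywords:
            if (kw.arg == "shell"
                    and isinstance(kw.value, ast.Constant)
                    and kw.value.value is True):
                findings.append((node.lineno, "shell=True"))
    return sorted(findings)


sample = "\n".join([
    "import subprocess",
    "def run(cmd):",
    "    return subprocess.run(cmd, shell=True)",
    "def calc(expr):",
    "    return eval(expr)",
    "def safe(path):",
    "    return subprocess.run(['ls', '--', path])",
])

for line_no, finding in risky_calls(sample):
    print(f"line {line_no}: {finding}")
```

The last function in the sample passes arguments as a list without a shell, so it is not flagged; that is the pattern to look for in a well-built server.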

6. Update Cadence: Is the project maintained?

An unmaintained MCP server will not get security patches when vulnerabilities are discovered in its dependencies. The MCP spec is also evolving — stale servers may use deprecated patterns.

Red flag: No commits in 6+ months, unresolved security issues in the issue tracker, or a single-author project with no contribution activity.

What to check: Check the last commit date, open issue count, and release frequency on GitHub. A healthy project typically shows commits or releases within the last 30-60 days.
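The thresholds above translate into a simple classification. In this sketch the 60-day and 6-month figures come from this section, while the 180-day approximation of six months and the label names ("active", "slowing", "stale") are assumptions chosen for illustration.

```python
from datetime import date


def maintenance_status(last_commit: date, today: date) -> str:
    """Classify a project by the age of its most recent commit."""
    age_days = (today - last_commit).days
    if age_days <= 60:
        return "active"
    if age_days < 180:  # assumption: six months is roughly 180 days
        return "slowing"
    return "stale"


today = date(2026, 3, 29)
print(maintenance_status(date(2026, 3, 1), today))
print(maintenance_status(date(2025, 12, 1), today))
print(maintenance_status(date(2025, 6, 1), today))
```

Commit age is only one signal; pair it with a look at unresolved security issues before drawing conclusions.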

7. Community Trust Signals

While not a technical check, community signals help prioritize which servers to audit deeply.

Trust indicators: High GitHub stars, multiple contributors, official org backing (Anthropic, OpenAI, Vercel, etc.), presence on curated lists like awesome-mcp-servers.

Caution indicators: Brand-new repo with few stars, single contributor, no README or documentation, fork of a popular project with unexplained modifications.

How Our Security Grading Works

Every skill listed on Agent Skills Hub receives an automated security grade (A through F) based on a subset of these checks:

  • Grade A (90-100): No known vulnerabilities, minimal dependencies, proper secret handling, active maintenance.
  • Grade B (80-89): Minor issues found — perhaps a few outdated dependencies or missing input validation on non-critical paths.
  • Grade C (70-79): Moderate issues — some dependency CVEs, broad permission requests, or inconsistent secret handling.
  • Grade D (50-69): Significant concerns — known vulnerabilities, hardcoded secrets, or lack of maintenance.
  • Grade F (below 50): Critical security issues. Not recommended for production use.

You can see the security grade on every skill detail page. Use it as a starting point, then apply the full checklist above for servers you plan to use in production.
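The grade bands listed above map to a small lookup. This is a sketch of the banding only, not the grading system itself: how the numeric score is computed is not described here, and treating fractional scores with simple lower bounds is an assumption.

```python
def grade_for(score: float) -> str:
    """Map a 0-100 security score to a letter grade using the listed bands."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    if score >= 90:
        return "A"
    if score >= 80:
        return "B"
    if score >= 70:
        return "C"
    if score >= 50:
        return "D"
    return "F"


for score in (95, 84, 70, 50, 49):
    print(score, grade_for(score))
```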

Quick Decision Framework

When evaluating an MCP server, use this priority order:

  1. Official first: Prefer servers published by the service provider (e.g., Stripe's own MCP server over a third-party Stripe connector).
  2. Stars + activity: Among community servers, prefer those with more stars AND recent commits. Stars alone mean nothing if the project is abandoned.
  3. Minimal scope: Prefer servers that do one thing well over Swiss-army-knife servers that request every permission.
  4. Read the code: For any server that handles sensitive data, spend 10 minutes reading the main entry point. If you cannot understand what it does, do not install it.

Conclusion

The MCP ecosystem is powerful but young. Security standards are still forming, and not every published server meets production-grade requirements. By applying this checklist before installing, you protect your data, your API keys, and your users. Browse our skill directory to find security-graded MCP servers you can trust.

How to apply this guidance in real workflows

Security advice is only useful when it changes implementation behavior. After reading this article, convert the recommendations into a short operational checklist for your team. Start by identifying where the discussed risk appears in your stack today, then assign one owner for validation and one owner for rollout. Naming owners prevents the common drift in which findings are acknowledged but never implemented.

Next, classify actions by urgency. Immediate controls should block critical failure paths, such as unsafe command execution, secret leakage, or unreviewed external integrations. Secondary actions can improve observability, documentation quality, and long-term resilience. Separating urgent controls from structural improvements keeps momentum high while still building durable safeguards.

Teams adopting AI agent tooling often underestimate configuration risk. Even when a package is well maintained, local setup can introduce weak points through permissive environment variables, broad network access, or unclear update practices. Use this article as a trigger to review runtime boundaries: what the tool can read, what it can execute, and what data it can send externally.

A simple post-read implementation loop

  1. Capture the top three risks in plain language.
  2. Add one measurable control for each risk.
  3. Run a small pilot with logs enabled.
  4. Review outcomes after one week and adjust policy before broad rollout.

This loop keeps decisions evidence-based and avoids overreaction. It also creates a repeatable pattern that works across different tools and changing vendor landscapes.

Finally, document exceptions explicitly. If you accept a risk for business reasons, record the reason, mitigation, and review date. Transparent exception handling is a major trust signal for internal stakeholders and external auditors. It also improves future decision speed because teams can reference prior reasoning instead of reopening the same debate every release cycle.
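One lightweight way to record such an exception is a small structured record. The sketch below is a hypothetical shape, with field names taken from the three items named above (reason, mitigation, review date) plus an assumed server name; adapt it to whatever tracker your team already uses.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RiskException:
    """An accepted risk with its justification and a date to revisit it."""
    server: str        # hypothetical field: which MCP server this covers
    reason: str
    mitigation: str
    review_date: date

    def is_due(self, today: date) -> bool:
        """True once the review date has arrived or passed."""
        return today >= self.review_date


exception = RiskException(
    server="example-reporting-server",
    reason="Needed for quarterly reporting; no maintained alternative yet",
    mitigation="Runs in a container with no outbound network access",
    review_date=date(2026, 6, 30),
)

print(exception.is_due(date(2026, 3, 29)))
print(exception.is_due(date(2026, 7, 1)))
```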

If you run recurring retrospectives, archive lessons learned from each implementation cycle. A lightweight internal knowledge base turns individual fixes into team capability and steadily lowers incident frequency over time.
