Cloud-Native AI Security for MCP Servers
Staying ahead of threats: Advanced techniques for securing Model Context Protocol servers and beyond.
Things change quickly these days, in technology and its hype topics even more than in life in general. One of the latest developments is the Model Context Protocol (MCP), introduced by Anthropic. What was clearly conceived as a way to connect desktop LLM clients with additional data sources quickly turned into the generic USB-C connector of AI-infused applications. And from a technology perspective, this is totally understandable. While OpenAI-"compatible" APIs have become the standard way to communicate with Large Language Models, the desire for something more "controllable" and standardized for including external data basically lies at the heart of every software architect and developer.
With this desire in mind, I can already envision the number of MCP server endpoints exploding across various implementations and companies. While my last post looked a little at the opportunities, it also outlined some of the weaknesses. This follow-up will look exclusively at the security considerations around MCP servers.
And I do not care whether your implementation leverages the speed and efficiency of a Quarkus-based Java architecture or another, less cloud-native framework; the challenge remains the same: constructing a secure and scalable system that can carry enterprise-grade, sensitive data. One that ensures the confidentiality of sensitive information, guards against malicious prompt injection attacks, and guarantees the operational resilience of your AI-driven services.
This article draws upon practical experiences and established security principles to present a comprehensive, multi-layered security strategy for MCP server endpoints. I will explore a range of protective measures, starting from application-level safeguards like authentication using OpenID Connect (OIDC), extending to network-level defenses such as network segmentation and hardware-based protections like Web Application Firewalls (WAFs) and Hardware Security Modules (HSMs). Furthermore, I will emphasize the critical role of integrating secure coding practices throughout your microservices ecosystem. And please make sure to understand this as an incomplete list of things. It is a start with the basics and some thoughts on how to get closer to your goals while the official SDK does not specify anything around security at all.
A warning: This article contains some conceptual source code. It is mostly presented to help you understand certain aspects and keep you awake while reading. Please NEVER use any of this in production. NONE of it would pass a decent security review anyway.
When you are done reading this article and you feel like you need to also look at training and other aspects of Large Language Models and their security, I can warmly recommend the OWASP Top 10 lists. They also have one for LLM applications.
The OWASP Top 10 for Large Language Model Applications outlines the most significant security risks associated with building and deploying applications that leverage large language models. These risks range from prompt injection attacks, where malicious inputs manipulate the model's behavior, to insecure output handling that can lead to vulnerabilities like cross-site scripting. The list also highlights concerns like training data poisoning, model denial of service, supply chain vulnerabilities, excessive agency of the model, data leakage, overreliance on AI outputs, insecure plugins, and unbounded consumption of resources. This framework aims to educate developers and security professionals on these emerging threats and provide guidance on mitigation strategies to secure AI-powered applications.
Understanding the Security Challenges for MCP Servers
MCP servers operate in an environment with specific security concerns that demand careful consideration. If they provide access to company-specific information, they will most likely handle confidential data. So, with all the excitement of working on something cool and new, please keep the basic principle in mind:
The protection of this data from unauthorized access or disclosure is a fundamental security requirement for everything you expose.
Some measures you will have to take a look at include:
Mitigate Prompt Manipulation: A significant risk arises from the potential for users to craft malicious inputs aimed at manipulating the behavior of the underlying AI model. Robust defenses must be in place to sanitize user inputs and prevent these prompt injection attacks from compromising the system.
Protecting Critical Operational Components: MCP servers often serve as the central nervous system for AI-powered applications. Any security breach targeting these servers can have far-reaching consequences, potentially disrupting both data integrity and overall system operations.
Securing Interconnected Systems: MCP servers might not work in isolation for long. To infuse more use-case-driven features into LLMs, they may as well interact with a multitude of other microservices and external resources to aggregate information. This interconnectedness necessitates the implementation of strong security measures to govern inter-service communication and prevent lateral movement of threats within your infrastructure.
Given these challenges, securing MCP servers requires more than just ensuring encrypted communication channels. It demands a holistic and thorough approach with robust authentication and authorization mechanisms, coupled with proactive threat mitigation strategies.
Foundational Security Principles
Establishing a strong security posture for MCP servers rests upon several core principles:
Secure Communication and Coding Practices
End-to-End Encryption with TLS: It is crucial to encrypt all communication to and from your MCP servers using Transport Layer Security (TLS). This measure acts as a shield against eavesdropping and prevents attackers from intercepting sensitive data or conducting man-in-the-middle attacks.
Verifying Identities with Mutual TLS (mTLS): For internal communication between your various microservices, implementing Mutual TLS (mTLS) adds an extra layer of security. This protocol ensures that both the client and the server involved in the communication can verify each other's identities, strengthening trust within your system.
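In Quarkus, requiring client certificates for mTLS is mostly a matter of HTTP server configuration. A minimal sketch follows; the key-store and trust-store file names and passwords are placeholders you would replace with your own material:

```properties
# Serve TLS with the server certificate from this key store (placeholder paths)
quarkus.http.ssl.certificate.key-store-file=server-keystore.jks
quarkus.http.ssl.certificate.key-store-password=changeit

# Require a client certificate and verify it against this trust store
quarkus.http.ssl.client-auth=required
quarkus.http.ssl.certificate.trust-store-file=client-truststore.jks
quarkus.http.ssl.certificate.trust-store-password=changeit

# Refuse plain HTTP entirely
quarkus.http.insecure-requests=disabled
```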
Thorough Input Validation and Sanitization: A cornerstone of secure coding is the rigorous validation and sanitization of all incoming data. This practice is essential to prevent various forms of injection attacks, including sophisticated prompt injection attempts. By carefully inspecting and cleaning user inputs, you can ensure that malicious code or commands are neutralized before they can cause harm.
Robust Error Handling: Implementing comprehensive exception handling mechanisms is vital. These mechanisms should be designed to prevent the leakage of sensitive system details in the event of an error, thereby avoiding the exposure of potentially exploitable information to attackers.
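The error-handling point can be sketched in plain Java: internal details go to the server log under a correlation id, while the caller only ever sees a generic message. Class and method names here are illustrative, not a fixed API:

```java
import java.util.UUID;
import java.util.logging.Logger;

// Conceptual sketch: map internal exceptions to a generic client-facing
// message plus a correlation id, so stack traces, SQL errors, or connection
// strings never leak to callers.
public class SafeErrorMapper {

    private static final Logger LOG = Logger.getLogger(SafeErrorMapper.class.getName());

    public static String toClientMessage(Exception e) {
        String correlationId = UUID.randomUUID().toString();
        // The full detail stays server-side, keyed by the correlation id
        LOG.severe("error " + correlationId + ": " + e);
        return "Request failed. Reference: " + correlationId;
    }
}
```

A support engineer can later look up the correlation id in the logs without the attacker ever having seen the underlying cause.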
Zero Trust and Least Privilege
Adopting a Zero Trust Architecture: In a Zero Trust model, no user or service is inherently trusted, regardless of their network location. Every request for access must be authenticated and authorized before being granted. This principle significantly reduces the attack surface and limits the potential damage from compromised accounts.
Applying the Principle of Least Privilege: This fundamental security principle dictates that users and services should only be granted the minimum level of access rights necessary to perform their intended functions. By limiting privileges, you minimize the potential impact of a security breach, as a compromised entity will have restricted capabilities.
Authentication and Authorization with OIDC
OpenID Connect (OIDC) has emerged as a widely adopted standard for identity verification, building upon the OAuth 2.0 authorization framework. Its token-based approach, often utilizing JSON Web Tokens (JWTs), makes it an ideal solution for securing MCP server endpoints in a stateless and scalable manner.
Implementing OIDC in a Quarkus MCP Server
Quarkus, with its focus on developer productivity and native cloud capabilities, provides seamless integration with OIDC through its extensions. Here’s an illustrative example of how you might secure an MCP resource using OIDC in Quarkus:
import io.quarkus.security.Authenticated;
import io.quarkus.security.identity.SecurityIdentity;
import jakarta.annotation.security.RolesAllowed;
import jakarta.inject.Inject;
import jakarta.ws.rs.POST;
import jakarta.ws.rs.Path;
import jakarta.ws.rs.core.Response;

@Path("/mcp")
@Authenticated
public class MCPResource {

    @Inject
    SecurityIdentity securityIdentity;

    @POST
    @Path("/context")
    @RolesAllowed({"ai-engineer", "data-scientist"})
    public Response processContext(ContextRequest request) {
        // Retrieve identity information from the verified token
        String username = securityIdentity.getPrincipal().getName();
        // Process the contextual data securely
        return Response.ok().build();
    }
}
To configure your OIDC provider and define role-based access control policies within your Quarkus application, you would typically modify your application.properties file:
# OIDC Configuration
quarkus.oidc.auth-server-url=https://your-auth-server/auth/realms/your-realm
quarkus.oidc.client-id=your-client-id
quarkus.oidc.credentials.secret=your-client-secret
# Enable policy enforcement
quarkus.oidc.policy-enforcement.enabled=true
# Define allowed roles
quarkus.security.roles.allowed=ai-engineer,data-scientist,admin
Essential Practices for OIDC
Employing Short-Lived Tokens: To minimize the window of opportunity for misuse, it's recommended to use OIDC tokens with relatively short expiration times.
Rigorously Validating Tokens: Upon receiving an OIDC token, your MCP server must thoroughly verify its signature, expiration date, and the claims it contains to ensure its authenticity and integrity.
Designing Roles Based on Need: Carefully define roles that accurately reflect the operational responsibilities within your organization. This granular approach ensures that users only have access to the resources and functionalities necessary for their roles.
Securely Storing Credentials: Protecting the client secrets used for OIDC communication is paramount. Employ secure storage mechanisms to prevent unauthorized access to these sensitive credentials.
Implementing Refresh Token Rotation: Regularly rotating refresh tokens, which are used to obtain new access tokens, can further limit the potential for misuse if a token is compromised.
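To make the token-validation point concrete, here is a minimal, deliberately incomplete sketch of checking a JWT's exp claim in plain Java. It assumes the signature has already been verified by your OIDC library (never skip that step); it only illustrates why short-lived tokens shrink the misuse window:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Conceptual sketch: inspect the exp claim of an ALREADY signature-verified
// JWT. A real server lets its OIDC/JWT library do all of this.
public class TokenExpiryCheck {

    private static final Pattern EXP = Pattern.compile("\"exp\"\\s*:\\s*(\\d+)");

    public static boolean isExpired(String jwt, long nowEpochSeconds) {
        String[] parts = jwt.split("\\.");
        if (parts.length < 2) return true; // malformed: reject
        String payload = new String(
                Base64.getUrlDecoder().decode(parts[1]), StandardCharsets.UTF_8);
        Matcher m = EXP.matcher(payload);
        if (!m.find()) return true; // no exp claim: reject
        return Long.parseLong(m.group(1)) <= nowEpochSeconds;
    }
}
```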
What absolutely presents a challenge here is that the official MCP SDK does not define any security measures for clients or servers. This means you will have to implement them on both sides yourself.
Network Security and Infrastructure Protections
Beyond securing the application layer, safeguarding your MCP servers at the network level is equally critical.
Web Application Firewalls (WAFs)
Web Application Firewalls (WAFs) act as sentinels, deployed as dedicated hardware or virtual appliances, to monitor, filter, and block potentially malicious HTTP/HTTPS traffic before it can reach your servers. These are the BIG GUNS when it comes to security. Depending on your organization, you will either get a ton of support implementing them or end up cursing them when you have to use them.
They offer several advantages:
Customized AI-Specific Rules: WAFs can be configured with rule sets specifically designed to detect and block certain user interaction patterns. Depending on the transport layer they are looking at, it might even be possible to detect certain prompt injection attacks. But given the different types of prompts (user, system, etc.), it might be a little too fickle to try gluing this into rules.
Rate Limiting Capabilities: To defend against Denial-of-Service (DoS) attacks, WAFs can implement rate limiting, restricting the number of requests from a single source within a given timeframe.
Identifying Unusual Activity: WAFs often incorporate anomaly detection capabilities, allowing them to identify and flag unusual patterns in network traffic that might indicate a security threat.
Offloading Encryption Tasks: WAFs can handle TLS termination, decrypting incoming traffic for inspection and then re-encrypting it before forwarding it to your servers. This offloading can reduce the processing load on your MCP server endpoint processes.
Some very simple rate limiting could, for example, be implemented in Java with Quarkus:
import java.io.IOException;

import jakarta.inject.Inject;
import jakarta.ws.rs.container.ContainerRequestContext;
import jakarta.ws.rs.container.ContainerRequestFilter;
import jakarta.ws.rs.core.Response;
import jakarta.ws.rs.ext.Provider;

@Provider
public class RateLimitFilter implements ContainerRequestFilter {

    @Inject
    RateLimiter rateLimiter;

    @Override
    public void filter(ContainerRequestContext requestContext) throws IOException {
        // Only trust X-Forwarded-For when a proxy you control sets it;
        // it may hold a comma-separated chain with the client IP first
        String clientIp = requestContext.getHeaderString("X-Forwarded-For");
        if (clientIp == null) {
            clientIp = requestContext.getUriInfo().getRequestUri().getHost();
        }
        if (!rateLimiter.allowRequest(clientIp)) {
            requestContext.abortWith(
                    Response.status(Response.Status.TOO_MANY_REQUESTS)
                            .entity("Rate limit exceeded")
                            .build());
        }
    }
}
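The RateLimiter injected above is left undefined in the snippet. A minimal in-memory, fixed-window sketch could look like the following; the limits are illustrative, and a clustered deployment would need a shared store (e.g. Redis) instead of a per-instance map:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Conceptual fixed-window rate limiter: each client key gets a counter
// that resets when its time window expires.
public class RateLimiter {

    private final int maxRequestsPerWindow;
    private final long windowMillis;
    private final Map<String, Window> windows = new ConcurrentHashMap<>();

    public RateLimiter(int maxRequestsPerWindow, long windowMillis) {
        this.maxRequestsPerWindow = maxRequestsPerWindow;
        this.windowMillis = windowMillis;
    }

    public boolean allowRequest(String clientKey) {
        long now = System.currentTimeMillis();
        // Start a fresh window if none exists or the old one has elapsed
        Window w = windows.compute(clientKey, (k, old) ->
                (old == null || now - old.start >= windowMillis) ? new Window(now) : old);
        return w.count.incrementAndGet() <= maxRequestsPerWindow;
    }

    private static final class Window {
        final long start;
        final AtomicInteger count = new AtomicInteger();
        Window(long start) { this.start = start; }
    }
}
```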
Network Segmentation and Service Meshes
Another, more physical aspect of security is network segmentation through different mechanisms. While it is always a very good idea to segment your networks into different protective zones and to enforce strict access controls between them, doing so for MCP server, LLM model, AND application deployments is even more important. I see way too many people just spinning up model containers next to their applications and putting everything on a VPS. Don't do that. A Kubernetes NetworkPolicy, for example, can restrict which pods may talk to the MCP server and which services it may reach:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: mcp-server-network-policy
spec:
  podSelector:
    matchLabels:
      app: mcp-server
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: api-gateway
      ports:
        - protocol: TCP
          port: 8080
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: model-service
      ports:
        - protocol: TCP
          port: 8080
Service mesh technologies like Istio or Linkerd provide an additional layer of security for microservices. They can automatically enforce Mutual TLS (mTLS) for inter-service communication and allow for the definition of fine-grained traffic policies, further strengthening the security posture of your MCP server within your microservices architecture.
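Conceptually, enforcing strict mTLS for the MCP server pods with Istio takes little more than a PeerAuthentication resource; the namespace and labels below are illustrative and must match your own deployment:

```yaml
apiVersion: security.istio.io/v1
kind: PeerAuthentication
metadata:
  name: mcp-server-mtls
  namespace: mcp          # illustrative namespace
spec:
  selector:
    matchLabels:
      app: mcp-server     # must match your pod labels
  mtls:
    mode: STRICT          # reject any non-mTLS traffic to these pods
```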
Application-Level Security Enhancements
Beyond basic authentication and authorization, several application-level security measures can significantly bolster the defenses of your MCP servers.
Input Validation and Sanitization
Implementing robust input validation logic is paramount in preventing prompt injection and other forms of malicious input. Consider the following example of a ContextValidator in Java:
import java.util.Locale;

import jakarta.enterprise.context.ApplicationScoped;

@ApplicationScoped
public class ContextValidator {

    public void validateContext(ContextRequest request) {
        if (containsInjectionPatterns(request.getPrompt())) {
            throw new InvalidRequestException("Potential prompt injection detected");
        }
        if (request.getMaxTokens() > 4096) {
            throw new InvalidRequestException("Maximum token limit exceeded");
        }
        // Additional validations can be added here
    }

    private boolean containsInjectionPatterns(String prompt) {
        // Naive deny-list check; compare case-insensitively so trivial
        // variations like "Ignore Previous Instructions" are caught too
        String normalized = prompt.toLowerCase(Locale.ROOT);
        return normalized.contains("ignore previous instructions") ||
               normalized.contains("system:");
    }
}
Secure Storage of Context Data
Protecting the context data stored and processed by your MCP servers is crucial:
Encryption at Rest: Ensure that all persistent storage volumes used by your MCP servers are encrypted to protect data when it is not being actively accessed.
Data Minimization: Only retain the context data that is absolutely necessary for the functioning of your AI models. Reducing the amount of stored data minimizes the potential impact of a data breach.
Automatic Data Removal: Implement policies and mechanisms for automatically purging outdated or no longer needed context data on a regular basis.
Here's a conceptual (!!) example of a SecureContextStorage service that handles encryption:
import java.time.Duration;

import jakarta.enterprise.context.ApplicationScoped;
import jakarta.inject.Inject;

@ApplicationScoped
public class SecureContextStorage {

    @Inject
    EncryptionService encryptionService;

    @Inject
    ContextCache cache; // conceptual TTL-aware cache, referenced below

    public void storeContext(String contextId, String contextData, Duration ttl) {
        String encryptedData = encryptionService.encrypt(contextData);
        cache.put(contextId, encryptedData, ttl);
    }

    public String retrieveContext(String contextId) {
        String encryptedData = cache.get(contextId);
        return encryptedData == null ? null : encryptionService.decrypt(encryptedData);
    }
}
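The cache used by the storage service is itself conceptual. A minimal TTL-aware sketch, combining expiry-on-read with a periodic sweep to implement the automatic data removal policy, might look like this; in production you would more likely reach for an existing cache (e.g. Caffeine) with built-in TTL support:

```java
import java.time.Duration;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Conceptual TTL cache: entries carry an absolute expiry time and are
// dropped both on read and by a periodic sweep.
public class TtlCache {

    private static final class Entry {
        final String value;
        final long expiresAtMillis;
        Entry(String value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();

    public void put(String key, String value, Duration ttl) {
        entries.put(key, new Entry(value, System.currentTimeMillis() + ttl.toMillis()));
    }

    public String get(String key) {
        Entry e = entries.get(key);
        if (e == null) return null;
        if (e.expiresAtMillis <= System.currentTimeMillis()) {
            entries.remove(key); // expired: purge on read
            return null;
        }
        return e.value;
    }

    // Call periodically (e.g. from a scheduled task) to purge expired data
    public void sweep() {
        long now = System.currentTimeMillis();
        entries.values().removeIf(e -> e.expiresAtMillis <= now);
    }
}
```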
Kubernetes-Native Security
When deploying your MCP servers within a Kubernetes environment, leverage the platform's built-in security features:
Pod Security Contexts: Configure Pod Security Contexts to enforce security policies at the pod and container level. This includes running containers as non-root users, restricting capabilities, and making the root filesystem read-only.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mcp-server
spec:
  template:
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
      containers:
        - name: mcp-server
          securityContext:
            # these two are container-level fields in the Pod API
            readOnlyRootFilesystem: true
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - ALL
Role-Based Access Control (RBAC) and Service Accounts: Apply the principle of least privilege to all service accounts and user roles within your Kubernetes cluster, granting only the necessary permissions.
Secrets Management: Utilize Kubernetes Secrets or dedicated secrets management solutions like HashiCorp Vault to securely store and manage sensitive information such as API keys and passwords.
Container Image Scanning: Integrate automated security scanning into your CI/CD pipeline to identify and address vulnerabilities in your container images before they are deployed.
Secure Integration with LangChain4j
If your MCP server integrates with LangChain4j, a framework for building language model applications, it's crucial to consider additional security measures. Ensure that any tokens used for authentication with LangChain4j are properly validated and that security filters are applied to prompts processed by the AI model (again a conceptual code snippet!):
import java.util.Map;

import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.model.input.Prompt;
import dev.langchain4j.model.input.PromptTemplate;
import jakarta.enterprise.context.ApplicationScoped;
import jakarta.inject.Inject;

@ApplicationScoped
public class SecureLangChainService {

    @Inject
    TokenValidator tokenValidator;

    @Inject
    ChatLanguageModel chatModel;

    public String processWithLangChain(String input, String authToken) {
        if (!tokenValidator.isValid(authToken)) {
            throw new SecurityException("Invalid token");
        }
        // Named template variables are filled via a map; running your own
        // security filters over the rendered prompt is a conceptual step here
        Prompt prompt = PromptTemplate.from("{{input}}").apply(Map.of("input", input));
        return chatModel.generate(prompt.text());
    }
}
Logging, Monitoring, and Audit Trails
A robust security strategy fundamentally relies on the ability to understand what is happening within the system at all times, and this is where comprehensive logging, monitoring, and detailed audit trails come into play. Logging provides the raw, granular data – a historical record of events, transactions, errors, and access attempts across all components of the system. Without this foundation, any attempt to understand security incidents or identify potential threats becomes an endless search for a needle in a haystack.
Monitoring takes this raw data and actively analyzes it, often in real-time, to identify patterns, anomalies, and deviations from normal behavior. This proactive approach allows security teams to detect and respond to suspicious activities as they occur, potentially preventing breaches or minimizing their impact. For instance, an unusual spike in login attempts from a specific IP address or unauthorized access to sensitive files would trigger alerts through monitoring systems, enabling immediate investigation.
Finally, detailed audit trails provide an immutable record of specific actions performed by users and the system itself. This includes who accessed what data, when changes were made, and what configurations were altered. In the aftermath of a security incident, audit trails are invaluable for forensic analysis, allowing investigators to reconstruct the sequence of events, determine the scope of the compromise, and identify the responsible parties. Moreover, these records are often crucial for meeting compliance requirements and demonstrating due diligence in protecting sensitive information. In essence, logging provides the necessary information, monitoring offers real-time awareness and alerting, and audit trails ensure accountability and facilitate post-incident analysis, all working in concert to create a strong and resilient security posture.
Structured Logging: Implement structured logging, preferably using a format like JSON, to facilitate easier parsing and analysis of log data.
Sensitive Data Filtering in Logs: Take care to ensure that sensitive information is not inadvertently captured in your application logs.
Centralized Log Management: Aggregate logs from all your MCP server instances into a centralized system. This allows for real-time threat detection and facilitates security investigations.
Detailed Audit Trails: Maintain thorough records of user actions and system events. These audit trails are invaluable for forensic analysis in the event of a security incident.
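The sensitive-data-filtering point can be sketched as a small redaction helper that scrubs well-known secret-bearing fields before a log line is emitted. The field list and class name are illustrative; extend them for your own payload shapes:

```java
import java.util.regex.Pattern;

// Conceptual sketch: redact values of common secret-bearing fields
// (password, secret, token, authorization) in both key=value and
// JSON-style "key": "value" log lines.
public class LogRedactor {

    private static final Pattern SENSITIVE = Pattern.compile(
            "(?i)(\"?(?:password|secret|token|authorization)\"?\\s*[:=]\\s*)(\"[^\"]*\"|\\S+)");

    public static String redact(String logLine) {
        return SENSITIVE.matcher(logLine).replaceAll("$1***");
    }
}
```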
Here's a conceptual(!!) example of a secure logging interceptor in Java:
import java.util.Map;

import jakarta.inject.Inject;
import jakarta.interceptor.AroundInvoke;
import jakarta.interceptor.Interceptor;
import jakarta.interceptor.InvocationContext;
import jakarta.json.bind.JsonbBuilder;
import org.jboss.logging.Logger;

@Interceptor
public class SecureLoggingInterceptor {

    @Inject
    Logger logger;

    @AroundInvoke
    public Object logSecurely(InvocationContext context) throws Exception {
        // Emit a structured (JSON) log entry; parameters are sanitized first
        logger.info(JsonbBuilder.create().toJson(Map.of(
                "method", context.getMethod().getName(),
                "timestamp", System.currentTimeMillis(),
                "user", getCurrentUser(),
                "parameters", sanitizeParameters(context.getParameters())
        )));
        try {
            return context.proceed();
        } catch (Exception e) {
            logger.error("Operation failed", e);
            throw e;
        }
    }

    private Object sanitizeParameters(Object[] parameters) {
        // Implement sanitization logic as needed
        return parameters;
    }
}
And another conceptual example of an AuditService. And yes, you should probably find better ways of implementing this with less performance impact, for example a message-driven approach. Just checking if you are still with me ;)
import java.time.LocalDateTime;

import jakarta.enterprise.context.ApplicationScoped;
import jakarta.inject.Inject;
import jakarta.persistence.EntityManager;
import jakarta.transaction.Transactional;

@ApplicationScoped
public class AuditService {

    @Inject
    EntityManager em;

    @Transactional
    public void logAccess(String userId, String operation, String resourceId, boolean succeeded) {
        AuditRecord record = new AuditRecord();
        record.setUserId(userId);
        record.setOperation(operation);
        record.setResourceId(resourceId);
        record.setSucceeded(succeeded);
        record.setTimestamp(LocalDateTime.now());
        record.setIpAddress(getClientIp());
        em.persist(record);
    }
}
Hardware Security Modules (HSMs) and Advanced Protections
For organizations with stringent security requirements, incorporating hardware-based security solutions can provide an additional layer of protection. These beasts are not only expensive but something you will only touch when working on applications in really high-security areas. Still, it is good to know that you have them in your pocket if you need them.
Web Application Firewalls (WAFs): As previously discussed, WAFs play a crucial role in filtering and monitoring network traffic at the perimeter.
Hardware Security Modules (HSMs): HSMs are dedicated hardware devices designed to securely manage cryptographic keys and perform cryptographic operations. They offer a high level of protection for sensitive keys.
Here's a last conceptual example of how you might implement an HSM-based encryption service:
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.Key;
import java.util.Base64;
import javax.crypto.Cipher;

import jakarta.enterprise.context.ApplicationScoped;
import jakarta.inject.Inject;
import jakarta.inject.Named;

@ApplicationScoped
public class HSMBasedEncryption {

    @Inject
    @Named("hsm-provider")
    CryptoProvider hsmProvider;

    public String encrypt(String data, String keyIdentifier) {
        try {
            Key key = hsmProvider.getKey(keyIdentifier);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding", hsmProvider);
            cipher.init(Cipher.ENCRYPT_MODE, key);
            byte[] iv = cipher.getIV();
            byte[] encrypted = cipher.doFinal(data.getBytes(StandardCharsets.UTF_8));
            // Prepend the IV so the decrypting side can recover it
            ByteBuffer buffer = ByteBuffer.allocate(iv.length + encrypted.length);
            buffer.put(iv);
            buffer.put(encrypted);
            return Base64.getEncoder().encodeToString(buffer.array());
        } catch (Exception e) {
            throw new SecurityException("Encryption failed", e);
        }
    }
}
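To show what the decrypting counterpart of that IV-plus-ciphertext layout looks like, here is a self-contained sketch using the plain JCE provider as a software stand-in for the HSM (the class name and the 12-byte GCM IV convention are assumptions of this sketch, not part of the article's HSM service):

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Conceptual AES-GCM round trip: encrypt() prepends the randomly generated
// 12-byte IV, decrypt() splits it off again before authenticating.
public class GcmRoundTrip {

    public static String encrypt(SecretKey key, String plaintext) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key); // JCE picks a random IV
            byte[] iv = cipher.getIV();
            byte[] encrypted = cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
            ByteBuffer buffer = ByteBuffer.allocate(iv.length + encrypted.length);
            buffer.put(iv).put(encrypted);
            return Base64.getEncoder().encodeToString(buffer.array());
        } catch (Exception e) {
            throw new SecurityException("Encryption failed", e);
        }
    }

    public static String decrypt(SecretKey key, String blob) {
        try {
            byte[] raw = Base64.getDecoder().decode(blob);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            // First 12 bytes are the IV written by encrypt(); 128-bit auth tag
            cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, raw, 0, 12));
            byte[] plain = cipher.doFinal(raw, 12, raw.length - 12);
            return new String(plain, StandardCharsets.UTF_8);
        } catch (Exception e) {
            throw new SecurityException("Decryption failed", e);
        }
    }
}
```

With a real HSM, only the Cipher provider changes; the framing of IV and ciphertext stays the same.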
Final thoughts
Securing Model Context Protocol servers within an AI-driven enterprise demands a comprehensive security strategy that integrates robust authentication and authorization mechanisms, fortified network configurations, and stringent application-level validations.
You will have to look at all the layers of your final applications to achieve a stringent security posture, and I hope this article helped you understand a little more about this. And I also hope it prevents one or two experiments quickly hacked together in Python from entering production in an insecure way.