Securing Enterprise AI Agents: Anthropic's Approach to Credential Safety
The Credential Challenge in AI Agent Deployments
Enterprises have been cautious about connecting AI agents to internal APIs and databases, and the barrier isn’t the models—it’s the credentials. In typical production setups, the agent carries authentication tokens as it makes tool calls. If the agent is compromised or behaves unexpectedly, those credentials go with it, opening the door to serious security breaches.

Why Existing Approaches Fall Short
Many current solutions place credentials directly in the agent’s context. Any weakness in the agent, whether from adversarial prompts, software bugs, or misconfiguration, can therefore expose sensitive keys. The industry has lacked a clean architectural separation between the agent’s decision-making loop and the execution of privileged actions, and that gap has left security teams uneasy about scaling AI agents across internal systems.
Anthropic's Dual Solution: Sandboxes and Tunnels
Anthropic is addressing this problem for Claude Managed Agents with two new capabilities: self-hosted sandboxes and MCP tunnels. Together, they move credential control out of the agent and to the network boundary, closing a major attack vector.
Self-Hosted Sandboxes: Keeping Execution Inside the Perimeter
Self-hosted sandboxes allow enterprises to run tool execution within their own infrastructure. The agentic loop (orchestration, context management, and error recovery) remains on Anthropic’s platform, but the actual tool calls happen inside the enterprise’s trusted environment. The agent never holds the keys; it only requests actions, and the sandbox decides whether and how to carry them out. Files, packages, and credentials stay within the enterprise’s control. For orchestration teams, this may also translate into better performance, because tool calls run on local compute next to the internal resources they touch, although the agent loop itself still communicates with Anthropic’s platform.
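The request-and-enforce pattern described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Anthropic's sandbox API: `ToolRequest`, `SandboxExecutor`, and the simulated `_call_internal_api` helper are hypothetical names, and the secrets dictionary stands in for whatever secret store the enterprise already uses.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToolRequest:
    """What the remote agent loop sends: a tool name and arguments, no secrets."""
    tool: str
    arguments: dict = field(default_factory=dict)


class SandboxExecutor:
    """Hypothetical executor running inside the enterprise perimeter."""

    def __init__(self, secrets: dict, allowed_tools: set):
        self._secrets = secrets        # stays inside the perimeter
        self._allowed = allowed_tools  # enforcement policy owned by the enterprise

    def execute(self, request: ToolRequest) -> dict:
        # Enforce the allowlist before touching any credential.
        if request.tool not in self._allowed:
            return {"ok": False, "error": f"tool '{request.tool}' is not allowed"}
        token = self._secrets.get(request.tool)
        if token is None:
            return {"ok": False, "error": "no credential configured for this tool"}
        return {"ok": True, "result": self._call_internal_api(request, token)}

    def _call_internal_api(self, request: ToolRequest, token: str) -> dict:
        # Stand-in for a real internal API call (assumption). The token is used
        # here and is never copied into the value returned to the agent.
        return {
            "tool": request.tool,
            "authenticated": bool(token),
            "argument_names": sorted(request.arguments),
        }


executor = SandboxExecutor(
    secrets={"crm_lookup": "s3cr3t-token"},
    allowed_tools={"crm_lookup"},
)
allowed = executor.execute(ToolRequest("crm_lookup", {"customer_id": "42"}))
denied = executor.execute(ToolRequest("drop_database"))
```

In this sketch the result handed back to the agent loop carries the outcome of the call but never the token, so a compromised agent has nothing to exfiltrate.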
MCP Tunnels: Connecting Without Exposing Keys
MCP tunnels provide a lightweight, outbound-only gateway that resides inside the organization’s network. When the agent needs to access a private MCP server, the tunnel establishes a secure connection without ever passing credentials through the agent’s context. Authentication is handled at the tunnel level, not inside the agent’s reasoning loop. This approach ensures that even if the agent is compromised, the attacker cannot extract the credentials—they remain locked within the network perimeter.
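As a rough sketch of what authenticating at the tunnel level could look like, the Python below attaches a credential to an outgoing request inside the network and scrubs it from anything sent back. The function names, the bearer-token scheme, and the `vault` dictionary are all assumptions made for illustration; the article does not describe how the actual tunnel is implemented.

```python
def inject_credentials(agent_headers: dict, server: str, vault: dict) -> dict:
    """Build the headers the tunnel sends to a private MCP server.

    Any Authorization header supplied by the agent is dropped, and the
    credential held by the tunnel is attached instead (hypothetical scheme).
    """
    headers = {
        name: value
        for name, value in agent_headers.items()
        if name.lower() != "authorization"
    }
    headers["Authorization"] = f"Bearer {vault[server]}"
    return headers


def scrub_response(body: str, vault: dict) -> str:
    """Redact any stored secret that a server echoes back, before the
    response leaves the perimeter."""
    for secret in vault.values():
        body = body.replace(secret, "[REDACTED]")
    return body


vault = {"internal-mcp": "tok-123"}  # lives only inside the network
outgoing = inject_credentials(
    {"Authorization": "Bearer forged", "Accept": "application/json"},
    "internal-mcp",
    vault,
)
returned = scrub_response("debug: token was tok-123", vault)
```

The design choice being illustrated is that the agent's context only ever sees the scrubbed response, while the real token exists solely on the tunnel side of the connection.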
Architectural Separation: A Key Distinction
Anthropic draws a critical architectural line: the agent’s cognitive loop (decision-making) runs on Anthropic’s infrastructure, while tool execution runs on the enterprise’s systems. This separation is deeper than typical sandbox approaches, which often keep both the agent and its execution together. By splitting these layers, enterprises can more precisely map agent workflows to security zones.
Comparison with Other Providers
OpenAI introduced local execution for its Agents SDK in April, responding to similar enterprise demands. Anthropic’s approach differs in that it keeps the agent loop on its own platform while delegating execution to enterprise-controlled sandboxes. This hybrid model gives teams the benefits of managed orchestration without sacrificing credential security. Combined, self-hosted sandboxes and MCP tunnels offer a flexible, defense-in-depth strategy that aligns with zero-trust principles.
Practical Steps for Orchestration Teams
For teams already using Claude Managed Agents, the recommended starting point is the sandbox feature. Move tool execution to your own infrastructure and verify the boundary enforcement before tackling MCP tunnels, which remain in research preview. New teams evaluating the platform should treat the sandbox as a foundation: configure it to restrict resource access and monitor logs for any anomalous tool calls. Once comfortable, integrate MCP tunnels for private server connectivity.
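Monitoring logs for anomalous tool calls can start very simply. The sketch below assumes a hypothetical log format (a list of dictionaries with a `tool` key) and flags two conditions: calls to tools outside an allowlist, and call volume above a per-tool threshold. Real sandbox logs will look different, so treat the field names and thresholds as placeholders.

```python
from collections import Counter


def flag_anomalous_calls(log: list, allowed_tools: set, max_calls_per_tool: int) -> list:
    """Return (entry, reason) pairs for tool calls that deserve a closer look."""
    flagged = []
    counts = Counter()
    for entry in log:
        tool = entry["tool"]
        counts[tool] += 1
        if tool not in allowed_tools:
            flagged.append((entry, "tool not on allowlist"))
        elif counts[tool] > max_calls_per_tool:
            flagged.append((entry, "call volume above threshold"))
    return flagged


sample_log = [
    {"tool": "crm_lookup"},
    {"tool": "crm_lookup"},
    {"tool": "crm_lookup"},
    {"tool": "shell_exec"},
]
flagged = flag_anomalous_calls(sample_log, {"crm_lookup"}, max_calls_per_tool=2)
reasons = [reason for _, reason in flagged]
```

A check like this could run on the sandbox side before MCP tunnels are introduced, giving teams a baseline of normal tool usage to compare against later.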
Orchestration teams gain more than just security from this architecture—they get finer control over how agents operate. The separation of concerns means sandboxes determine where tool execution occurs and what resources are available, while MCP tunnels decide how agents reach internal systems. By decoupling these, enterprises can tailor agent behavior to different regulatory and operational requirements across departments.
Availability and Next Steps
Self-hosted sandboxes are currently available in public beta for Claude Managed Agents users. MCP tunnels are in research preview, with wider rollout expected based on feedback. Anthropic encourages enterprises to experiment with sandboxes first, as the tooling and documentation are more mature. As the security architecture around AI agents continues to evolve, this credential-safe approach sets a new baseline for responsible enterprise deployment.