Claude Managed Agents: The Complete Developer Guide (2026)

By AI Workflows Team · April 10, 2026 · 13 min read

TL;DR: Claude Managed Agents (public beta, April 8, 2026) is Anthropic's fully-managed infrastructure for running autonomous AI agents. Instead of building your own sandbox, agent loop, and session state, you define an Agent config, spin up an Environment, and start Sessions — Claude handles the rest. Pricing: $0.08/session-hour + standard token costs. Best for long-running, production agent workloads.


What Is Claude Managed Agents?

On April 8, 2026, Anthropic launched something that isn't a new model — it's a new way to run Claude.

Claude Managed Agents is a hosted agent infrastructure platform. You define your agent's configuration (model, system prompt, tools, MCP servers), set up a cloud environment pre-loaded with whatever packages you need, and start sessions. Claude runs autonomously inside a sandboxed container — reading files, executing code, browsing the web, completing multi-step tasks — while your application streams results in real time.

The operative word is managed. You don't write the agent loop. You don't provision sandboxes. You don't wire up error recovery, session checkpointing, or credential vaulting. Anthropic handles all of that. Your job is to define what the agent should do and feed it tasks.

According to Anthropic's internal benchmarks, structured file generation tasks showed a 10-point improvement in task success rates compared to standard Claude API prompting. Five named enterprise customers — Notion, Rakuten, Asana, Vibecode, and Sentry — shipped production workloads before the public beta even launched.


Claude Agent SDK vs. Claude Managed Agents: Which Do You Need?

This is the question developers ask first, and almost no existing article answers it directly. Anthropic offers two distinct ways to run Claude as an autonomous agent:

|                | Claude Agent SDK (local)             | Claude Managed Agents (hosted)      |
|----------------|--------------------------------------|-------------------------------------|
| What it is     | Library to build your own agent loop | Fully managed hosted infrastructure |
| Runs where     | Your own server / machine            | Anthropic's cloud containers        |
| Infrastructure | You build and maintain               | Anthropic provides and maintains    |
| Agent loop     | You write it                         | Pre-built and optimized             |
| Sandboxing     | You implement or choose              | Built-in, secure by default         |
| Session state  | You manage                           | Persisted server-side               |
| Multi-model    | Yes — Claude, GPT, local models      | Claude only                         |
| Pricing        | Token cost only                      | $0.08/session-hour + tokens         |
| Vendor lock-in | Low                                  | High (Anthropic infra only)         |
| Best for       | Custom control, multi-model routing  | Long-running tasks, minimal infra   |

Use the Claude Agent SDK when:

  • You need to route between multiple AI providers
  • You want fine-grained control over the agent loop (the SDK is the same engine powering Claude Code)
  • You have existing infrastructure and just need the intelligence layer
  • Self-hosted or on-prem deployment is a requirement

Use Claude Managed Agents when:

  • You want to ship an agent-powered feature in days, not months
  • Your tasks run for minutes to hours and require persistent state
  • You don't want to own sandbox infrastructure or error-recovery logic
  • You're building on Anthropic's platform and prioritize speed over portability

How It Works: The Architecture

The Anthropic engineering team published a detailed breakdown of the design philosophy behind Managed Agents. Understanding the architecture helps you use the platform more effectively.

Three Decoupled Components

Every Managed Agents session is composed of three independently managed pieces:

[Diagram: Claude Managed Agents decoupled architecture — Session, Harness, and Sandbox as separate components]

  • Session — An append-only event log. The source of truth for everything that has happened in a task.
  • Harness — The stateless loop that calls Claude, routes tool calls, and manages execution flow.
  • Sandbox — The isolated container where code actually runs.

The key insight: these three components can fail and restart independently. If the harness crashes mid-task, a new harness picks up the session log and resumes. If the sandbox gets killed, the session survives. This is what makes long-running agents reliable in production.

Anthropic borrowed a framing from DevOps to describe the old approach: "pets vs. cattle." Early agent designs coupled everything in a single container — a precious pet you couldn't afford to lose. The decoupled architecture treats each component as interchangeable cattle: disposable, replaceable, and far more resilient at scale.

The Session Is Not Claude's Context Window

This is a distinction that matters in practice:

[Diagram: Session log vs. Claude context window — the session log is persistent external storage, separate from Claude's active context]

The session log stores every event that has ever occurred in a task. Claude's context window is a selective view of that log — the harness uses getEvents() to retrieve what's relevant and passes it to Claude. This means:

  • Sessions can run for hours without hitting context limits (the harness handles compaction automatically)
  • You never have to make irreversible deletion decisions mid-task
  • Historical events remain queryable even after they've left Claude's active window
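Because the log persists, post-hoc auditing is straightforward: replay the events and filter. A minimal sketch (the `Event` dataclass is a stand-in for whatever object the SDK actually returns; the `agent.tool_use` event type and its `name` field follow the streaming events used in the quickstart):

```python
from dataclasses import dataclass

@dataclass
class Event:
    """Stand-in for an SDK event object: a type plus an optional tool name."""
    type: str
    name: str = ""

def tool_call_names(events) -> list[str]:
    """Filter a session's append-only log down to the tools that ran."""
    return [e.name for e in events if e.type == "agent.tool_use"]

# A toy session log: user turn, two tool calls, one agent message.
log = [
    Event("user.message"),
    Event("agent.tool_use", "bash"),
    Event("agent.message"),
    Event("agent.tool_use", "web_search"),
]
print(tool_call_names(log))  # ['bash', 'web_search']
```

The same pattern works regardless of whether those events are still inside Claude's active window, which is the point: the log, not the context, is the source of truth.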

Many Brains, Many Hands

The decoupled design also enables horizontal scaling:

[Diagram: Many brains, many hands — multiple harness instances routing to multiple sandbox containers via a standardized execute() interface]

Harnesses (brains) are stateless and spin up in milliseconds. Sandboxes (hands) provision lazily — only when a tool call actually requires execution. This lazy provisioning is why Anthropic reports a ~60% reduction in p50 time-to-first-token and over 90% reduction in p95 TTFT compared to their earlier coupled designs.


Core Concepts

| Concept     | What It Is                                                                        |
|-------------|-----------------------------------------------------------------------------------|
| Agent       | Reusable, versioned config: model + system prompt + tools + MCP servers           |
| Environment | Container template: pre-installed packages + networking rules                     |
| Session     | A running agent instance in an environment; maintains conversation history        |
| Events      | Messages between your app and the agent: user turns, tool results, status updates |

Agents and Environments are defined once and reused. Sessions are ephemeral instances — each task gets its own isolated session with a dedicated container.
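In code, that reuse is just a loop: one agent ID and one environment ID, many session payloads. The helper below is a hypothetical convenience, with field names taken from the quickstart that follows:

```python
def session_payloads(agent_id: str, environment_id: str, tasks: list[str]) -> list[dict]:
    """Build one sessions.create payload per task, all sharing the same
    agent config and environment template (illustrative helper only)."""
    return [
        {"agent": agent_id, "environment_id": environment_id, "title": task}
        for task in tasks
    ]

payloads = session_payloads("agent_abc", "env_xyz", ["triage inbox", "nightly report"])
print(len(payloads))  # 2
```

Each dict would then be passed to a `sessions.create` call, giving every task its own isolated container while the agent and environment definitions stay fixed.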


Quickstart: Your First Agent in 5 Minutes

Prerequisites

  • An Anthropic API key (all accounts have Managed Agents access by default)
  • Python ≥ 3.10 (the streaming example below uses structural pattern matching) or Node.js ≥ 18

Install the SDK

# Python
pip install anthropic

# TypeScript / Node.js
npm install @anthropic-ai/sdk

Or install the ant CLI for interactive management:

# macOS
brew install anthropics/tap/ant
xattr -d com.apple.quarantine "$(brew --prefix)/bin/ant"
ant --version

Step 1 — Create an Agent

An agent is a reusable configuration. Create it once and reference its ID in every session.

from anthropic import Anthropic

client = Anthropic()

agent = client.beta.agents.create(
    name="Coding Assistant",
    model="claude-sonnet-4-6",
    system="You are a helpful coding assistant. Write clean, well-documented code.",
    tools=[
        {"type": "agent_toolset_20260401"},  # Enables bash, file ops, web search
    ],
)

print(f"Agent ID: {agent.id}, version: {agent.version}")

The agent_toolset_20260401 type grants access to all built-in tools: Bash, file read/write/edit/glob/grep, web search, and web fetch. You can also selectively specify individual tools or connect MCP servers.

TypeScript equivalent:

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const agent = await client.beta.agents.create({
  name: "Coding Assistant",
  model: "claude-sonnet-4-6",
  system: "You are a helpful coding assistant. Write clean, well-documented code.",
  tools: [{ type: "agent_toolset_20260401" }],
});

console.log(`Agent ID: ${agent.id}, version: ${agent.version}`);
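If you don't need the full toolset, the tools array also accepts narrower entries. The fragment below is illustrative only: apart from agent_toolset_20260401, the individual tool identifiers and the MCP entry shape are assumptions to verify against the tools reference, not confirmed API:

```python
# Hypothetical selective tool config. Identifiers other than
# agent_toolset_20260401 are assumed, not confirmed.
tools = [
    {"type": "bash_20260401"},        # shell access only
    {"type": "web_search_20260401"},  # web search only
    {
        "type": "mcp_server",         # assumed MCP entry shape
        "url": "https://mcp.example.com/sse",
        "name": "internal-tools",
    },
]
```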

Step 2 — Create an Environment

Environments define the cloud container your agent runs in. Specify packages and networking:

environment = client.beta.environments.create(
    name="quickstart-env",
    config={
        "type": "cloud",
        "networking": {"type": "unrestricted"},  # Full outbound internet access
    },
)

For data science or backend tasks, pre-install any packages:

environment = client.beta.environments.create(
    name="data-analysis",
    config={
        "type": "cloud",
        "packages": {
            "pip": ["pandas", "numpy", "scikit-learn"],
            "npm": ["express"],
        },
        "networking": {"type": "unrestricted"},
    },
)

Supported package managers: pip, npm, apt, cargo, gem, go.

For stricter security, use limited networking with an allowlist:

config = {
    "type": "cloud",
    "networking": {
        "type": "limited",
        "allowed_hosts": ["api.example.com"],
        "allow_mcp_servers": True,
        "allow_package_managers": True,
    },
}

Step 3 — Start a Session

session = client.beta.sessions.create(
    agent=agent.id,
    environment_id=environment.id,
    title="Fibonacci generator task",
)

print(f"Session ID: {session.id}")

Step 4 — Send a Message and Stream the Response

with client.beta.sessions.events.stream(session.id) as stream:
    # Send user message after stream opens
    client.beta.sessions.events.send(
        session.id,
        events=[{
            "type": "user.message",
            "content": [{
                "type": "text",
                "text": "Create a Python script that generates the first 20 Fibonacci numbers and saves them to fibonacci.txt",
            }],
        }],
    )

    # Process streaming events
    for event in stream:
        match event.type:
            case "agent.message":
                for block in event.content:
                    print(block.text, end="")
            case "agent.tool_use":
                print(f"\n[Tool: {event.name}]")
            case "session.status_idle":
                print("\n\nDone.")
                break

The agent will write the script, execute it in the sandbox, verify the output, and report completion — all autonomously. You stream results in real time via SSE.

TypeScript equivalent:

const stream = await client.beta.sessions.events.stream(session.id);

await client.beta.sessions.events.send(session.id, {
  events: [{
    type: "user.message",
    content: [{
      type: "text",
      text: "Create a Python script that generates the first 20 Fibonacci numbers and saves them to fibonacci.txt",
    }],
  }],
});

for await (const event of stream) {
  if (event.type === "agent.message") {
    for (const block of event.content) process.stdout.write(block.text);
  } else if (event.type === "agent.tool_use") {
    console.log(`\n[Tool: ${event.name}]`);
  } else if (event.type === "session.status_idle") {
    console.log("\n\nDone.");
    break;
  }
}

Built-in Tools

| Tool            | What It Does                                          |
|-----------------|-------------------------------------------------------|
| Bash            | Run shell commands in the container                   |
| File operations | Read, write, edit, glob, grep files                   |
| Web search      | Search the web ($10 / 1,000 searches)                 |
| Web fetch       | Retrieve URL content (token cost only, no surcharge)  |
| MCP servers     | Connect to any external tool provider                 |

You can also define custom tools and connect them via the tools array in your agent config. MCP servers work identically to how they work in Claude desktop integrations.
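A custom tool entry presumably mirrors the Messages API tool definition (a name, a description, and a JSON Schema for inputs); treat the exact Managed Agents shape as an assumption to verify. A sketch:

```python
# Hypothetical custom tool definition, following the Messages API shape.
custom_tool = {
    "name": "get_ticket",
    "description": "Fetch a support ticket by ID from the internal tracker.",
    "input_schema": {  # standard JSON Schema
        "type": "object",
        "properties": {"ticket_id": {"type": "string"}},
        "required": ["ticket_id"],
    },
}
```

How your application fulfils calls to a custom tool (as opposed to the built-ins, which run inside the sandbox) is not covered here; check the events documentation for the tool-result flow.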


Agent Versioning

One underappreciated feature: agents are versioned automatically. Every update() call produces a new version, and you can pin sessions to specific versions — critical for production stability:

# Update system prompt → creates new version
updated = client.beta.agents.update(
    agent.id,
    version=agent.version,
    system="You are a helpful coding agent. Always write tests.",
)
print(f"New version: {updated.version}")

# Pin a session to the previous version for safe rollout
stable_session = client.beta.sessions.create(
    agent={"type": "agent", "id": agent.id, "version": 1},
    environment_id=environment.id,
)

List all versions to track your agent's evolution:

for v in client.beta.agents.versions.list(agent.id):
    print(f"Version {v.version}: {v.updated_at.isoformat()}")

This versioning model lets you test new system prompts in staging while production sessions run uninterrupted on the previous version — a pattern that matters when sessions are long-running.


Session Statuses and Error Handling

| Status       | Meaning                                  |
|--------------|------------------------------------------|
| idle         | Waiting for input. Sessions start here.  |
| running      | Actively calling Claude or executing tools. |
| rescheduling | Transient error — automatically retrying. |
| terminated   | Unrecoverable error — session ended.     |

The rescheduling status means the managed infrastructure is handling transient failures on your behalf — you don't need to implement retry logic. Only terminated requires your intervention:

session = client.beta.sessions.retrieve(session_id)
if session.status == "terminated":
    # Check Console for trace; consider starting a new session
    print("Session terminated — see Console for trace data")

Pricing: What Does It Actually Cost?

Claude Managed Agents charges on two dimensions: session runtime + standard token costs.

Session Runtime

$0.08 per session-hour, metered to the millisecond. The meter runs only while status is running; time spent in idle, rescheduling, and terminated is free.

Token Costs

Standard Claude API rates apply inside sessions:

| Model             | Input     | Output     | Cache Read   |
|-------------------|-----------|------------|--------------|
| Claude Opus 4.6   | $5 / MTok | $25 / MTok | $0.50 / MTok |
| Claude Sonnet 4.6 | $3 / MTok | $15 / MTok | $0.30 / MTok |
| Claude Haiku 4.5  | $1 / MTok | $5 / MTok  | $0.10 / MTok |

Prompt caching applies inside sessions identically to the Messages API — a significant cost lever for long-running tasks that repeatedly reference the same context.

Web Search

$10 per 1,000 searches. Failed searches are not billed. Web fetch has no surcharge beyond tokens.

Real-World Cost Examples

30-minute Sonnet 4.6 session, 30k input / 10k output tokens:

  • Input: $0.09 | Output: $0.15 | Runtime: $0.04 | Total: ~$0.28

1-hour Opus 4.6 session, 50k input / 15k output tokens:

  • Input: $0.25 | Output: $0.375 | Runtime: $0.08 | Total: ~$0.71

Same Opus session with prompt caching (40k of 50k input cached):

  • Cache read: $0.02 | Uncached: $0.05 | Output: $0.375 | Runtime: $0.08 | Total: ~$0.53

1,000 automated support resolutions (~4k tokens each, Sonnet 4.6):

  • Tokens: ~$16 | Runtime (avg 5 min each): ~$6.67 | Total: ~$23

For typical developer automation tasks (15–30 minute sessions, moderate token use), expect $0.10–$0.50 per session with Sonnet 4.6. Batch API discounts do not apply — sessions are stateful and interactive. Volume discounts available via Anthropic enterprise.
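The arithmetic behind these examples is simple enough to script yourself. A small estimator using the rates above (an illustrative helper, not an official calculator; the meter only counts running time, per the pricing section):

```python
RATES = {  # $/MTok, from the pricing table above
    "opus-4.6":   {"input": 5.0, "output": 25.0, "cache_read": 0.50},
    "sonnet-4.6": {"input": 3.0, "output": 15.0, "cache_read": 0.30},
    "haiku-4.5":  {"input": 1.0, "output": 5.0,  "cache_read": 0.10},
}
RUNTIME_RATE = 0.08  # $/session-hour, billed only while status is `running`

def session_cost(model, running_hours, input_tok, output_tok, cached_tok=0):
    """Estimate a Managed Agents session bill in dollars."""
    r = RATES[model]
    token_cost = (
        (input_tok - cached_tok) / 1e6 * r["input"]   # uncached input
        + cached_tok / 1e6 * r["cache_read"]          # cache reads
        + output_tok / 1e6 * r["output"]              # output
    )
    return token_cost + running_hours * RUNTIME_RATE

# Reproduces the 30-minute Sonnet example above:
print(round(session_cost("sonnet-4.6", 0.5, 30_000, 10_000), 2))  # 0.28
```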


Real-World Use Cases

Anthropic's launch customers illustrate the breadth of what Managed Agents enables at production scale:

Notion — Claude handles delegated work directly inside the workspace: engineers shipping code, knowledge workers generating websites and presentations, parallel tasks across teams.

Rakuten — Deployed specialist agents across product, sales, marketing, and finance, integrated with Slack and Microsoft Teams. Each specialist agent ships in approximately one week.

Asana — Built "AI Teammates" that collaborate with humans on tasks inside Asana projects. Advanced features developed significantly faster than alternative approaches.

Vibecode — Powers AI-native app generation; achieved 10× faster infrastructure deployment for end users.

Sentry — Paired a debugging agent with a patch-writing agent: bug flag → reviewable PR fix, automated end-to-end. Deployment timeline dropped from months to weeks.

Beyond enterprise, the developer community is finding compelling use cases for smaller-scale deployments:

  • Automated code review pipelines — trigger a review session on every PR, stream results to GitHub comments
  • Nightly batch jobs — summarize, analyze, and produce reports from logs and metrics while you sleep
  • Research assistants — agents that spend hours gathering, cross-referencing, and synthesizing information
  • Document processing — extract, transform, and validate structured data from unstructured inputs at scale

The Honest Vendor Lock-In Assessment

The Hacker News community flagged vendor lock-in as the top concern the day Managed Agents launched (163 points, 72 comments). It deserves a direct answer.

The concern is legitimate. Claude Managed Agents runs exclusively on Anthropic's infrastructure. There is no multi-cloud deployment, no self-hosted option, no bring-your-own-sandbox. You cannot use GPT-4 or a local model as the brain inside a Managed Agents session.

The counter-argument is also legitimate. If you're building on Claude, you already accept model-layer dependency. The infrastructure dependency is additive, not categorically new. And the alternative — spending months building your own sandboxing, session state, error recovery, and credential management — is its own form of lock-in (to your own engineering roadmap).

Mitigation strategies if portability matters:

  1. Abstract the agent interface — wrap your Managed Agents calls behind a thin adapter layer so the orchestration logic can be replaced without rewriting business logic
  2. Separate agent config from business logic — keep system prompts and task definitions decoupled from infrastructure calls
  3. Build egress paths — session event logs are queryable; design your data model so outputs can be consumed regardless of what ran them
  4. Watch the research preview features — multi-agent, outcomes, and memory are still in preview, meaning their APIs could change; don't build critical production flows on preview features yet
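Strategy 1 can be as small as a Protocol. Business logic programs against the interface, so swapping backends later means writing one new class. A sketch only: the Managed Agents backend's session plumbing is elided, and the names here are hypothetical:

```python
from typing import Protocol

class AgentRunner(Protocol):
    """The seam between business logic and whichever backend runs the agent."""
    def run_task(self, prompt: str) -> str: ...

class ManagedAgentsRunner:
    """Backend bound to Claude Managed Agents. The quickstart's
    sessions.create / events.send / stream calls would live here."""
    def __init__(self, client, agent_id: str, environment_id: str):
        self.client = client
        self.agent_id = agent_id
        self.environment_id = environment_id

    def run_task(self, prompt: str) -> str:
        raise NotImplementedError("wrap the quickstart calls here")

class LocalEchoRunner:
    """Trivial substitute backend, proving the seam works."""
    def run_task(self, prompt: str) -> str:
        return f"[local] {prompt}"

def summarize_logs(runner: AgentRunner, logs: str) -> str:
    # Business logic never imports the Anthropic SDK directly.
    return runner.run_task(f"Summarize these logs:\n{logs}")

print(summarize_logs(LocalEchoRunner(), "error on line 3"))
```

The cost of the adapter is one extra class per backend; the payoff is that a migration touches the adapters, not every call site.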

The decision is build vs. buy: if your team's competitive advantage is the product built on top of agents — not the agent infrastructure itself — Managed Agents is a rational trade.


Research Preview Features

Three capabilities are currently in research preview (request access at claude.com/form/claude-managed-agents):

| Feature     | What It Does                                                             |
|-------------|--------------------------------------------------------------------------|
| Multi-agent | Agents that spawn and direct sub-agents for parallel workloads           |
| Outcomes    | Self-evaluation: agents iterate until they meet defined success criteria |
| Memory      | Persistent memory across sessions for stateful long-horizon agents       |

The outcomes feature is particularly significant — it closes the loop on task completion without requiring human verification of every result. Combined with multi-agent coordination, it opens up workflows where Claude both executes and validates its own work at scale.


Claude Managed Agents vs. Claude Code

Claude Code and Claude Managed Agents are close relatives: the Managed Agents harness derives from the same engine that powers Claude Code's agentic loop. But they serve fundamentally different purposes:

|             | Claude Code                 | Claude Managed Agents                         |
|-------------|-----------------------------|-----------------------------------------------|
| Purpose     | Developer productivity tool | Hosted infra for your agent-powered products  |
| User        | You — in your terminal/IDE  | Your users — via your application             |
| Deployment  | Local or claude.ai          | Anthropic cloud containers                    |
| Primary API | CLI + Agent SDK             | REST API / SDK                                |

For building a product where your users interact with an autonomous agent, Managed Agents is the right foundation. For your own coding workflow, Claude Code is what you want. For a comparison of today's top AI coding tools, see our Best AI Coding Tools 2026 guide.


Getting Started

Claude Managed Agents is in public beta — all Anthropic API accounts have access by default. The SDK sets the required beta header automatically when you use client.beta.* endpoints.

If you prefer manual HTTP calls, add this header to every request:

anthropic-beta: managed-agents-2026-04-01

Key API endpoints:

POST   /v1/agents                      — create agent
POST   /v1/agents/{id}                 — update agent
POST   /v1/environments                — create environment
POST   /v1/sessions                    — start session
POST   /v1/sessions/{id}/events        — send events
GET    /v1/sessions/{id}/stream        — SSE event stream
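With plain HTTP, creating a session looks like the sketch below. The beta header value comes from above; the x-api-key and anthropic-version headers follow Anthropic's standard API conventions (the version date and the request body fields are assumptions to verify against the API reference):

```python
import json
import urllib.request

API_KEY = "sk-ant-..."  # your key

def build_session_request(agent_id: str, environment_id: str) -> urllib.request.Request:
    """Assemble (but do not send) a raw POST /v1/sessions request."""
    body = json.dumps({"agent": agent_id, "environment_id": environment_id}).encode()
    return urllib.request.Request(
        "https://api.anthropic.com/v1/sessions",
        data=body,
        headers={
            "x-api-key": API_KEY,
            "anthropic-version": "2023-06-01",           # standard version header
            "anthropic-beta": "managed-agents-2026-04-01",
            "content-type": "application/json",
        },
        method="POST",
    )
```

Sending it is then `urllib.request.urlopen(build_session_request(...))`, or the equivalent in your HTTP client of choice.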

For research preview access (multi-agent, outcomes, memory): claude.com/form/claude-managed-agents


FAQ

Is Claude Managed Agents available to all API users?

Yes. As of the April 2026 public beta launch, all Anthropic API accounts have access by default — no waitlist, no separate approval.

What is the difference between a session and a conversation?

A conversation is a list of messages. A session is a full running agent instance with its own isolated container, persistent file system, tool execution context, and durable event log. Sessions span many tool calls and interactions over minutes or hours.

Can I use Claude Managed Agents with models other than Claude?

No. Claude Managed Agents is Claude-only infrastructure. For multi-model routing across Claude, GPT, and local models, use the Messages API with your own orchestration layer, or the Agent SDK for a Claude-native local approach.

How do I manage secrets and API keys inside sessions?

Don't store credentials in system prompts or environment variables directly in sessions. Use the vault system: store OAuth tokens and credentials in secure vaults external to sandboxes, then pass vault_ids when creating a session. The infrastructure routes tool calls through credential-aware proxies.

Does the $0.08/session-hour charge apply while the agent is idle?

No. The session-hour meter runs only while the session status is running. Time spent idle, rescheduling, or terminated is not billed.

What happens if my session crashes mid-task?

The rescheduling status indicates the infrastructure is automatically recovering. If it reaches terminated, the session log is preserved — you can inspect the full event history in the Claude Console or via the API. For critical production flows, consider building session-resume logic that re-sends context to a new session on terminated.

Is there a free tier?

New Anthropic API accounts receive a small free credit. There is no ongoing free tier for Managed Agents. Volume discounts and extended enterprise trials are available via sales@anthropic.com.

