Sub-agents run concurrently in background tokio tasks with isolated budgets. They can use different providers — a Claude parent can spawn GPT or Gemini sub-agents. The parent monitors, collects results, and cancels children as needed.

Feature flag

Sub-agents require the sub-agents feature flag, which is enabled by default in both the CLI and the facade crate.
# In Cargo.toml
meerkat = { features = ["sub-agents"] }
At runtime, the AgentFactory has enable_subagents (default: false). Per-request builds can override via AgentBuildConfig::override_subagents.
let factory = AgentFactory::new(store_path)
    .builtins(true)
    .subagents(true);

Available tools

| Tool | Description |
| --- | --- |
| agent_spawn | Create a sub-agent with clean context (just the prompt) |
| agent_fork | Create a sub-agent that inherits the full conversation history |
| agent_status | Get the status and output of a sub-agent by ID |
| agent_cancel | Cancel a running sub-agent |
| agent_list | List all sub-agents with their current states |

Spawn vs fork

agent_spawn creates a new sub-agent with a clean context. The sub-agent starts fresh with only the provided prompt; it does not inherit conversation history. Use for: independent tasks that do not need the parent’s context. agent_fork, by contrast, creates a sub-agent that inherits the full conversation history.
Parameters:
| Parameter | Type | Required / default | Description |
| --- | --- | --- | --- |
| prompt | string | required | Initial prompt/task for the sub-agent. |
| provider | string | default: "anthropic" | LLM provider: anthropic, openai, gemini. |
| model | string | | Model name. Must be in the allowlist for the provider. |
| tool_access | object | default: "inherit" | Tool access policy (see below). |
| budget | object | default: 50,000 tokens | Budget limits: { max_tokens, max_turns, max_tool_calls }. |
| system_prompt | string | | Override the system prompt for this sub-agent. |
| host_mode | boolean | default: false | Keep agent alive processing comms messages after initial prompt. Requires comms feature. |
Response:
{
  "agent_id": "019467d9-7e3a-7000-8000-000000000000",
  "name": "sub-agent-019467d97e3a",
  "provider": "anthropic",
  "model": "claude-sonnet-4-5",
  "state": "running",
  "message": "Sub-agent spawned successfully. Use agent_status to check progress."
}

Tool access policies

Both agent_spawn and agent_fork accept a tool_access parameter:
| Policy | Description |
| --- | --- |
| {"policy": "inherit"} | Inherit all tools from the parent (default). |
| {"policy": "allow_list", "tools": ["shell", "task_list"]} | Only allow the specified tools. |
| {"policy": "deny_list", "tools": ["shell"]} | Block the specified tools, allow everything else. |
The tool_access value can be provided as either a JSON object or a JSON-encoded string (for LLMs that stringify nested objects).
Sub-agents do NOT receive sub-agent tools (agent_spawn, agent_fork, etc.) in their tool set. This is enforced by the factory, which clones itself with .subagents(false) when building the sub-agent’s tool dispatcher. This prevents uncontrolled recursive spawning. Nested spawning can be enabled via config; see Concurrency limits.
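The three policies and the sub-agent tool stripping described above can be sketched as a simple filter over tool names. This is a minimal illustration, not Meerkat's implementation: the `ToolAccess` enum, `SUB_AGENT_TOOLS`, and `resolve_child_tools` are hypothetical names introduced here, and real tools are represented by plain strings.

```rust
use std::collections::BTreeSet;

/// Hypothetical mirror of the three `tool_access` policies.
#[derive(Debug, Clone, PartialEq)]
enum ToolAccess {
    Inherit,
    AllowList(Vec<String>),
    DenyList(Vec<String>),
}

/// The five sub-agent tools, which a child does not receive.
const SUB_AGENT_TOOLS: [&str; 5] = [
    "agent_spawn",
    "agent_fork",
    "agent_status",
    "agent_cancel",
    "agent_list",
];

/// Compute the child's tool names from the parent's tools and a policy.
/// Sub-agent tools are stripped first, then the policy is applied.
fn resolve_child_tools(parent_tools: &[&str], policy: &ToolAccess) -> BTreeSet<String> {
    parent_tools
        .iter()
        .copied()
        .filter(|name| !SUB_AGENT_TOOLS.contains(name))
        .filter(|name| match policy {
            ToolAccess::Inherit => true,
            ToolAccess::AllowList(allowed) => allowed.iter().any(|a| a.as_str() == *name),
            ToolAccess::DenyList(denied) => !denied.iter().any(|d| d.as_str() == *name),
        })
        .map(String::from)
        .collect()
}
```

With a parent holding `shell`, `task_list`, and `agent_spawn`, the `Inherit` policy in this sketch yields `shell` and `task_list`, and a deny list of `shell` leaves only `task_list`.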

Monitoring sub-agents

agent_status: Query a single sub-agent by its UUID. Parameters: { "agent_id": "<UUID>" }
{
  "agent_id": "019467d9-7e3a-7000-8000-000000000000",
  "state": "completed",
  "output": "The analysis shows...",
  "is_final": true,
  "duration_ms": 12500,
  "tokens_used": 3200
}
States: running, completed, failed, cancelled. When is_final is true, the sub-agent is done. The output field contains the result text; for failures, the error field contains the error.
agent_list: List all sub-agents with summary counts. Parameters: none (empty object).
{
  "agents": [
    {
      "id": "019467d9-7e3a-7000-8000-000000000000",
      "name": "sub-agent-019467d97e3a",
      "state": "running",
      "depth": 1,
      "running_ms": 5000
    }
  ],
  "running_count": 1,
  "completed_count": 0,
  "failed_count": 0,
  "total_count": 1
}
agent_cancel: Cancel a running sub-agent. Only agents in the running state can be cancelled.
Parameters: { "agent_id": "<UUID>" }
Response: { "success": true, "previous_state": "running", "message": "Sub-agent cancelled successfully" }
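The state rules above (four states, a final flag, and cancellation only from running) can be sketched as a small state type. The names `AgentState`, `is_final`, and `cancel` are hypothetical, and treating every state other than running as final is an assumption drawn from the text, not confirmed behavior.

```rust
/// The four sub-agent states listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AgentState {
    Running,
    Completed,
    Failed,
    Cancelled,
}

impl AgentState {
    /// Assumption: every state except `Running` is terminal.
    fn is_final(self) -> bool {
        self != AgentState::Running
    }
}

/// Hypothetical cancel operation: succeeds only from `Running` and
/// returns the previous state, mirroring `previous_state` in the response.
fn cancel(state: &mut AgentState) -> Result<AgentState, String> {
    match *state {
        AgentState::Running => {
            *state = AgentState::Cancelled;
            Ok(AgentState::Running)
        }
        other => Err(format!("cannot cancel sub-agent in state {:?}", other)),
    }
}
```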

Concurrency limits

Concurrency limits:
| Field | Type | Default | Description |
| --- | --- | --- | --- |
| max_depth | u32 | 3 | Maximum nesting depth for sub-agents |
| max_concurrent_ops | usize | 32 | Maximum concurrent operations (across all types) |
| max_concurrent_agents | usize | 8 | Maximum concurrently running sub-agents |
| max_children_per_agent | usize | 5 | Maximum children a single agent can spawn |
The SubAgentManager enforces these limits. Attempting to spawn beyond the limits returns an error.
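As a rough illustration of how such limits might be checked before a spawn, the sketch below mirrors the four fields and their defaults. The struct layout, the `check_spawn` function, its parameters, and the error strings are all hypothetical; how the SubAgentManager actually counts depth and operations is not specified here, so the comparison rules are assumptions.

```rust
/// Hypothetical mirror of the limits; defaults copied from the table above.
#[derive(Debug, Clone)]
struct ConcurrencyLimits {
    max_depth: u32,
    max_concurrent_ops: usize,
    max_concurrent_agents: usize,
    max_children_per_agent: usize,
}

impl Default for ConcurrencyLimits {
    fn default() -> Self {
        Self {
            max_depth: 3,
            max_concurrent_ops: 32,
            max_concurrent_agents: 8,
            max_children_per_agent: 5,
        }
    }
}

/// Assumed check: reject the spawn if any limit would be exceeded.
/// `child_depth` is the depth the new sub-agent would have (1 for a direct child).
fn check_spawn(
    limits: &ConcurrencyLimits,
    child_depth: u32,
    running_agents: usize,
    children_of_parent: usize,
    running_ops: usize,
) -> Result<(), String> {
    if child_depth > limits.max_depth {
        return Err(format!("depth {} exceeds max_depth {}", child_depth, limits.max_depth));
    }
    if running_ops >= limits.max_concurrent_ops {
        return Err("max_concurrent_ops reached".to_string());
    }
    if running_agents >= limits.max_concurrent_agents {
        return Err("max_concurrent_agents reached".to_string());
    }
    if children_of_parent >= limits.max_children_per_agent {
        return Err("max_children_per_agent reached".to_string());
    }
    Ok(())
}
```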
Sub-agent configuration (SubAgentsConfig):
| Field | Type | Default | Description |
| --- | --- | --- | --- |
| default_provider | String | "anthropic" | Default provider when not specified |
| default_model | Option<String> | None | Default model (uses first in allowlist if None) |
| concurrency_limits | ConcurrencyLimits | See above | Limits for this agent’s sub-agents |
| allow_nested_spawn | bool | true | Whether sub-agents can spawn further sub-agents |
| max_budget_per_agent | Option<u64> | None | Cap on tokens for any single sub-agent |
| default_budget | Option<u64> | Some(50_000) | Default token budget when not specified |
| inherit_system_prompt | bool | true | Whether spawned agents inherit the parent’s tool usage instructions |
| enable_comms | bool | false | Whether to enable parent-child communication |

Budget allocation

For agent_spawn, the budget is resolved as:
  1. If budget.max_tokens is specified, use it (capped by max_budget_per_agent if set).
  2. Otherwise, use default_budget (50,000 tokens by default).
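The two rules above amount to a small function. This is a sketch under the assumption that the cap applies only to an explicitly requested budget, as the rules state; `resolve_budget` and its parameter names are hypothetical.

```rust
/// Resolve a sub-agent's token budget following the two rules above.
/// `requested` stands in for `budget.max_tokens`; the other two arguments
/// stand in for the `max_budget_per_agent` and `default_budget` settings.
fn resolve_budget(
    requested: Option<u64>,
    max_budget_per_agent: Option<u64>,
    default_budget: Option<u64>,
) -> Option<u64> {
    match requested {
        // Rule 1: use the requested budget, capped if a cap is configured.
        Some(tokens) => Some(match max_budget_per_agent {
            Some(cap) => tokens.min(cap),
            None => tokens,
        }),
        // Rule 2: fall back to the default budget.
        None => default_budget,
    }
}
```

For example, a request for 80,000 tokens with a cap of 60,000 resolves to 60,000, while an unspecified budget resolves to the default of 50,000.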

Model allowlists

Sub-agents validate requested models against a per-provider allowlist. The allowlist comes from either:
  1. Resolved policy from the global SubAgentsConfig in the configuration file (highest precedence).
  2. Default allowlist from SubAgentsConfig::default() (hardcoded fallback).
The agent_spawn tool definition dynamically includes the allowed models in the model parameter description, so the LLM can see which models are available.
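A per-provider allowlist check could look roughly like the following. The `Allowlist` type alias, `resolve_model`, `model_param_description`, and the description wording are hypothetical; falling back to the first allowlisted model when none is requested follows the default_model note in the configuration table.

```rust
use std::collections::HashMap;

/// Hypothetical allowlist shape: provider name -> allowed model names.
type Allowlist = HashMap<String, Vec<String>>;

/// Validate a requested model, or pick the first allowed model if none is given.
fn resolve_model(
    allowlist: &Allowlist,
    provider: &str,
    requested: Option<&str>,
) -> Result<String, String> {
    let models = allowlist
        .get(provider)
        .ok_or_else(|| format!("unknown provider: {}", provider))?;
    match requested {
        Some(model) if models.iter().any(|m| m == model) => Ok(model.to_string()),
        Some(model) => Err(format!("model {} is not allowed for provider {}", model, provider)),
        None => models
            .first()
            .cloned()
            .ok_or_else(|| format!("no models allowed for provider {}", provider)),
    }
}

/// Build a parameter description that lists the allowed models,
/// so the LLM can see them in the tool definition.
fn model_param_description(allowlist: &Allowlist, provider: &str) -> String {
    let models = allowlist
        .get(provider)
        .map(|m| m.join(", "))
        .unwrap_or_default();
    format!("Model name. Allowed for {}: {}", provider, models)
}
```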

Sub-agent execution lifecycle

  1. Validation: The tool validates the prompt (non-empty), provider, model (against allowlist), and concurrency limits.
  2. Client creation: An LlmClient is created via LlmClientFactory for the requested provider.
  3. Session setup:
      • agent_spawn: Creates a new session with optional system prompt + user prompt.
      • agent_fork: Clones the parent session and appends the fork prompt.
  4. Registration: The sub-agent is registered with the SubAgentManager before the task is spawned.
  5. Background execution: A tokio task runs the agent loop (agent.run_pending() or agent.run_host_mode() for host mode).
  6. Completion: On success, manager.complete() stores the result. On failure, manager.fail() stores the error. Both notify listeners via a watch channel.
  7. Result collection: Completed results are stored in a bounded deque (MAX_COMPLETED_AGENTS = 256) and can be queried via agent_status or agent_list.
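The bounded deque used for result collection can be illustrated with a standard `VecDeque`. This is only a sketch: `CompletedAgent`, `CompletedStore`, and the oldest-first eviction policy are assumptions made for illustration, and only the constant name and value come from the text above.

```rust
use std::collections::VecDeque;

/// Capacity named in the lifecycle description.
const MAX_COMPLETED_AGENTS: usize = 256;

/// Hypothetical record for a finished sub-agent.
#[derive(Debug, Clone, PartialEq)]
struct CompletedAgent {
    id: String,
    output: String,
}

/// Hypothetical store that keeps at most `MAX_COMPLETED_AGENTS` results.
#[derive(Default)]
struct CompletedStore {
    results: VecDeque<CompletedAgent>,
}

impl CompletedStore {
    /// Add a result, evicting the oldest one when the store is full
    /// (assumed eviction policy).
    fn push(&mut self, result: CompletedAgent) {
        if self.results.len() == MAX_COMPLETED_AGENTS {
            self.results.pop_front();
        }
        self.results.push_back(result);
    }

    /// Look up a stored result by agent ID.
    fn get(&self, id: &str) -> Option<&CompletedAgent> {
        self.results.iter().find(|r| r.id == id)
    }
}
```

One consequence of a bounded store, under this assumed policy, is that results for very old sub-agents may no longer be retrievable once more than 256 have completed.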

Comms integration

When the comms feature is enabled and the parent has comms configured:
  1. Sub-agents get a comms context injected into their prompt explaining how to communicate with the parent.
  2. The sub-agent’s tools are wrapped with comms tools via wrap_with_comms.
  3. The sub-agent is added to the parent’s trusted peers so bidirectional communication works.
  4. The parent receives comms instructions in the spawn response explaining how to message the child.
  5. Communication uses an in-process transport (CommsBootstrap::for_child_inproc) rather than network listeners.
See the comms guide for details on the inter-agent communication system.

How sub-agents get wired

When enable_subagents is true, the agent factory adds the five sub-agent tools (agent_spawn, agent_fork, agent_status, agent_cancel, agent_list) to the parent’s tool set. Child agents receive a copy of the parent’s tools but with sub-agent and comms tools removed to prevent recursive spawning.

Best practices

When to use sub-agents

  • Parallel independent tasks (e.g., analyze multiple files simultaneously).
  • Tasks requiring different model strengths (e.g., GPT for coding, Gemini for analysis).
  • Long-running work you want to delegate while doing other things.
  • Breaking complex problems into specialized subtasks.

When NOT to use sub-agents

  • Simple tasks you can do directly.
  • Tasks requiring tight coordination (use sequential steps instead).
  • When you need immediate results (sub-agents add latency).
Do NOT poll agent_status repeatedly — this wastes tokens. Use agent_list to see all sub-agents at once (more efficient than individual status checks). Do other useful work, then check status once or twice near the end.

See also