Agents that survive failure, persist everything.
Durable agent runs that survive crashes, pause for humans, and resume in any client your users work in. Every call recorded as evidence.
import asyncio
import os

from dendrux import Agent, tool
from dendrux.llm.anthropic import AnthropicProvider


@tool()
async def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b


async def main():
    async with Agent(
        provider=AnthropicProvider(
            model="claude-sonnet-4-6",
            api_key=os.environ["ANTHROPIC_API_KEY"],
        ),
        prompt="You are a calculator.",
        tools=[add],
    ) as agent:
        result = await agent.run("What is 15 + 27?")
        print(result.answer)


asyncio.run(main())

Every call, every pause, every resume. Recorded.
Click a node to inspect. The timeline is the dashboard. No JSON walls, no decoration.
Six commitments the runtime makes.
Every feature in dendrux exists to serve one of these. No framework magic, no hidden loops.
Survive failure
Durable writes with retry. Stale runs swept. Crashed runs retry with prior context. Runs never lie about state.
Control execution
Tool constraints, timeouts, parallel/sequential policy, delegation depth guards. No runaways.
Govern behavior
Tool deny, HITL approval, advisory budgets, PII redaction, secret detection. Four layers of runtime governance.
Explain everything
Every LLM call, tool execution, pause, and lifecycle event is persisted as evidence. Fail-closed recorder + best-effort notifier.
Coordinate agents
Parent-child delegation with automatic linking via contextvars. Depth guards and lifecycle coupling.
Pause for the real world
Client-side tool pause/resume for spreadsheets, browsers, and desktops. Domain-aware constraints.
Agents cross the server/client boundary.
When an agent calls a client-side tool, the run enters WAITING_CLIENT_TOOL. State is persisted. The client SDK executes on the user's device. When results return, the run resumes from exactly where it left off, with the same context and reasoning.
PENDING → RUNNING → WAITING_CLIENT_TOOL → RUNNING → SUCCESS
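The lifecycle above can be sketched as a small state machine in plain Python. This is illustrative only: the state names mirror the diagram, but the transition table and `Run` class are assumptions, not dendrux internals.

```python
from enum import Enum, auto

class RunState(Enum):
    PENDING = auto()
    RUNNING = auto()
    WAITING_CLIENT_TOOL = auto()
    SUCCESS = auto()
    FAILED = auto()

# Legal transitions; anything else is rejected, so a run can never lie about its state.
TRANSITIONS = {
    RunState.PENDING: {RunState.RUNNING},
    RunState.RUNNING: {RunState.WAITING_CLIENT_TOOL, RunState.SUCCESS, RunState.FAILED},
    RunState.WAITING_CLIENT_TOOL: {RunState.RUNNING, RunState.FAILED},
}

class Run:
    def __init__(self):
        self.state = RunState.PENDING
        self.history = [RunState.PENDING]

    def transition(self, new):
        if new not in TRANSITIONS.get(self.state, set()):
            raise ValueError(f"illegal transition {self.state.name} -> {new.name}")
        self.state = new
        self.history.append(new)  # the real runtime persists this as evidence

run = Run()
for state in (RunState.RUNNING, RunState.WAITING_CLIENT_TOOL,
              RunState.RUNNING, RunState.SUCCESS):
    run.transition(state)
print(" -> ".join(s.name for s in run.history))
# PENDING -> RUNNING -> WAITING_CLIENT_TOOL -> RUNNING -> SUCCESS
```

Rejecting illegal transitions is what lets a crashed run be resumed safely: the persisted state is always one the machine could actually have reached.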
Production essentials, opt-in.
Four layers, all kwargs.
Tool deny. Human approval. Advisory budgets. Content guardrails with PII redaction and secret detection. Stack them or don't.
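Layered governance like this can be pictured as a pipeline of checks over a proposed tool call. The sketch below is plain Python with hypothetical check names, not dendrux's kwargs: each layer can veto, warn, or pass.

```python
import re

# Illustrative governance layers; names and shapes are assumptions.
def deny_list(call, denied=frozenset({"delete_database"})):
    if call["tool"] in denied:
        return f"denied: {call['tool']}"
    return None

def advisory_budget(call, spent=0.0, budget=1.00):
    # Advisory: warns but never blocks.
    if spent + call.get("est_cost", 0.0) > budget:
        print(f"warning: call would exceed ${budget:.2f} budget")
    return None

SECRET_RE = re.compile(r"sk-[A-Za-z0-9]{20,}")

def secret_detection(call):
    if any(SECRET_RE.search(str(v)) for v in call.get("args", {}).values()):
        return "blocked: secret detected in arguments"
    return None

LAYERS = [deny_list, advisory_budget, secret_detection]

def govern(call):
    for layer in LAYERS:
        verdict = layer(call)
        if verdict:
            return verdict
    return "allowed"

print(govern({"tool": "add", "args": {"a": 1, "b": 2}}))  # allowed
print(govern({"tool": "delete_database", "args": {}}))    # denied: delete_database
```

Because each layer is independent, stacking them (or not) is just a matter of which checks are in the list.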
Token-by-token, clean cancel.
Stream text, tool calls, and lifecycle events. Break early and the run cancels cleanly. Works with every provider.
print(chunk, end="")
Agents as tools.
Parent-child run linking via contextvars. Zero developer code. Dashboard shows the full tree.
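The contextvars mechanism behind this can be shown with the stdlib alone. A minimal sketch, assuming nothing about dendrux's internals: whoever is running when a child starts becomes its parent, with no code at the call site.

```python
import asyncio
import contextvars

# The currently executing run; child runs started in this context see it as parent.
current_run_id = contextvars.ContextVar("current_run_id", default=None)

links = []  # (parent_run_id, child_run_id) pairs, as a dashboard tree would record

async def run_agent(run_id, subagents=()):
    parent = current_run_id.get()       # whoever is running right now
    links.append((parent, run_id))      # linking needs zero developer code
    token = current_run_id.set(run_id)  # children started below see us as parent
    try:
        for sub in subagents:
            await run_agent(sub)
    finally:
        current_run_id.reset(token)

asyncio.run(run_agent("root", subagents=["child-a", "child-b"]))
print(links)  # [(None, 'root'), ('root', 'child-a'), ('root', 'child-b')]
```

Because `ContextVar` values flow through `await` boundaries automatically, delegation depth and lineage come for free wherever the child run is created.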
SQLite to Postgres.
Zero config for dev. One env var for prod. Alembic migrations included.
Tool source, not rival.
MCP servers become dendrux tools automatically. Schema translated, rate-limited, traced.
Swap one import. Everything else stays.
Three first-class provider classes. Any OpenAI-compatible server works through OpenAIProvider with a custom base_url.
Start with ten lines.
Scale without rewrites.
Dendrux is in active development at v0.2.0a1. Core API stabilizing. Apache 2.0. Bring your own infra.