MULTCourse

Multi-agent systems

Lessons8modules
Total80mfull study
Quick7mtrailer
Projects8docker labs
CHEATSHEET · 01Multi-agent · master cheatsheet
When to split
  • ·Tool count > ~12 with > 10% mis-selection rate
  • ·Different cost/latency budgets per sub-task
  • ·Different trust boundaries (internet vs internal data)
  • ·Different model strengths needed (cheap classifier + expensive synth)
  • ·Eval lift > added cost — measure first, split second
The 7 patterns
  • ·Router — one classifier, one specialist per turn
  • ·Supervisor / Manager-Worker — central orchestrator delegates + aggregates
  • ·Hierarchical — supervisors of supervisors (50+ agents only)
  • ·Pipeline — fixed DAG, predictable, rigid
  • ·Hand-off / Swarm — one active agent, returns next agent
  • ·Debate / Consensus — N propose, judge picks (or vote)
  • ·Blackboard — Redis scratchpad, agents read/write
  • ·Graph (LangGraph) — explicit nodes/edges + conditional cycles
Communication contracts
  • ·Pydantic-typed messages between every pair of agents
  • ·Per-handoff summary; archive raw transcripts
  • ·MCP for tools (the 2026 standard)
  • ·A2A for cross-vendor agent calls (signed Agent Cards)
  • ·No free text between agents — that's a smell
Production guardrails
  • ·Cap recursion depth (typically 5-8)
  • ·Per-agent token budget + global trace budget
  • ·Sandbox every tool call (Agents SDK / harness / Docker)
  • ·Per-agent auth, scoped tool tokens
  • ·Persist checkpoints — replay is your debugger
  • ·Allow-list handoffs, signed Agent Cards (A2A 0.3)
  • ·Detect cross-agent prompt injection at every boundary
CHEATSHEET · 02Framework picks · 2026
Default for production
  • ·LangGraph 1.0 (Oct 2025 GA, current 1.0.x). Use for: stateful supervisor teams, durable execution, audit-grade traces.
  • ·OpenAI Agents SDK. Use for: TS/JS shops, fastest path from Swarm patterns, hand-offs + sandboxing built in.
  • ·Anthropic Claude Agent SDK. Use for: long-horizon harnesses, sub-agent spawning, Claude-only stacks.
Default for fast crews
  • ·CrewAI Flows. Use for: role-based teams, biz-process automation, fastest time-to-first-agent (~30 min).
  • ·DocuSign report: ~14× less code than LangGraph for equivalent flows (vendor-published, take with salt).
  • ·Watch out: ~18% token overhead per published benchmarks.
Default for memory-first or typed
  • ·Letta — OS-style tiered memory (core/recall/archival), persistent agents.
  • ·Pydantic AI — type-safe end-to-end, Logfire built in, durable execution.
  • ·smolagents — code agents (HF), tiny core, great for research-y workflows.
Self-hosted backbone
  • ·gpt-oss-120b / gpt-oss-20b (Apache 2.0, Aug 2025). 20b runs on 16GB edge.
  • ·Mistral La Plateforme Agents API — EU sovereignty + native MCP.
  • ·Ollama in Docker — the standard local workstation runner.
Avoid / migrate
  • ·OpenAI Swarm — deprecated March 2025; migrate to Agents SDK.
  • ·AutoGen 0.4 — Microsoft moved it to maintenance. New: MS Agent Framework (RC 1.0). Original creators: AG2 fork.
  • ·Magentic-One direct — now exposed as an orchestration primitive in MS Agent Framework.
Observability stack
  • ·Langfuse (acquired by ClickHouse, Jan 2026) — multi-agent conversation views, OSS, self-host friendly.
  • ·LangSmith — best with LangGraph.
  • ·Arize Phoenix — OpenInference / OpenTelemetry-native, deep eval.
  • ·W&B Weave — preserves parent-child agent traces.
  • ·Helicone — gateway-level cost attribution.