Quick Intro~7 MIN· AND

AI-native software development

Full Study

A scannable trailer of the 10-lesson course. Read top to bottom — no clicks needed.

INTROBLOCK · 01
AND · 7 MIN PREVIEW

Coding agents stopped being a demo. They ship your PRs now.

SWE-bench Verified top scores crossed 87.6% in April 2026 (Claude Opus 4.7 Adaptive). Stripe is merging 1,000+ AI PRs per week. Shopify reports a measured 20% productivity lift across engineering. The teams pulling away aren't the ones using more agents — they're the ones with the right loop, the right rails, and a $-per-merged-PR dashboard. This trailer shows the gap.

CONCEPTBLOCK · 02

The one-line difference

Autocomplete completes a token. An agent runs a loop: read → plan → edit → verify, and won't stop until the tests are green. The 'AI-native developer' isn't someone who types prompts faster — it's someone who has built rails (specs, hooks, MCP servers, eval gates) so the loop converges without hand-holding. If your only agent input is a Slack-style chat, you don't have an AI-native workflow. You have a chatbot pretending to be a teammate.
TIPAlways test the loop end-to-end on a real ticket before optimising any single stage. Most teams over-engineer the prompt and under-engineer the verify step.
WATCH OUTAuto-merge without a verify gate is the #1 cause of regression cascades. SWE-bench-Verified accuracy is *task* accuracy — your repo accuracy is whatever your verify step says it is.
GOTCHAAn agent with bash + write access and no recursion cap will rm -rf or burn $50 in a single run. The first rail you write is permission scope, not the prompt.
DIAGRAMBLOCK · 03

The agentic dev loop

failpassTICKETREADPLANEDITVERIFYMERGE
Read is the cheapest stage and where most teams under-invest. Verify is the expensive one and where the trust comes from. Loop until verify passes, then merge — never merge to chase the loop.
CODEBLOCK · 04

A 12-line agentic loop you can run today

BASH
1# Pick any agent CLI: claude, cursor-agent, aider, codex, cline.
2# This snippet works with claude-code; substitute as needed.
3
4set -euo pipefail
5TICKET="$1" # e.g. LIN-1287
6
7# 1. READ — let the agent gather context, no edits allowed yet.
8claude --permission-mode plan -p "Read the repo and summarise files relevant to $TICKET" > plan.md
9
10# 2. PLAN — produce a checklist before any edits.
11claude --permission-mode plan -p "Given plan.md, write a 5-step checklist to close $TICKET" > checklist.md
12
13# 3. EDIT — implement strictly against the checklist.
14claude --permission-mode acceptEdits -p "Implement checklist.md. Stop after the last step."
15
16# 4. VERIFY — run the project's test suite and lint.
17npm test && npm run lint
18
19# 5. If verify failed, re-enter the loop with the failure as the new ticket.
Line 8 / 11: read and plan use plan permission mode — no edits, no shell. Line 14: only EDIT uses acceptEdits. Line 17: verify is your own test suite. The loop is your contract — not the agent's promise.
CHEATSHEETBLOCK · 05

The 5 rules every 2026 AI-native shipper knows

01Read → plan → edit → verify. Skip a stage and you'll pay for it in regressions.
02Project rules live in AGENTS.md / CLAUDE.md. The prompt is the LAST place to put them.
03Tests are the agent's contract. No tests → no agent loop. (TDD gets WAY easier with agents.)
04Hooks > prompts for anything you want enforced every time (lint, type, secrets).
05Track $-per-merged-PR weekly. Below $5 is healthy; above $20 is an architectural problem.
MINIGAME · RAPIDFIRETFBLOCK · 06

Quick check — true or false?

Top SWE-bench Verified scores in April 2026 are above 85%.
CLAIM 1/5 · READY · scroll into view
CONCEPTBLOCK · 07

What you'll ship in the full study

Ten lessons. Six docker projects. By the end you'll have: — An `agent-dev-shell` repo with claude-code, cursor-agent, aider, codex, and cline pre-wired against the same fix-the-bug task — so YOU pick the right agent for the job by measuring. — A `tdd-agent-loop` container that ships failing-test → green-test cycles end-to-end, ready to drop into your team's monorepo. — A custom MCP server (`mcp-postgres-pair`) that hands your agent typed read-only Postgres access without leaking the connection string. — A `pr-review-bot` GitHub Actions pipeline: PR opens → agent reviews → agent fixes → tests pass → human approves. — An `agent-cost-dashboard` (Helicone + Slack) that surfaces $-per-merged-PR by team — the metric that ends 'should we use Cursor or Claude Code' debates. Every docker project is meant to be lifted into your real work — not a demo.
INCLUDEDEach project ships with composeYaml, expectedStdout, and a 'lift to work' note explaining how to drop it into your team's repo on day one.
LESSON COMPLETEBLOCK · 08

That's the trailer.

NEXTLesson 1 · The agentic dev loop
WHAT YOU'LL WALK AWAY WITH

Real skills, real career delta.

Skills you'll gain

10
  • Diagnose when to leave autocomplete for an agent loopWorking

    Use the 3-signal test (ticket length, cross-file edits, test surface) to pick autocomplete vs agent before opening the IDE — and quote the cost difference.

  • Author project rules (CLAUDE.md / AGENTS.md / cursor.rules)Production

    Write rules that survive 8-hour sessions: build commands, conventions, DO/DO-NOT lists, gotchas, hook config — distilled from your bug history, not copied from a template.

  • Run plan-then-edit reliably across agent CLIsWorking

    Use plan mode in Claude Code, plan-as-edit in Cursor, plan-files in Aider, PRD generation in Devin — recognise when to skip plan and when it's mandatory.

  • Drive a TDD-with-agent loop that convergesProduction

    Red-green-refactor in agent loop: failing test as input, agent edits until green, agent stops at green. Includes the 'one bug = one regression test' commit rule.

  • Dispatch parallel sub-agents for read-only researchProduction

    Spawn N independent Explore agents on non-overlapping queries; aggregate results back to a primary agent. Cost math: 4× tokens parallel often beats 4× sequential time at the team level.

  • Wire MCP servers (Linear, Sentry, Postgres, custom) into agentsWorking

    Install standard MCP servers, write a custom one in <30 lines with FastMCP, scope read-only vs read-write, pair it with a permission boundary.

  • Configure agent hooks for lint / type / secrets enforcementProduction

    Pre-tool-use, post-edit, post-stop hooks that catch unwanted edits BEFORE they ship. Hooks beat prompts because they don't depend on the model remembering.

  • Ship PR-review automation with auto-fix loopsProduction

    CodeRabbit / Greptile / self-hosted reviewer + agent that consumes the review, fixes, re-runs tests. The 'no human-in-loop until tests pass' pattern.

  • Track $-per-merged-PR and route work by itProduction

    Helicone / OpenAI dashboard / Anthropic Console + Slack digest of weekly $/PR by team. Use it to settle 'which agent / which model / which mode' debates with data.

  • Operate AI-native dev under SOC 2 / GDPR / regulated constraintsAdvanced

    Permission models, audit logs, secrets scanning preflight, sandboxed exec, allow-listed shell — the rollout shape regulated industries actually buy.

Career & income delta

Career moves
  • Title yourself credibly as 'AI-native engineer' or 'developer-experience engineer' — the 2026 hiring channel for senior IC roles at $200-380K base.
  • Lead the 'developer productivity / DX' team that's getting stood up at every series-B+ company — the AI-tooling org that didn't exist 18 months ago.
  • Pick up contracting / fractional CTO work at $200-400/hr — 'we have Cursor seats but the team isn't shipping faster' is the most common 2026 inquiry.
  • Own the 'we ship 5× faster than competitors' narrative on your perf review — backed by a $-per-merged-PR dashboard you actually built.
Income impact
  • $20-40K bump for senior ICs adding measurable agent-driven shipping velocity to their resume.
  • $50-100K bump moving from a generic backend role to a developer-experience or AI-platform team.
  • Freelance / consulting rates: $200-400/hr — the most common 2026 inquiry is 'help us roll out coding agents without breaking prod'.
  • Enterprise demos / sales-engineering: closing one 6-figure DX-tooling deal per quarter often requires the cost-dashboard + rollout ADR shape in this course.
Market resilience
  • Coding agents are now table stakes — competence here is the new 'git'. Durable across foundation-model market shifts.
  • MCP and A2A are now Linux Foundation standards; protocol fluency carries over every time the model du jour changes.
  • TDD-with-agents discipline transfers to whatever framework launches next — the loop shape is the durable part.
  • Cost / quality observability beats vibes-based purchasing decisions — and that skill is provider-agnostic by design.
  • Permission-model + audit-trail expertise (SOC 2 / GDPR / regulated industries) stays in demand regardless of which agent vendor wins the year.