6-agent AI team that plans, builds, tests, and reviews your code

Solo developers constantly switch between planning, coding, testing, and reviewing — each context switch burns time and mental energy. There's no separation of concerns when one person wears every hat.
Without teammates, code goes from brain to production with zero review. Bugs, architectural missteps, and quality regressions slip through because there's nobody to catch them before merge.
Tasks start with rough descriptions and end with "looks good enough". No upfront acceptance criteria, no verifiable checklist, no structured review — just a developer's gut feeling that it's probably done.
Before: Solo dev wearing planner/coder/tester/reviewer hats simultaneously
After: 6 specialized AI agents handle each role in a structured pipeline

Before: Code pushed without any review or quality gate
After: Inspector scores code on 7 dimensions; a circuit breaker stops bad code after 3 rejections

Before: Task descriptions with no acceptance criteria or completion checklist
After: Planner writes a done_when checklist upfront; Ranger verifies every item passes
The pipeline models software development as a 7-column kanban board where 6 specialized AI agents each own a stage. Planner decomposes requirements into implementation plans with explicit done_when checklists. Critic reviews plans for feasibility, scoring clarity, done-when quality, and reversibility. Builder generates code following the approved plan. Shield writes tests covering edge cases the Builder might miss. Inspector performs structured code review across 7 dimensions (quality, error handling, type safety, security, performance, coverage, completion). Ranger runs lint, build, and the full test suite.

A circuit breaker halts the pipeline after 3 consecutive rejections, preventing infinite rework loops. Every agent signs its output with nickname, model, and timestamp — the task card becomes the complete work log. Human approval gates at plan review and code review keep critical decisions under developer control.
BDD Pipeline Flow

Tasks flow through Request, Plan, Plan Review, Implement, Code Review, Test, and Done. The pipeline supports 3 risk levels: L1 Quick skips all reviews for trivial changes, L2 Standard includes code review, and L3 Full activates every gate including plan review and testing. Each level maps to the actual risk profile of the change.
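The three risk levels reduce to a gate-activation table. The gate identifiers below (`plan_review`, `code_review`, `testing`) are assumed names derived from the flow above; only the L1/L2/L3 semantics come from the source.

```python
# Illustrative mapping of risk levels to the review gates they activate.
RISK_GATES = {
    "L1": set(),                                      # Quick: skips all reviews
    "L2": {"code_review"},                            # Standard: code review only
    "L3": {"plan_review", "code_review", "testing"},  # Full: every gate
}

def gate_active(level: str, gate: str) -> bool:
    """True if the given review gate runs for a task at this risk level."""
    return gate in RISK_GATES[level]
```

A trivial typo fix would run at L1 and touch no reviewer, while a schema migration at L3 would pass through every gate.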

Each agent has a fixed nickname and distinct responsibility: Planner decomposes requirements, Critic reviews plans, Builder writes code, Shield writes tests, Inspector scores implementations on 7 dimensions, and Ranger runs the full test suite. Model routing is configurable — Planner and Builder use high-reasoning models (Opus), while review agents use faster models (Sonnet).

Every agent signs its output with a signature header: nickname, model name, and timestamp. The agent_log field records a chronological JSON array of all agent actions. Plan, implementation notes, review comments, and test results are stored as structured data — the task card itself becomes the complete work log with full traceability.
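Signing and logging can be sketched together: prepend a header, append a structured log entry. The nickname/model/timestamp fields match the description; the exact header format and the `sign_output` helper are illustrative assumptions.

```python
from datetime import datetime, timezone

def sign_output(nickname: str, model: str, body: str, agent_log: list) -> str:
    """Prepend a signature header to an agent's output and record a
    chronological entry in the task card's agent_log array."""
    ts = datetime.now(timezone.utc).isoformat(timespec="seconds")
    agent_log.append({"agent": nickname, "model": model, "timestamp": ts})
    return f"-- {nickname} ({model}) @ {ts}\n{body}"
```

Replaying `agent_log` then reconstructs who did what, with which model, and when, directly from the task card.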

A built-in circuit breaker monitors rejection counts at every review stage. After 3 consecutive rejections, the pipeline halts and escalates to the developer — preventing infinite rework loops and wasted computation. Human approval gates at plan review and code review ensure critical decisions stay under developer control, even in auto mode.
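The circuit breaker logic above amounts to a per-stage rejection counter with a halt threshold. A minimal sketch, assuming approvals reset the count and the escalation itself happens outside this class:

```python
class CircuitBreaker:
    """Halt the pipeline after N consecutive rejections at any review stage."""

    def __init__(self, limit: int = 3):
        self.limit = limit
        self.rejections: dict[str, int] = {}

    def record(self, stage: str, approved: bool) -> bool:
        """Record a review outcome; return True if the pipeline should halt
        and escalate to the developer."""
        if approved:
            self.rejections[stage] = 0  # consecutive count resets on approval
            return False
        self.rejections[stage] = self.rejections.get(stage, 0) + 1
        return self.rejections[stage] >= self.limit
```

Counting only consecutive rejections means a task that eventually passes review never trips the breaker, while one stuck in a Builder/Inspector loop stops after three rounds instead of burning tokens indefinitely.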
Planner, Critic, Builder, Shield, Inspector, Ranger
Tasks orchestrated across 10+ projects in production
3 risk levels with configurable review gates
Auto-stop after 3 consecutive rejections at any stage