Capstone: The Autonomous Coding Agent
The portfolio anchor. An autonomous software-development agent that takes a GitHub issue, explores the codebase, implements a fix in a sandbox, runs the tests, and opens a PR gated on human approval — shipped with eval results, a cost analysis, and an honest limitations doc. Everything from Modules 1–7 converges here, then you turn it into interview narratives.
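As a rough orientation, the issue-to-PR flow described above can be sketched as a small checkpointed pipeline. This is a minimal sketch under stated assumptions, not the course's reference design: `RunState`, `run_pipeline`, and the stage names are hypothetical, and the stages, test runner, and approval prompt are passed in as plain callables so that no real GitHub or sandbox API is implied.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class RunState:
    """Hypothetical record of one agent run; field names are illustrative."""
    issue: str
    checkpoints: list[str] = field(default_factory=list)
    tests_passed: bool = False
    pr_opened: bool = False


Stage = Callable[[RunState], None]


def run_pipeline(
    issue: str,
    stages: list[tuple[str, Stage]],
    run_tests: Callable[[RunState], bool],
    approve: Callable[[RunState], bool],
) -> RunState:
    """Run each stage in order, recording a checkpoint after each one.

    A PR is marked as opened only if the tests pass AND the human approves.
    """
    state = RunState(issue=issue)
    for name, stage in stages:
        stage(state)
        state.checkpoints.append(name)  # checkpoint: a resumable marker

    state.tests_passed = run_tests(state)
    state.checkpoints.append("test")

    # Human-in-the-loop gate: approval is never requested for a red run.
    if state.tests_passed and approve(state):
        state.pr_opened = True
        state.checkpoints.append("open_pr")
    return state


# Usage with stand-in stages (a real agent would explore and edit here).
stages = [("explore", lambda s: None), ("implement", lambda s: None)]
declined = run_pipeline("Fix off-by-one in pager", stages,
                        run_tests=lambda s: True, approve=lambda s: False)
print(declined.pr_opened, declined.checkpoints)
```

Keeping the approval callable separate from the stages means the gate cannot be skipped by a stage that misbehaves: the only path to `pr_opened` runs through both the test result and the human decision.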
- Architect an issue-to-PR coding agent with checkpointed plans and a sandboxed workspace
- Implement codebase exploration that locates relevant files without stuffing the whole repo into context
- Choose and apply an edit strategy (search/replace vs. full-file) and defend the trade-off
- Run a test-driven repair loop with bounded retries: reproduce (red) → fix → verify (green)
- Gate PR creation behind human-in-the-loop approval showing diff, test results, and cost
- Assemble a small SWE-bench-style local eval set and report success rate, a partial-success taxonomy, and cost/time per issue
- Write a frank limitations doc that scopes the agent honestly
- Turn the capstone into system-design answers and STAR behavioral stories at a senior bar
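Two of the objectives above, the search/replace edit strategy and the bounded red → fix → verify loop, can be illustrated together. The following is a minimal sketch under assumptions: `apply_search_replace`, `repair_loop`, and the demo helpers are hypothetical names, the "model" is a hard-coded function, and the test is run with `exec` purely for demonstration, where a real agent would run the project's test suite inside the sandbox.

```python
from typing import Callable, Optional


def apply_search_replace(source: str, search: str, replace: str) -> str:
    """Apply one search/replace edit, requiring exactly one match.

    Rejecting zero or multiple matches is what makes this strategy safer
    than a blind replace: an ambiguous edit fails loudly.
    """
    count = source.count(search)
    if count != 1:
        raise ValueError(f"search block matched {count} times; expected 1")
    return source.replace(search, replace, 1)


def repair_loop(
    source: str,
    passes: Callable[[str], bool],
    propose_edit: Callable[[str, int], tuple[str, str]],
    max_attempts: int = 3,
) -> tuple[Optional[str], int]:
    """Reproduce (red), then try up to max_attempts edits until green.

    Returns (fixed_source, attempts_used), or (None, max_attempts) on failure.
    """
    if passes(source):
        raise RuntimeError("bug not reproduced: test is already green")
    for attempt in range(1, max_attempts + 1):
        search, replace = propose_edit(source, attempt)
        try:
            candidate = apply_search_replace(source, search, replace)
        except ValueError:
            continue  # malformed edit still consumes one attempt
        if passes(candidate):
            return candidate, attempt
    return None, max_attempts


# --- Demo with stand-ins (assumptions, not a real model or test runner) ---
BUGGY = "def add(a, b):\n    return a - b\n"


def add_passes(source: str) -> bool:
    namespace: dict = {}
    exec(source, namespace)  # demo only; use the sandbox in a real agent
    return namespace["add"](2, 3) == 5


def fake_model(source: str, attempt: int) -> tuple[str, str]:
    # First proposal does not match the file; the second one does.
    if attempt == 1:
        return "return a * b", "return a + b"
    return "return a - b", "return a + b"


fixed, attempts = repair_loop(BUGGY, add_passes, fake_model)
print(attempts)
print(fixed)
```

The retry bound matters as much as the edit format: without it, a model that keeps proposing non-matching edits would burn cost indefinitely, which is exactly the kind of failure the cost-per-issue metric is meant to expose.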
Lessons
Best external resources
Curated reading, docs, and tools that pair with this module.
Repo: Real interview processes and take-home assignments from 50+ companies.
Guide: Cross-check your quiz mastery against an external bank.
Benchmark: The benchmark your capstone is a miniature of. Mine it for eval-design ideas.
Repo: 65% on SWE-bench Verified in ~100 lines of Python. Read every line before building your capstone; it's the existence proof that simple works.
Repo: The research-grade version, with agent-computer interface design and trajectories to study.