Module 4 · Phase 2: Knowledge & state · Weeks 9–11

Memory & Context Engineering

"How would you design agent memory?" is now a standard senior interview question. This module gives you a real implementation to talk about: context-window budgeting, compaction, a persistent memory store with disciplined write and read paths, contradiction resolution — and defenses against memory injection, where a prompt attack becomes a persistent compromise.

After this module you can

▸Treat the context window as a budgeted resource with an explicit allocation policy per call
▸Implement compaction that summarizes old turns without breaking tool-call pairing or losing task state
▸Explain the memory taxonomy — working, episodic, semantic, procedural — and map each to storage + recall
▸Build a write path: extract candidate facts, deduplicate, detect contradictions, store with provenance
▸Build a read path scoring relevance + recency + importance, injecting sparingly as delimited untrusted data
▸Describe a concrete memory-injection attack and implement layered defenses your own red-team test can't beat

Lessons

The Context Window Is a Budget

Context engineering is deciding what's in the window on each call: system prompt, recalled memories, summarized history, recent turns, tool results. Big windows made the problem subtler, not smaller — you're writing an allocator, not stuffing a suitcase.

Compaction: Summarizing Without Losing the Plot

Long sessions overflow any window. Truncation forgets; compaction summarizes the oldest turns into a dense digest while recent turns stay verbatim. The craft is in what must survive untouched — and in never splitting a tool_use from its tool_result.

The Memory Taxonomy & Persistent Stores

Compaction manages one session; the moment the process exits, everything is gone. Persistent memory means deciding what to keep across sessions — and the interview-standard taxonomy (working, episodic, semantic, procedural) tells you what to store where and how to get it back.

The Write Path & the Read Path

Between 'candidate fact' and 'stored fact' sits a gauntlet: dedupe, contradiction check, provenance gate. Between 'stored fact' and 'in the prompt' sits another: relevance + recency + importance scoring, with a stingy top-k. Both gauntlets exist because recalled junk is context poisoning.

Memory Injection & Context Poisoning Defenses

Prompt injection in a stateless agent is a one-shot problem — the session ends, the attack dies. Give the agent memory and injection becomes persistent: a poisoned 'fact' recalled into every future session is a standing backdoor. This lesson is why your write path is a security boundary.

12 questions · pass ≥ 80%

Lab: Persistent Memory for Your Lab 02 Agent

Give the Lab 02 agent long-term memory: a persistent store with provenance, a disciplined write path (extraction, dedupe, contradiction resolution), a scored read path injecting fenced memories at session start, threshold-triggered compaction — and a self-written red-team test proving the write path resists memory injection. Gate G2 ends with Claude attempting a novel injection against your defenses.

Best external resources

Curated reading, docs, and tools that pair with this module.

MemGPT paper (Letta)

Hierarchical memory — the design everyone cites in interviews.

Anthropic — Effective context engineering for AI agents

How production agents budget and structure their windows. The core reading for this module.

12-Factor Agents — Factor 3: Own your context window

The production-engineer's case for treating context as code you control.

Simon Willison — prompt injection series

The foundation for understanding memory injection. Required.