Module quiz

12 questions · pass bar 80% · retry as often as you like — your best score counts toward the gate.

1. Why must you resend the full message history on every API call?
2. In one tool-call round trip with the Anthropic API, what is the correct message sequence after the model returns stop_reason: "tool_use"?
3. A 10-turn conversation averages 500 tokens per turn. Roughly how many input tokens does the API call at turn 10 consume, and why?
4. What is the difference between tool calling and structured output, and when should you use a JSON-schema output instead of a tool?
5. The model returns malformed JSON for a tool call. What's the right first mitigation?
6. What do temperature and top_p do, and what suits a tool-calling agent?
7. How do API rate limits typically work, and what's a correct backoff implementation?
8. Which error should you NEVER automatically retry?
9. What is prompt caching and when does it cut agent costs dramatically?
10. Why should tool descriptions be written as carefully as prompts?
11. What happens if you send a tool_result whose tool_use_id doesn't match a tool call from the immediately preceding assistant message?
12. Your streaming agent shows nothing for 8 seconds, then dumps the full answer. What's the most likely bug?