Module 1: LLM API Mastery · Lesson 4 of 5 · 20 min
Structured Outputs & JSON Schema
When you need data, not prose: forcing model output to conform to a schema, and what to do when it doesn't.
A large share of real-world LLM use isn't chat; it's extraction and classification: pull fields from an email, route a ticket, score a document. Downstream code needs types, not vibes. There are three levels of rigor for getting JSON out of a model.
| Approach | How | Guarantee |
|---|---|---|
| Prompt & pray | "Respond only with JSON…" | None. Fine for prototypes only. |
| JSON mode | response_format: {type: "json_object"} (OpenAI) | Syntactically valid JSON — but any shape. |
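| Schema-enforced | OpenAI structured outputs (json_schema, strict: true); or a forced tool call whose input_schema is your output schema | Strict mode: conforms to your schema via constrained decoding. Forced tool call without a strict option: shaped by the schema and usually conformant, but not a hard guarantee. |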
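The two OpenAI rows of the table differ only in the `response_format` payload. The sketch below builds both payloads as plain dicts and sends no request. The helper name `strict_response_format` and the constant names are illustrative assumptions; the two checks reflect OpenAI's documented strict-mode requirements that an object schema sets `additionalProperties` to false and lists every property in `required`. Only the top level of the schema is checked here, and the severity range is left to the validator because keyword support in strict mode is worth confirming against the provider's current documentation.

```python
# JSON mode: the model must emit syntactically valid JSON, in any shape.
JSON_MODE = {"type": "json_object"}

# The ticket schema from this lesson, adapted for strict mode.
TICKET_SCHEMA = {
    "type": "object",
    "properties": {
        "category": {
            "type": "string",
            "enum": ["billing", "bug", "feature_request", "other"],
        },
        "severity": {"type": "integer"},  # range is checked by the validator
        "summary": {"type": "string"},
    },
    "required": ["category", "severity", "summary"],
    "additionalProperties": False,
}


def strict_response_format(name: str, schema: dict) -> dict:
    """Build a strict json_schema response_format (illustrative helper).

    Checks only the top-level object against two strict-mode requirements.
    """
    if schema.get("additionalProperties") is not False:
        raise ValueError("strict mode needs additionalProperties: false")
    missing = set(schema.get("properties", {})) - set(schema.get("required", []))
    if missing:
        raise ValueError(f"strict mode needs every property required: {sorted(missing)}")
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "strict": True, "schema": schema},
    }


STRICT_MODE = strict_response_format("ticket", TICKET_SCHEMA)
```

In the Chat Completions API, either dict is passed as the `response_format` argument of the request.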
# Define your desired OUTPUT as a tool schema, then force the model to "call" it.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
ticket_text = "I was charged twice for my subscription this month."  # example input

resp = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    tools=[{
        "name": "record_ticket",
        "description": "Record the classified support ticket.",
        "input_schema": {
            "type": "object",
            "properties": {
                "category": {"type": "string",
                             "enum": ["billing", "bug", "feature_request", "other"]},
                "severity": {"type": "integer", "minimum": 1, "maximum": 5},
                "summary": {"type": "string"},
            },
            "required": ["category", "severity", "summary"],
        },
    }],
    tool_choice={"type": "tool", "name": "record_ticket"},  # MUST call it
    messages=[{"role": "user", "content": f"Classify this ticket: {ticket_text}"}],
)
# The forced call arrives as a tool_use content block; its .input is the payload.
data = next(b for b in resp.content if b.type == "tool_use").input
# data is a dict that should match the schema; there is no prose to parse

tool_choice forces the call, so the 'tool' is really just an output mold. This is the standard Anthropic pattern for structured extraction. On OpenAI, prefer native structured outputs with strict: true, which constrains decoding to the schema.

Validate anyway, and repair
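Output that parses as JSON can still be the wrong shape, which is why validation stays in the loop. The following sketch uses only the standard library; `shape_errors` and `ALLOWED_CATEGORIES` are illustrative names, not part of any library, and the checks mirror the ticket schema used in this lesson.

```python
import json

ALLOWED_CATEGORIES = {"billing", "bug", "feature_request", "other"}


def shape_errors(obj: object) -> list[str]:
    """Return a list of shape problems; an empty list means the ticket is acceptable."""
    if not isinstance(obj, dict):
        return ["top-level value is not an object"]
    errors = []
    category = obj.get("category")
    if not (isinstance(category, str) and category in ALLOWED_CATEGORIES):
        errors.append(f"category must be one of {sorted(ALLOWED_CATEGORIES)}")
    severity = obj.get("severity")
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if not isinstance(severity, int) or isinstance(severity, bool) or not 1 <= severity <= 5:
        errors.append("severity must be an integer from 1 to 5")
    if not isinstance(obj.get("summary"), str):
        errors.append("summary must be a string")
    return errors


# Both strings are valid JSON; only the second has the required shape.
bad = json.loads('{"category": "Billing", "severity": "high", "note": "double charge"}')
good = json.loads('{"category": "billing", "severity": 3, "summary": "Charged twice."}')

bad_problems = shape_errors(bad)    # wrong enum case, non-integer severity, no summary
good_problems = shape_errors(good)  # no problems
```

Hand-written checks like this grow quickly, so a validation library such as Pydantic is the more practical choice; it also produces error text that can be sent back to the model.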
from typing import Literal

from pydantic import BaseModel, Field, ValidationError


class Ticket(BaseModel):
    # Mirror the tool schema: same enum values, same 1-5 range.
    category: Literal["billing", "bug", "feature_request", "other"]
    severity: int = Field(ge=1, le=5)
    summary: str


def call_model(prompt: str) -> dict:
    """Placeholder for your API call; it should return the model's output as a dict."""
    raise NotImplementedError


def extract(text: str, max_retries: int = 2) -> Ticket:
    prompt = f"Classify this ticket: {text}"
    for _ in range(max_retries + 1):
        raw = call_model(prompt)  # your API call
        try:
            return Ticket.model_validate(raw)
        except ValidationError as e:
            # Feed the error BACK to the model; it usually self-corrects.
            prompt = (f"Classify this ticket: {text}\n"
                      f"Your previous output failed validation:\n{e}\n"
                      f"Return corrected JSON only.")
    raise RuntimeError("extraction failed after retries")

Order of mitigations for malformed output: (1) validate and retry with the error message included, which is the cheapest fix and works most of the time; (2) tighten the schema/prompt (enums, strict mode, lower temperature); (3) fall back to a stronger model or a deterministic parser. Never silently json.loads and hope.

✦ Tip
Tool calling vs. structured output — when to use which? Tool = the model needs information or effects mid-task (search, DB query), possibly several times. Structured output = you need the final answer in a shape (classification, extraction). If there's no action to perform, don't dress extraction up as an agent loop — force one schema'd output and be done.
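The order of mitigations given earlier (retry with the validation error, then escalate to a stronger model) can be sketched as a small driver. Everything here is hypothetical scaffolding rather than any provider's API: `ModelFn` stands in for a function that sends a prompt and returns the parsed output, `Validator` returns an error message or `None`, and `extract_with_fallback` is an illustrative name.

```python
from typing import Callable, Optional, Sequence

ModelFn = Callable[[str], dict]              # prompt -> parsed model output (hypothetical)
Validator = Callable[[dict], Optional[str]]  # output -> error message, or None if valid


def extract_with_fallback(
    text: str,
    models: Sequence[ModelFn],
    validate: Validator,
    retries_per_model: int = 2,
) -> dict:
    """Try each model in order, retrying with the validation error fed back."""
    base_prompt = f"Classify this ticket: {text}"
    for call_model in models:  # order the list cheapest first
        prompt = base_prompt   # each model starts from a clean prompt
        for _ in range(retries_per_model + 1):
            output = call_model(prompt)
            error = validate(output)
            if error is None:
                return output
            # Step 1: retry the same model with the error included.
            prompt = (f"{base_prompt}\n"
                      f"Your previous output failed validation:\n{error}\n"
                      "Return corrected JSON only.")
        # Step 3: retries exhausted, so fall through to the next (stronger) model.
    raise RuntimeError("extraction failed on every model")
```

Keeping the validator as a callback means the same driver works whether the check is a Pydantic model, a JSON Schema validator, or a hand-written function.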
Key takeaways
- JSON mode guarantees syntax; schema enforcement (strict structured outputs / forced tool call) guarantees shape.
- Constrain aggressively: enums, min/max, required. Every constraint removes a failure mode.
- Validate with Pydantic/Zod even when 'guaranteed'; retry with the validation error fed back.
- Extraction ≠ agent. No action needed → one forced structured call.