1. Why is chunking called the highest-leverage decision in a RAG pipeline?
2. Which two query types does dense (embedding) retrieval handle poorly while BM25 handles them well?
3. How does Reciprocal Rank Fusion (RRF) combine BM25 and dense results, and why ranks instead of scores?
4. Why are bi-encoders used for first-stage retrieval but cross-encoders for reranking?
5. Define precision@5, recall@5, and MRR.
6. Your recall@5 is high but final answers are frequently wrong. Where do you look first?
7. What's the difference between faithfulness and answer relevance, and how does LLM-as-judge measure faithfulness?
8. How do you build a retrieval eval set cheaply but credibly?
9. What is query decomposition, and when is single-shot RAG structurally unable to answer?
10. Fixed single-shot RAG pipeline vs. agentic retrieval-as-a-tool: what are the trade-offs?
11. Your RAG system confidently answers questions the corpus can't support. Name three layered mitigations.
12. How does HyDE improve retrieval, and what does it actually embed?