Retrieval-augmented generation (RAG)

Retrieval-augmented generation is an AI technique that retrieves relevant documents at query time and feeds them into the language model's context, so the model answers from sources rather than from parametric memory alone.

Why it matters

Without retrieval, LLMs tend to hallucinate confidently when a question falls outside what they learned in training. RAG addresses this by grounding generation in retrieved documents, so the model can quote, cite, and constrain itself to source material. It's the dominant pattern for enterprise AI deployments and the underlying mechanic in tools like Perplexity, NotebookLM, and Pith's wiki engine.

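RAG quality depends on retrieval quality: if the retriever misses the relevant document, the model has nothing to ground its answer in. Embedding-based retrieval (vector search) matches paraphrases but can miss exact terms such as names or codes; keyword retrieval (BM25) matches exact terms but misses synonyms; hybrid approaches combine both at the cost of extra complexity. Recent work emphasises **citation grounding** (each generated claim attributed to a specific source span) over pure retrieval, since users need to verify each claim against its source.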
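
As an illustration of the hybrid approach, here is a minimal sketch of reciprocal rank fusion (RRF), one common way to merge a keyword ranking and a vector ranking into a single list. The source does not say which fusion method any particular tool uses, so treat this as an assumption. The document ids and the two input rankings are hypothetical, and `k=60` is only a conventional default.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one list, best first.

    Each document scores 1 / (k + rank) for every list it appears in,
    so documents ranked well by more than one retriever rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the order is deterministic
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical results from two retrievers for the same query
bm25_ranking = ["doc_a", "doc_b", "doc_c"]    # keyword retrieval
vector_ranking = ["doc_c", "doc_a", "doc_d"]  # embedding retrieval

fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
print(fused)
```

RRF uses only rank positions, not raw scores, which sidesteps the problem that BM25 scores and cosine similarities are on different scales.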

How Pith relates

Pith's wiki and search use RAG: bookmarks are embedded and indexed, retrieved by relevance, and synthesised into wiki pages with citations back to the sources. Every wiki paragraph links to the bookmark that produced it.
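
The embed, retrieve, and cite steps can be sketched roughly as follows. This is not Pith's implementation: `embed` is a toy bag-of-words stand-in for a real embedding model, the bookmark records and `example.com` URLs are hypothetical, and the final call to a language model is left out. The sketch only shows how retrieved sources can be numbered so that generated text can cite them.

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy stand-in for an embedding model: a bag-of-words count vector
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse count vectors
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, bookmarks, top_k=2):
    # Score every bookmark against the query and keep the best matches
    q = embed(query)
    scored = [(cosine(q, embed(b["text"])), b) for b in bookmarks]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [b for _, b in scored[:top_k]]

def build_prompt(query, hits):
    # Number each source so the model can cite it as [1], [2], ...
    sources = "\n".join(
        f"[{i}] {b['url']}: {b['text']}" for i, b in enumerate(hits, start=1)
    )
    return (
        "Answer using only the sources below and cite them by number.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {query}"
    )

# Hypothetical bookmarks
bookmarks = [
    {"url": "https://example.com/bm25",
     "text": "BM25 ranks documents by keyword overlap with the query."},
    {"url": "https://example.com/vectors",
     "text": "Vector search ranks documents by embedding similarity."},
    {"url": "https://example.com/sourdough",
     "text": "Sourdough bread needs a long fermentation."},
]

query = "How does vector search rank documents?"
hits = retrieve(query, bookmarks)
prompt = build_prompt(query, hits)
print(prompt)
```

Because each source carries a number and a URL into the prompt, any `[n]` marker in the generated text can be mapped back to the bookmark it came from.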
