TL;DR: the 30-second version
Vector-only RAG returns flat, context-poor chunks ranked purely by similarity. Modeling entities and relationships in a graph lets retrieval follow how and why things connect, which grounds LLM answers in structure, not just nearness. This post sets up the series.
Retrieval-augmented generation usually starts the same way: chunk your documents, embed the chunks, and at query time pull back the top-k nearest neighbours. It works, until it doesn’t.
Where similarity runs out
Embeddings are great at “find me text that looks like this.” They are poor at “find me the thing that caused this,” or “walk from this entity to everything it depends on.” Those are relationship questions, and relationships are exactly what a flat vector index throws away.
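To make that gap concrete, here is a minimal sketch of top-k retrieval over a toy in-memory index. Everything in it is an assumption for illustration: the `Chunk` type, the `cosine` and `topK` helpers, and the three-dimensional vectors, which stand in for real embeddings from a model.

```typescript
// Toy in-memory index. The 3-d vectors are made-up stand-ins for real embeddings.
type Chunk = { id: string; text: string; embedding: number[] };

// Cosine similarity between two vectors of equal length.
function cosine(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank every chunk by similarity to the query vector and keep the best k.
function topK(queryEmbedding: number[], chunks: Chunk[], k: number): Chunk[] {
  return [...chunks]
    .sort((x, y) => cosine(queryEmbedding, y.embedding) - cosine(queryEmbedding, x.embedding))
    .slice(0, k);
}

const chunks: Chunk[] = [
  { id: "c1", text: "The checkout service timed out.", embedding: [0.9, 0.1, 0.0] },
  { id: "c2", text: "Checkout latency spiked at noon.", embedding: [0.8, 0.2, 0.1] },
  { id: "c3", text: "The payments database ran out of connections.", embedding: [0.1, 0.9, 0.2] },
];

// Pretend [1, 0, 0] is the embedding of "why did checkout fail?"
const hits = topK([1, 0, 0], chunks, 2);
console.log(hits.map((c) => c.id)); // ["c1", "c2"]
```

In this toy data the two chunks that resemble the question win, and `c3`, the one that might explain the failure, is dropped because nothing in the index records that checkout depends on the payments database.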
What a graph adds
A knowledge graph stores entities as nodes and relationships as typed edges. Retrieval becomes a traversal: start from the entities mentioned in the query, then expand along the edge types relevant to the question.
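// Pseudo-retrieval: seed from semantic match, expand by structure.
// vectorSearch, graph.expand, and rank are placeholders, not a real library API.
async function retrieve(query) {
  const seeds = await vectorSearch(query, { k: 5 });
  // Follow only the listed relationship types, at most 2 hops out from the seeds
  const subgraph = await graph.expand(seeds, {
    hops: 2,
    edgeTypes: ["DEPENDS_ON", "CAUSED_BY", "PART_OF"],
  });
  return rank(subgraph, query); // hybrid: semantic + structural
}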
The result is context the model can actually reason over: not ten loose snippets, but a connected neighbourhood with explicit relationships.
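For a feel of what the expansion step might do, here is a small runnable sketch over an in-memory edge list. It is one possible reading of "expand by structure" (a breadth-first walk limited by hop count and edge type), not how any particular graph database implements it. The `Edge` type, the `expand` function, and the service names are all hypothetical.

```typescript
// Hypothetical in-memory graph; a real system would query a graph database instead.
type Edge = { from: string; type: string; to: string };

const edges: Edge[] = [
  { from: "checkout-service", type: "DEPENDS_ON", to: "payments-db" },
  { from: "payments-db", type: "PART_OF", to: "billing-platform" },
  { from: "checkout-service", type: "OWNED_BY", to: "team-storefront" },
  { from: "billing-platform", type: "DEPENDS_ON", to: "auth-service" },
];

// Breadth-first expansion from the seed nodes. Follows outgoing edges only,
// skips edge types that are not allowed, and stops after `hops` steps.
// Returns the edges that were traversed, in the order they were found.
function expand(seeds: string[], opts: { hops: number; edgeTypes: string[] }): Edge[] {
  const allowed = new Set(opts.edgeTypes);
  const visited = new Set(seeds);
  const traversed: Edge[] = [];
  let frontier = [...seeds];

  for (let hop = 0; hop < opts.hops && frontier.length > 0; hop++) {
    const next: string[] = [];
    for (const node of frontier) {
      for (const edge of edges) {
        if (edge.from !== node || !allowed.has(edge.type)) continue;
        traversed.push(edge);
        if (!visited.has(edge.to)) {
          visited.add(edge.to);
          next.push(edge.to);
        }
      }
    }
    frontier = next;
  }
  return traversed;
}

const subgraph = expand(["checkout-service"], {
  hops: 2,
  edgeTypes: ["DEPENDS_ON", "CAUSED_BY", "PART_OF"],
});
for (const e of subgraph) console.log(`${e.from} -[${e.type}]-> ${e.to}`);
```

With two hops the walk reaches `payments-db` and then `billing-platform`, ignores the `OWNED_BY` edge because its type is not in the list, and stops before `auth-service`, which is three hops away. Each returned edge carries its type, so the relationship can be passed to the model alongside the text.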
What’s next in this series
We’ll build this up across the series — modeling entities in Neo4j, adding episodic time-aware memory with Graphiti, and tuning hybrid retrieval so it stays fast and cost-aware.