RAG Engineering Mastery · 3 / 10

Embeddings & Vector Stores 101

An embedding turns meaning into geometry. A vector store makes that geometry searchable in milliseconds. Get both right and retrieval gets easy.

Published May 7, 2026 · 1 min read · Haythem Rehouma · Claude Mastery

An embedding maps text to a point in high-dimensional space where closeness means similar meaning. Retrieval is then just "find the nearest points to this question." Everything else is plumbing.
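To make "find the nearest points" concrete, here is a minimal sketch using cosine similarity, one common way to measure closeness. The three-dimensional vectors and the document ids are hand-made for illustration; a real embedding model would produce hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, docs, k=2):
    """Return the ids of the k documents most similar to the query, best first."""
    ranked = sorted(
        docs,
        key=lambda doc_id: cosine_similarity(query, docs[doc_id]),
        reverse=True,
    )
    return ranked[:k]


# Toy "embeddings": hand-made values, not output from a real model.
docs = {
    "refund-policy": [0.9, 0.1, 0.0],
    "shipping-times": [0.1, 0.9, 0.1],
    "api-rate-limits": [0.0, 0.1, 0.9],
}

# Pretend this vector is the embedded question "how do I get my money back?"
query = [0.8, 0.2, 0.1]

top = nearest(query, docs, k=2)
print(top)  # → ['refund-policy', 'shipping-times']
```

The query shares no words with "refund-policy" here; it ranks first only because the two vectors point in nearly the same direction.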

Choosing a model

Quality vs. cost — bigger models embed nuance better but cost more per token and per query.
Dimensions — more dimensions can capture more, but cost storage and search time. Many production systems sit at 768–1536.
Consistency — embed your documents and your queries with the same model. Mixing models scrambles the geometry.
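The storage side of the dimensions trade-off is simple arithmetic. The sketch below assumes each value is stored as a 4-byte float32 and counts only the raw vectors, ignoring index and row overhead, so treat the numbers as a lower bound.

```python
BYTES_PER_FLOAT32 = 4  # assumption: vectors are stored as 32-bit floats


def raw_vector_bytes(num_vectors, dimensions, bytes_per_value=BYTES_PER_FLOAT32):
    """Bytes needed for the vectors alone (no index, no metadata)."""
    return num_vectors * dimensions * bytes_per_value


for dims in (768, 1536):
    gib = raw_vector_bytes(10_000_000, dims) / 2**30
    print(f"{dims} dims x 10M vectors = {gib:.1f} GiB")
# → 768 dims x 10M vectors = 28.6 GiB
# → 1536 dims x 10M vectors = 57.2 GiB
```

Doubling the dimension doubles the footprint, and every distance computation touches twice as many numbers.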

Where to store them

pgvector (Postgres) — if you already run Postgres, start here. One database, transactional, filterable by metadata with plain SQL.
Dedicated vector DBs — reach for them at large scale or when you need specialized index features. Don't start here for a first product.
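As a rough sketch of what "filterable by metadata with plain SQL" can look like with pgvector: the table name chunks, its columns, and the 768 dimension are hypothetical, and the search helper assumes a psycopg-style cursor that uses %s placeholders. The vector type and the <=> cosine-distance operator come from pgvector itself.

```python
# Schema for a hypothetical "chunks" table; names are illustrative only.
SCHEMA_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS chunks (
    id        bigserial PRIMARY KEY,
    source    text NOT NULL,
    content   text NOT NULL,
    embedding vector(768)  -- must match the embedding model's dimension
);
"""

# <=> is pgvector's cosine-distance operator: smaller means closer.
# The WHERE clause is an ordinary metadata filter.
SEARCH_SQL = """
SELECT id, content
FROM chunks
WHERE source = %s
ORDER BY embedding <=> %s::vector
LIMIT %s;
"""


def to_vector_literal(values):
    """Format a sequence of numbers as pgvector's text form, e.g. '[1.0,2.5]'."""
    return "[" + ",".join(str(float(v)) for v in values) + "]"


def search(cursor, query_embedding, source, k=5):
    """Run the filtered nearest-neighbour query on an open DB-API cursor."""
    cursor.execute(SEARCH_SQL, (source, to_vector_literal(query_embedding), k))
    return cursor.fetchall()
```

Because the filter and the similarity ordering live in one statement, there is no second system to keep in sync with your rows.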

Indexes keep it fast

Exact nearest-neighbour search is O(n) per query: fine at 10k vectors, painful at 10M. Approximate nearest-neighbour (ANN) indexes (HNSW, IVFFlat) trade a sliver of recall for orders-of-magnitude speed.
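The recall-for-speed trade can be seen in a toy, pure-Python sketch of the inverted-file (IVF) idea: bucket vectors by their nearest centroid, then scan only the few buckets closest to the query. This is an illustration, not how any particular database implements it. The ToyIVF class, its parameters, and the random data are all made up for the example, and centroids are picked by random sampling where real IVF indexes train them with k-means.

```python
import math
import random


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_search(query, vectors, k):
    """Linear scan: compares the query against every vector, O(n)."""
    return sorted(range(len(vectors)), key=lambda i: l2(query, vectors[i]))[:k]


class ToyIVF:
    """IVF-style index: one bucket per centroid, search probes only a few buckets."""

    def __init__(self, vectors, num_lists, seed=0):
        self.vectors = vectors
        # Assumption: centroids are random samples of the data, not k-means output.
        self.centroids = random.Random(seed).sample(vectors, num_lists)
        self.lists = [[] for _ in range(num_lists)]
        for i, v in enumerate(vectors):
            self.lists[self._nearest_centroids(v, 1)[0]].append(i)

    def _nearest_centroids(self, v, n):
        order = sorted(
            range(len(self.centroids)), key=lambda c: l2(v, self.centroids[c])
        )
        return order[:n]

    def search(self, query, k, nprobe=1):
        """Scan only the nprobe buckets whose centroids are closest to the query."""
        candidates = [
            i for c in self._nearest_centroids(query, nprobe) for i in self.lists[c]
        ]
        return sorted(candidates, key=lambda i: l2(query, self.vectors[i]))[:k]


rng = random.Random(42)
vectors = [[rng.random() for _ in range(8)] for _ in range(2000)]
query = [rng.random() for _ in range(8)]
index = ToyIVF(vectors, num_lists=20)

exact = exact_search(query, vectors, k=10)
approx = index.search(query, k=10, nprobe=4)

# Recall: fraction of the true top-10 that the approximate search also found.
recall = len(set(exact) & set(approx)) / len(exact)
scanned = sum(len(index.lists[c]) for c in index._nearest_centroids(query, 4))
print(f"recall@10 = {recall:.2f}, scanned {scanned} of {len(vectors)} vectors")
```

Raising nprobe scans more buckets and pushes recall toward 1.0; at nprobe equal to the number of lists, the search degenerates to the exact scan.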

Vectors alone miss exact terms and rare keywords. Next: combining them with keyword search — hybrid retrieval.
