pgvector

pgvector is a popular open-source PostgreSQL extension that adds a vector column type, distance operators, and approximate nearest-neighbour indexes for similarity search over embeddings. It lets an existing PostgreSQL database double as a vector store, which for many RAG and recommendation workloads removes the operational overhead of running a separate ANN system.

What it provides

  • vector(N) column type. Stores fixed-dimensional float vectors.
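  • Distance operators. L2 distance (<->), inner product (<#>, which returns the negative inner product so that smaller values mean closer matches), cosine distance (<=>).
  • Indexes. IVFFlat (inverted file with flat lists) and HNSW (hierarchical navigable small world); both are approximate, trading some recall for speed, and on large tables are typically much faster than an exact sequential scan. Without an index, queries fall back to exact brute-force search.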
  • Filtered search. Combine vector similarity with standard SQL WHERE clauses (filter by tenant, date, category).
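
To make the operator semantics above concrete, here is a minimal pure-Python sketch of the three quantities the operators return, plus the exact top-k ordering that an unindexed ORDER BY ... LIMIT query computes. The function names are illustrative and are not part of pgvector; this only mirrors the arithmetic, not the extension's implementation.

```python
import math


def l2_distance(a, b):
    """Euclidean distance: the quantity the <-> operator returns."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def negative_inner_product(a, b):
    """Negated dot product: the quantity the <#> operator returns.

    The sign is flipped so that, as with the other operators,
    a smaller value means a closer match.
    """
    return -sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity: the quantity the <=> operator returns."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest(query, rows, distance, k=1):
    """Exact (brute-force) top-k over (id, vector) pairs.

    This is the ordering an unindexed query produces; an IVFFlat or
    HNSW index approximates it while scanning far fewer rows.
    """
    return sorted(rows, key=lambda row: distance(row[1], query))[:k]
```

Because every operator is a distance (smaller is closer), the same ascending sort works for all three metrics, which is why the inner product is negated.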

Why teams use it

  • Reuses existing PostgreSQL operations, backups, monitoring, and access control.
  • Co-locates vectors with structured data; no separate sync layer needed.
  • Standard SQL; familiar to existing engineers.
  • Available on most major managed PostgreSQL platforms (RDS, Cloud SQL, Aurora, Supabase, Neon, Crunchy, Azure Database).
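
As a rough sketch of what co-locating vectors with structured data looks like, the snippet below holds the SQL for a table, an HNSW index, and a tenant-filtered similarity query as Python strings. The table name documents, its columns, the 3-dimensional embedding, and the helper functions are all hypothetical choices made for illustration; the %(name)s placeholders assume a psycopg-style driver.

```python
# Hypothetical schema: one table holding both structured columns and the embedding.
CREATE_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE documents (
    id bigserial PRIMARY KEY,
    tenant_id bigint NOT NULL,
    body text NOT NULL,
    embedding vector(3)
);
"""

# HNSW index using the cosine-distance operator class.
INDEX_SQL = "CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);"

# Vector similarity combined with an ordinary WHERE clause.
SEARCH_SQL = """
SELECT id, body, embedding <=> %(query)s::vector AS distance
FROM documents
WHERE tenant_id = %(tenant_id)s
ORDER BY embedding <=> %(query)s::vector
LIMIT %(k)s;
"""


def to_vector_literal(values):
    """Format a sequence of numbers in pgvector's text input form, e.g. '[1.0,2.5,3.0]'."""
    return "[" + ",".join(str(float(v)) for v in values) + "]"


def search_params(query_vector, tenant_id, k=5):
    """Build the parameter dict for SEARCH_SQL (hypothetical helper)."""
    return {
        "query": to_vector_literal(query_vector),
        "tenant_id": tenant_id,
        "k": k,
    }
```

With a driver such as psycopg, a call along the lines of cursor.execute(SEARCH_SQL, search_params([0.1, 0.2, 0.3], tenant_id=42)) would run the filtered search. The tenant filter and the similarity ordering live in one statement, so no separate sync layer or second query is involved.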

Tradeoffs versus dedicated vector databases

pgvector wins on operational simplicity and filtered queries; dedicated systems (Pinecone, Weaviate, Qdrant, Milvus) tend to scale further on pure vector workloads and offer more index choices and tuning knobs.
