Vector Database
A vector database is a storage system optimised for similarity search over high-dimensional vectors, typically embeddings. Given a query vector, it returns the closest stored vectors under a distance metric such as cosine similarity or Euclidean distance, usually via an approximate nearest-neighbour index, together with any associated metadata.
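As a point of reference, the query described above can be sketched as an exact, brute-force scan, which is the baseline an approximate index tries to match at lower cost. This is a minimal illustration and not how any particular product is implemented: the record layout, the `top_k` helper, and the three-dimensional toy vectors are all assumptions made for the example.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, records, k=2):
    """Exact search: score every record against the query, keep the k best."""
    scored = [(cosine_similarity(query, r["vector"]), r) for r in records]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Return the score together with the id and metadata of each hit.
    return [(score, r["id"], r["metadata"]) for score, r in scored[:k]]

# Hypothetical toy corpus; real embeddings usually have hundreds of dimensions.
records = [
    {"id": "doc-1", "vector": [1.0, 0.0, 0.0], "metadata": {"lang": "en"}},
    {"id": "doc-2", "vector": [0.0, 1.0, 0.0], "metadata": {"lang": "de"}},
    {"id": "doc-3", "vector": [0.6, 0.8, 0.0], "metadata": {"lang": "en"}},
]

results = top_k([0.9, 0.1, 0.0], records, k=2)
for score, doc_id, metadata in results:
    print(doc_id, round(score, 3), metadata)
```

The scan touches every record, so its cost grows linearly with corpus size; that cost is what motivates the indexes described next.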
How it works
Vector databases index embeddings using approximate nearest-neighbour (ANN) algorithms such as HNSW, IVF, ScaNN, or DiskANN. These trade a small amount of recall for latency that, on large corpora, can be orders of magnitude lower than a brute-force scan. Many also support filtering on metadata and hybrid search, which combines vector similarity with keyword-based scoring such as BM25.
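The recall-for-latency trade can be illustrated with a toy inverted-file (IVF) index. This is a simplified sketch under stated assumptions, not the implementation used by any library: the centroids are fixed by hand rather than learned with k-means, the data is two-dimensional, and the names `build_ivf`, `ivf_search`, and `nprobe` are chosen for the example.

```python
import math

def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_ivf(vectors, centroids):
    """Put each vector's index into the inverted list of its nearest centroid."""
    lists = {c: [] for c in range(len(centroids))}
    for i, v in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: l2(v, centroids[c]))
        lists[nearest].append(i)
    return lists

def ivf_search(query, vectors, centroids, lists, k=2, nprobe=1):
    """Scan only the nprobe lists whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda c: l2(query, centroids[c]))
    candidates = [i for c in order[:nprobe] for i in lists[c]]
    return sorted(candidates, key=lambda i: l2(query, vectors[i]))[:k]

def brute_force(query, vectors, k=2):
    """Exact baseline: rank every vector by distance to the query."""
    return sorted(range(len(vectors)), key=lambda i: l2(query, vectors[i]))[:k]

def recall(approx, exact):
    """Fraction of the exact top-k that the approximate search returned."""
    return len(set(approx) & set(exact)) / len(exact)

# Hand-picked toy data (assumption): two clusters along the x-axis.
centroids = [(0.0, 0.0), (10.0, 0.0)]
vectors = [(0.0, 0.0), (1.0, 1.0), (4.0, 0.0), (6.0, 0.0), (10.0, 0.0), (9.0, 1.0)]
lists = build_ivf(vectors, centroids)

query = (4.8, 0.0)  # sits near the boundary between the two clusters
exact = brute_force(query, vectors, k=2)
one_probe = ivf_search(query, vectors, centroids, lists, k=2, nprobe=1)
two_probes = ivf_search(query, vectors, centroids, lists, k=2, nprobe=2)

print("exact:", exact)
print("nprobe=1:", one_probe, "recall", recall(one_probe, exact))
print("nprobe=2:", two_probes, "recall", recall(two_probes, exact))
```

With one probe the search scans half the corpus and misses a true neighbour that lies just across the cluster boundary; probing both lists recovers it at the cost of scanning everything. Real indexes expose a similar knob for tuning recall against latency.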
Common products
- Managed: Pinecone, Weaviate Cloud, Qdrant Cloud, Vespa Cloud, Turbopuffer
- Self-hosted: Weaviate, Qdrant, Milvus, Vespa, Marqo
- Embedded: Chroma, LanceDB, FAISS (library, not a database)
- Extensions to existing databases: pgvector (PostgreSQL), MongoDB Atlas Vector Search, Redis Vector Search, Elasticsearch dense_vector