By Sahil Kapoor - 19 Jun 2025

Reranker

A reranker is a model that takes an initial set of retrieved candidates and re-orders them to improve precision. In a typical retrieval pipeline, a fast first-stage retriever returns 50 to 200 candidates, and a slower but more accurate reranker scores each (query, candidate) pair and returns the top 5 to 10.
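
The two-stage flow described above can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline: `first_stage`, `toy_pair_scorer`, and `rerank` are hypothetical names, and both scoring functions are word-overlap stand-ins for a real retriever and a real reranker model.

```python
from typing import Callable, List, Tuple


def first_stage(query: str, corpus: List[str], k: int = 100) -> List[str]:
    """Cheap, recall-oriented retrieval: rank documents by shared-term count (toy stand-in)."""
    q_terms = set(query.lower().split())
    scored = [(len(q_terms & set(doc.lower().split())), doc) for doc in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]


def toy_pair_scorer(query: str, doc: str) -> float:
    """Hypothetical stand-in for a reranker model: Jaccard overlap of the two term sets."""
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    union = q_terms | d_terms
    return len(q_terms & d_terms) / len(union) if union else 0.0


def rerank(
    query: str,
    candidates: List[str],
    score_pair: Callable[[str, str], float],
    top_n: int = 5,
) -> List[Tuple[float, str]]:
    """Score every (query, candidate) pair and keep the top_n highest-scoring ones."""
    scored = [(score_pair(query, doc), doc) for doc in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_n]


corpus = [
    "rerankers reorder retrieved candidates",
    "vector databases store embeddings",
    "a reranker scores each query and candidate pair",
    "cooking pasta at home",
]
query = "how does a reranker score a candidate"

candidates = first_stage(query, corpus, k=3)   # wide net: 50 to 200 in practice
top = rerank(query, candidates, toy_pair_scorer, top_n=2)  # narrow: 5 to 10 in practice
```

In a real system, `score_pair` would wrap a call to a reranker model or API; the surrounding control flow stays the same.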

How it works

Most rerankers are cross-encoders: transformer models that take the query and the candidate as a single concatenated input and output a relevance score. This differs from bi-encoders (used for embeddings), which encode the query and the document separately. Cross-encoding is more accurate because the model attends across the pair, but its scores cannot be precomputed: each score depends on the query, so the model has to run once per candidate at query time.
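
To make the structural difference concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration: `encode` is a bag-of-words stand-in for an embedding model, and `cross_encoder_score` is a bigram-overlap stand-in for a transformer forward pass. Real bi-encoders are not blind to word order the way this toy is; the point is only where the computation happens and what each scorer is allowed to see.

```python
import math
from collections import Counter
from typing import List


def encode(text: str) -> Counter:
    """Toy 'embedding': bag-of-words counts. A real bi-encoder returns a dense vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


docs = ["dog bites man", "man bites dog"]

# Bi-encoder: document vectors are computed once, offline, before any query exists.
doc_vectors = [encode(d) for d in docs]


def bi_encoder_scores(query: str) -> List[float]:
    q_vec = encode(query)  # only the query is encoded at request time
    return [cosine(q_vec, dv) for dv in doc_vectors]


def cross_encoder_score(query: str, doc: str) -> float:
    """Toy stand-in for a cross-encoder: it sees both texts at once.

    It rewards query bigrams that also appear in the document, a pair-level
    signal that the separately computed bag-of-words vectors above cannot express.
    """
    q, d = query.lower().split(), doc.lower().split()
    q_bigrams = set(zip(q, q[1:]))
    d_bigrams = set(zip(d, d[1:]))
    return len(q_bigrams & d_bigrams) / max(len(q_bigrams), 1)


def cross_encoder_scores(query: str) -> List[float]:
    # One joint pass per (query, doc) pair; nothing can be cached before the query arrives.
    return [cross_encoder_score(query, d) for d in docs]


query = "man bites dog"
bi = bi_encoder_scores(query)        # both documents tie in this toy
cross = cross_encoder_scores(query)  # the second document wins
```

The toy bi-encoder cannot separate the two documents because each text is summarized on its own, while the pair-level scorer can. The cost is that `cross_encoder_scores` does work proportional to the number of candidates on every query.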

Common rerankers

Commercial APIs: Cohere Rerank, Voyage Rerank, Jina Reranker
Open source: BAAI bge-reranker, mxbai-rerank, MS MARCO MiniLM cross-encoders
LLM-as-reranker: an instruction-tuned LLM scores candidates through a prompt. Slower, but flexible.
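
The LLM-as-reranker option can be sketched as a prompt template plus a score parser. This is one possible shape, not a standard: the prompt wording, the 0 to 10 scale, and the names `llm_rerank`, `parse_score`, and `complete` are all assumptions, and `fake_llm` is a stub so the sketch runs without any model or API key.

```python
import re
from typing import Callable, List, Tuple

# Hypothetical prompt; the wording and the 0-10 scale are illustrative choices.
PROMPT_TEMPLATE = (
    "Rate how well the passage answers the query on a scale from 0 to 10.\n"
    "Reply with the number only.\n\n"
    "Query: {query}\n"
    "Passage: {passage}"
)


def parse_score(reply: str) -> float:
    """Pull the first number out of the model's reply; treat unparseable replies as 0."""
    match = re.search(r"\d+(?:\.\d+)?", reply)
    if match is None:
        return 0.0
    return min(float(match.group()), 10.0)


def llm_rerank(
    query: str,
    candidates: List[str],
    complete: Callable[[str], str],
    top_n: int = 5,
) -> List[Tuple[float, str]]:
    """Ask the LLM for one score per candidate, then sort by that score."""
    scored = []
    for passage in candidates:
        reply = complete(PROMPT_TEMPLATE.format(query=query, passage=passage))
        scored.append((parse_score(reply), passage))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_n]


def fake_llm(prompt: str) -> str:
    """Stub standing in for a real LLM call, so the example runs offline."""
    passage = prompt.rsplit("Passage: ", 1)[1]
    return "8" if "cross-encoder" in passage else "2"


results = llm_rerank(
    "what is a reranker",
    ["bananas are yellow", "a cross-encoder scores query-document pairs"],
    fake_llm,
    top_n=1,
)
```

Swapping `fake_llm` for a function that calls an actual model is the only change needed to try this against real candidates. Note that it issues one LLM call per candidate, which is where the extra latency comes from.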

Related Terms
RAG, Embeddings, Vector Database.
