By Sahil Kapoor - 17 Jun 2025

Chunking

Chunking is the process of splitting documents into smaller passages before embedding them for retrieval. Chunk size and boundaries directly determine what a retrieval system can find: a chunk that is too large blurs the meaning of its embedding, and a chunk that is too small lacks the context to answer most questions.

Common strategies

Fixed-size character or token splitting. Cuts the text every N characters or tokens. Simple and fast, but it ignores semantic boundaries and can cut a sentence in half.
Recursive character splitting. Tries paragraph boundaries first, then falls back to sentence and then word boundaries only for pieces that still exceed the size limit. A common baseline in frameworks such as LangChain and LlamaIndex.
Structural chunking. Splits on headings, sections, code blocks, or table rows. Often suited to technical documentation.
Semantic chunking. Embeds adjacent sentences and splits where their similarity drops, which signals a shift in topic.
Overlap. Used alongside any of the strategies above: the tail of one chunk is repeated at the head of the next (typically 10 to 20 percent of the chunk size) so context isn't lost at boundaries.
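
To make the first two strategies and overlap concrete, here is a minimal Python sketch. The function names (chunk_fixed, chunk_recursive), the default sizes, and the separator list are illustrative assumptions, not the API of LangChain, LlamaIndex, or any other library. Sizes are counted in characters to keep the example dependency-free; a real pipeline would more likely count tokens.

```python
def chunk_fixed(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Fixed-size character splitting with overlap (hypothetical helper)."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must be in the range [0, size)")
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this window already reached the end of the text
    return chunks


def chunk_recursive(text: str, size: int = 200,
                    separators: tuple[str, ...] = ("\n\n", ". ", " ")) -> list[str]:
    """Recursive splitting: coarse separators first, finer ones on demand.

    Simplified sketch: the separator is dropped at a split point, and no
    overlap is applied.
    """
    if len(text) <= size:
        return [text] if text.strip() else []
    if not separators:
        # Nothing left to split on, so fall back to a hard cut.
        return chunk_fixed(text, size, 0)
    sep, finer = separators[0], separators[1:]
    pieces = text.split(sep)
    if len(pieces) == 1:
        return chunk_recursive(text, size, finer)
    chunks, current = [], ""
    for piece in pieces:
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= size:
            current = candidate  # keep packing pieces into this chunk
            continue
        if current.strip():
            chunks.append(current)
        if len(piece) <= size:
            current = piece
        else:
            # The piece alone is too big: retry it with finer separators.
            chunks.extend(chunk_recursive(piece, size, finer))
            current = ""
    if current.strip():
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    sample = "Alpha beta. Gamma delta.\n\nEpsilon zeta eta theta."
    print(chunk_fixed("abcdefghij", size=4, overlap=1))
    print(chunk_recursive(sample, size=15))
```

With size=4 and overlap=1, each fixed-size chunk starts three characters after the previous one, so the last character of one chunk is also the first character of the next. The recursive version only reaches for a finer separator when a piece is still larger than the limit, which is why short paragraphs survive intact while long ones are broken at sentence or word boundaries.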

Related Terms
RAG, Embeddings, Vector Database, Reranker, Context Window.
