RAG Pipeline

15 building blocks and models in the RAG pipeline category.

Document Loader

Load documents from various sources (PDF, DOCX, web)

Split documents into chunks (recursive, semantic)
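
As an illustration of the recursive strategy, here is a minimal sketch that tries coarse separators first (paragraphs, then lines, then words) and falls back to finer ones only when a piece is still too long. The function name `recursive_split`, the `max_chars` limit, and the separator order are assumptions made for this example, not the API of any particular library.

```python
def recursive_split(text, max_chars=200, separators=("\n\n", "\n", " ")):
    """Split text into chunks of at most max_chars characters.

    Hypothetical helper: prefers coarse separators, recursing into finer
    ones only for pieces that are still too long.
    """
    if len(text) <= max_chars:
        return [text] if text.strip() else []
    for sep in separators:
        if sep not in text:
            continue
        chunks, current = [], ""
        for piece in text.split(sep):
            candidate = piece if not current else current + sep + piece
            if len(candidate) <= max_chars:
                current = candidate  # keep packing pieces into this chunk
                continue
            if current:
                chunks.append(current)
            if len(piece) > max_chars:
                # The piece alone is too long: split it with finer separators.
                chunks.extend(recursive_split(piece, max_chars, separators))
                current = ""
            else:
                current = piece
        if current:
            chunks.append(current)
        return chunks
    # No separator present at all: fall back to a hard cut.
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


chunks = recursive_split("para one.\n\npara two is longer than the limit", max_chars=20)
```

Semantic splitting would instead place boundaries where the embedding similarity between adjacent sentences drops, which requires an embedding model and is not shown here.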

Generate embeddings from text

Store and retrieve vector embeddings

Vector similarity search
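
A brute-force version of vector similarity search can be sketched in a few lines. The example below assumes cosine similarity as the metric and a plain dictionary as the store; the names `cosine` and `top_k` are hypothetical. Production vector stores typically use approximate nearest-neighbor indexes rather than a full scan.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, vectors, k=2):
    """Return the k (doc_id, score) pairs most similar to the query.

    `vectors` maps a document id to its embedding; this is an exhaustive scan.
    """
    scored = [(doc_id, cosine(query, vec)) for doc_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 2-dimensional embeddings, invented for illustration.
store = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
results = top_k([1.0, 0.1], store, k=2)
```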

Combine keyword and semantic search
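
One common way to combine a keyword ranking with a semantic ranking is reciprocal rank fusion (RRF), where each document receives 1 / (k + rank) from every list it appears in. The sketch below is one possible fusion method, not necessarily the one this building block uses; the function name and the constant `k=60` (a conventional default) are assumptions.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    Each document scores sum(1 / (k + rank)) over the lists that contain it.
    Ties are broken by document id so the output is deterministic.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


keyword_hits = ["d1", "d2", "d3"]   # e.g. from a BM25 index
semantic_hits = ["d3", "d1", "d4"]  # e.g. from a vector search
fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
```

Because RRF uses only ranks, it avoids having to normalize keyword scores and similarity scores onto a common scale.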

Re-rank retrieved results (Cohere, BGE-reranker)

Assemble context for LLM prompt
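
Context assembly usually means packing the highest-ranked chunks into the prompt until a size budget is reached. The sketch below assumes a character budget (real systems typically count tokens) and an invented prompt template; `build_prompt` and `max_context_chars` are hypothetical names.

```python
def build_prompt(question, chunks, max_context_chars=500):
    """Pack ranked chunks into a prompt, stopping before the budget is exceeded.

    Chunks are assumed to be sorted by relevance, best first.
    """
    parts, used = [], 0
    for number, chunk in enumerate(chunks, start=1):
        entry = f"[{number}] {chunk}"
        if used + len(entry) > max_context_chars:
            break  # budget exhausted: drop this and all lower-ranked chunks
        parts.append(entry)
        used += len(entry)
    context = "\n".join(parts)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


prompt = build_prompt(
    "What is BM25?",
    ["BM25 is a ranking function.", "x" * 1000],
    max_context_chars=100,
)
```

Numbering the chunks lets the model cite its sources as [1], [2], and so on.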

BM25 (Best Matching 25) ranking function for term-frequency-based document retrieval
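
For reference, a minimal sketch of BM25 scoring follows. It assumes whitespace tokenization, the commonly used defaults `k1=1.5` and `b=0.75`, and the non-negative IDF variant ln(1 + (N - n + 0.5) / (n + 0.5)); the function name `bm25_scores` is hypothetical, and real implementations differ in tokenization and IDF details.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document in `docs` against `query` with BM25.

    Assumes `docs` is a non-empty list of strings with at least one token overall.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            n = doc_freq[term]
            idf = math.log(1 + (n_docs - n + 0.5) / (n + 0.5))
            # Length normalization: longer-than-average documents are penalized.
            denom = tf + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / denom
        scores.append(score)
    return scores


corpus = ["the cat sat on the mat", "dogs chase cats", "the stock market fell"]
scores = bm25_scores("cat mat", corpus)
```

Note that "cats" in the second document does not match the query term "cat", because this sketch does no stemming.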

Term-based retrieval using inverted indexes (TF-IDF, BM25)
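
The inverted index behind term-based retrieval maps each term to the set of documents that contain it. The sketch below assumes whitespace tokenization and boolean AND matching; `build_inverted_index` and `lookup` are hypothetical names. A scoring function such as TF-IDF or BM25 would then rank the matching documents.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def lookup(index, query):
    """Return ids of documents that contain every query term (boolean AND)."""
    postings = [index.get(term, set()) for term in query.lower().split()]
    return set.intersection(*postings) if postings else set()


index = build_inverted_index(["red apple", "green apple", "red car"])
matches = lookup(index, "red apple")
```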

Neural sparse retrieval with learned term expansion (SPLADE, DeepImpact)

Late interaction retrieval with per-token matching via MaxSim
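
The MaxSim operator can be sketched directly: for each query token embedding, take its maximum similarity over all document token embeddings, then sum those maxima. The example assumes dot-product similarity on already-normalized vectors and uses invented toy embeddings; `maxsim_score` is a hypothetical name.

```python
def maxsim_score(query_vecs, doc_vecs):
    """Late-interaction score: sum over query tokens of the best-matching doc token.

    Both arguments are lists of per-token embedding vectors; `doc_vecs` must be non-empty.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


# Toy per-token embeddings, invented for illustration.
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0]]  # has a good match for both query tokens
doc_b = [[1.0, 0.0], [1.0, 0.0]]  # matches only the first query token
score_a = maxsim_score(query, doc_a)
score_b = maxsim_score(query, doc_b)
```

Because document token embeddings do not depend on the query, they can be computed and indexed ahead of time, leaving only the MaxSim step for query time.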

ML-based ranking combining multiple relevance signals (LambdaMART, RankNet)

Self-reflective RAG with adaptive retrieval and critique tokens

Language model that generates answers from retrieved context in RAG