RAG Course

RAG, vector DBs & enterprise search

Lessons: 10 modules
Total: 105 min full study
Quick: 7 min trailer
Projects: 8 Docker labs

Skills you'll gain (12)

  • Pick the right embedding model from MTEB + cost + privacy [Working]

    Choose between text-embedding-3-large (Matryoshka), voyage-3-large, Cohere embed-v4, BGE-M3, Nomic-embed-text-v2-MoE based on language, modality, deployment and budget. Defend the pick with MTEB scores.

  • Choose & justify a chunking strategy by data shape [Production]

    Recursive vs semantic vs late-chunking vs Anthropic contextual retrieval vs parent-document. Code-aware and table-aware splitters for source code and structured docs.

  • Choose & justify a vector store from 8 production options [Production]

    pgvector, Qdrant, Weaviate, Milvus, Pinecone, LanceDB, Vespa, Mongo Atlas — pick by scale, ops model, hybrid support, payload filtering, multi-tenancy. Read VectorDBBench results.

  • Implement hybrid search with RRF on at least two stores [Production]

    BM25 + dense vector + RRF (k≈60). Score fusion vs rank fusion. Native hybrid in Qdrant/Weaviate/Pinecone vs hand-rolled with pgvector + tsvector.

  • Add query rewriting (HyDE, multi-query, decomposition) [Working]

    Choose rewriting strategy from query shape — short/under-specified → HyDE; ambiguous → multi-query; complex compound → decomposition + step-back. Measure lift on Recall@K.

  • Re-rank with Cohere/Jina/BGE/ColPali [Production]

    Cross-encoder rerank for surgical context. Open-source vs API. ColPali / ColBERT late-interaction for multi-vector retrieval. Trim 50 retrieved candidates down to the top 8 to cut token cost and raise faithfulness.

  • Eval RAG with Ragas (faithfulness, answer relevance, context P/R) [Production]

    Build a labelled Q → chunk → answer dataset. Run Ragas + TruLens RAG triad. Gate CI on Recall@5, Faithfulness, p95 latency, $/query.

  • Ship Agentic RAG (corrective + self-RAG) [Advanced]

    LangGraph 1.0 state machine: retrieve → grade → rewrite/re-retrieve → generate → self-check → loop. The agent decides when to web-search, when to refuse, and when to ask for clarification.

  • Index visually-rich PDFs with ColPali (no OCR) [Advanced]

    Multimodal RAG over slides, scanned PDFs, screenshots using ColPali multi-vector embeddings — beats OCR + text retrieval on layout-heavy corpora.

  • Observe + budget RAG in Langfuse / Phoenix [Production]

    Trace every stage of a RAG call (embed → search → rerank → LLM). Per-stage latency + token attribution, $/query dashboards, cache-hit rate, negative-answer-rate.

  • Embedding-model migration with versioning + blue/green index [Advanced]

    Tag (model_id, dim, chunker_version) on every row. Run shadow-index re-embed, dual-read with feature flag, A/B eval before cutover. Never re-embed in place.

  • Defend against retrieval-time prompt injection + PII leaks [Advanced]

    Detect instructions hidden in retrieved chunks (OWASP LLM01). PII redaction at ingest, retrieval-time policy filters per tenant, signed retrieval audit trail.
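
As a taste of the hybrid-search skill above, here is a minimal sketch of Reciprocal Rank Fusion with k = 60. It assumes each retriever (BM25, dense) has already returned a ranked list of document ids; the function name `rrf_fuse` and the sample ids are hypothetical, and this is an illustration rather than any particular store's built-in fusion.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of doc ids with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Only ranks are used, never raw scores.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical ranked results from a keyword retriever and a dense retriever.
bm25_hits = ["d3", "d1", "d7"]
dense_hits = ["d1", "d9", "d3"]

fused = rrf_fuse([bm25_hits, dense_hits])
print(fused)
```

Because RRF works on ranks alone, BM25 scores and cosine similarities never need to be normalised onto a common scale, which is the practical difference from score fusion.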
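
The Recall@K metric named in the query-rewriting and eval skills can be sketched in a few lines. This assumes a labelled set where each query lists the chunk ids a correct answer needs; `recall_at_k`, `eval_set`, and the 0.7 gate threshold are hypothetical names and values chosen for illustration, not course-mandated settings.

```python
def recall_at_k(retrieved, relevant, k=5):
    """Fraction of the relevant chunk ids that appear in the top-k retrieved ids."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    top_k = set(retrieved[:k])
    return len(top_k & relevant) / len(relevant)


# Hypothetical labelled examples: retriever output plus gold chunk ids.
eval_set = [
    {"retrieved": ["c1", "c4", "c9", "c2", "c7", "c3"], "relevant": {"c1", "c3"}},
    {"retrieved": ["c5", "c6"], "relevant": {"c6"}},
]

per_query = [recall_at_k(ex["retrieved"], ex["relevant"], k=5) for ex in eval_set]
mean_recall = sum(per_query) / len(per_query)

# Assumed CI gate: fail the build when mean Recall@5 drops below the threshold.
MIN_RECALL_AT_5 = 0.7
gate_passed = mean_recall >= MIN_RECALL_AT_5
print(per_query, mean_recall, gate_passed)
```

Running the same function before and after a change such as HyDE or multi-query rewriting gives the lift on Recall@K directly.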