//pragmatic leaders

RAG Types

Retrieval-Augmented Generation systems combine the strengths of information retrieval and generative AI to produce accurate, context-aware outputs. Choosing the right RAG type depends on your product’s complexity, accuracy needs, and user scenarios.
Talvinder Singh, from a Pragmatic Leaders AI Product Leadership cohort, 2024

Retrieval-Augmented Generation (RAG) retrieves relevant documents and passes them to a generative model, so answers are grounded in your own data rather than in the model’s memory alone. The landscape has evolved rapidly: today there are many RAG architectures, each suited to different problems, data types, and business priorities.

The actual job is to pick the right RAG type for your product’s data complexity, user expectations, and infrastructure constraints. Picking the wrong architecture wastes engineering effort, slows down time to market, or leads to poor user experiences.

This lesson breaks down the main RAG types, their strengths and weaknesses, and when to use each.

The simplest RAG: Naive (Simple) RAG

Naive (or Simple) RAG is the most basic form. It retrieves relevant documents from a static knowledge base and generates an answer from those documents in a single pass, with no memory, routing, or filtering.

  • Use cases: FAQ bots, customer support for a fixed knowledge base, or simple summarization.
  • When to use: When you value speed, simplicity, and ease of deployment over advanced accuracy or flexibility. Proof-of-concept apps or small-scale deployments fit here.

This is the starting point for many Indian startups building AI-powered chatbots for internal knowledge or product FAQs.
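
The retrieve-then-generate flow can be sketched in a few lines. This is a minimal illustration, not a production recipe: the word-count similarity stands in for a real embedding model, the `FAQ` entries are made up, and `build_prompt` only assembles the text you would send to whichever LLM you use.

```python
import re
from collections import Counter
from math import sqrt


def tokenize(text: str) -> Counter:
    """Lowercase bag-of-words; a toy stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = tokenize(query)
    return sorted(docs, key=lambda d: cosine(q, tokenize(d)), reverse=True)[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Assemble the prompt an LLM would receive (the LLM call itself is omitted)."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


# Hypothetical static knowledge base, invented for illustration.
FAQ = [
    "Refunds are processed within 5 business days.",
    "You can reset your password from the login page.",
    "Support is available from 9 am to 6 pm on weekdays.",
]

top = retrieve("How do I reset my password?", FAQ, k=1)
prompt = build_prompt("How do I reset my password?", top)
print(prompt)
```

Everything in the later RAG types is a variation on one of these two steps: what gets retrieved, or what gets passed to the generator.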

// scene:

Early-stage startup in Bangalore building a customer support chatbot

PM: “We have a static FAQ document. Let’s build a simple RAG that retrieves relevant answers and generates responses.”

Engineer: “This will be fast to build and easy to maintain. No need for complex pipelines yet.”

They chose simplicity to validate user demand before investing in complex architectures.

// tension:

Balancing speed of delivery with accuracy needs

Adding context with Simple RAG + Memory

Adding memory enables the system to retain information from previous conversations, making it context-aware.

  • Use cases: Customer service chatbots that remember user history, personalized recommendation engines.
  • When to use: For ongoing interactions where context from earlier queries improves relevance and user experience.

Indian enterprises building chatbots for banking or insurance often use this to tie conversations to customer profiles across sessions.
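
One simple way to add memory is to fold recent user turns into the retrieval query. The sketch below assumes that approach; the `ConversationMemory` class, its method names, and the sample conversation are all hypothetical, and real systems often summarize or rewrite the history with an LLM instead of concatenating it.

```python
from collections import deque


class ConversationMemory:
    """Keeps the last few turns so follow-up questions carry their context."""

    def __init__(self, max_turns: int = 3):
        # deque with maxlen drops the oldest turn automatically
        self.turns = deque(maxlen=max_turns)

    def add(self, user: str, assistant: str) -> None:
        self.turns.append((user, assistant))

    def contextualize(self, query: str) -> str:
        """Prefix the new query with earlier user messages before retrieval."""
        history = " ".join(user for user, _ in self.turns)
        return f"{history} {query}".strip()


memory = ConversationMemory(max_turns=2)
# Made-up exchange for illustration only.
memory.add("What is the premium for my term insurance plan?", "It is Rs 12,000 per year.")

# "it" in the follow-up is ambiguous without the earlier turn.
retrieval_query = memory.contextualize("Can I pay it monthly?")
print(retrieval_query)
```

Persisting the turns against a customer ID, rather than in process memory, is what lets the context survive across sessions.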

Specialized queries need Branched RAG

Branched RAG dynamically selects the most relevant data source for each query instead of searching all sources.

  • Use cases: Legal research tools, multidisciplinary knowledge assistants.
  • When to use: When queries require specialized knowledge from different data silos, improving efficiency and relevance.

For example, a legal tech startup in Mumbai might branch queries between contract law, labor regulations, and tax codes.
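
The routing step in that example could look like the sketch below. The keyword sets and the `route` function are assumptions made for illustration; a real router would more likely use a classifier or an LLM call to pick the branch.

```python
import re

# Hypothetical keyword sets for the three silos named in the example.
SOURCES = {
    "contract_law": {"contract", "breach", "indemnity", "clause"},
    "labor_regulations": {"employee", "wages", "gratuity", "termination"},
    "tax_codes": {"gst", "tax", "tds", "deduction"},
}


def route(query: str, default: str = "contract_law") -> str:
    """Pick the single source whose keywords overlap most with the query."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    scores = {name: len(words & keywords) for name, keywords in SOURCES.items()}
    best = max(scores, key=scores.get)
    # Fall back to a default branch when nothing matches at all.
    return best if scores[best] > 0 else default


branch = route("Is gratuity payable after termination of an employee?")
print(branch)
```

Only the chosen branch is searched afterwards, which is where the efficiency gain comes from.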

HyDe: Hypothetical Document Embedding for complex queries

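HyDe first asks the model to write a hypothetical “ideal” answer document for the query, embeds that document, and then retrieves the real documents whose embeddings are closest to it. The hypothetical text may contain wrong details; it is used only as a search key, never shown to the user.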

  • Use cases: Research and development, creative content generation.
  • When to use: For vague or complex queries where standard retrieval may not suffice, or when creative synthesis is needed.

This helps when user queries are ambiguous or exploratory, common in R&D labs or creative agencies.
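
A toy illustration of why this helps: the vague query below shares almost no vocabulary with the relevant document, but a drafted answer does. Everything here is an assumption for demonstration purposes: `fake_llm_draft` returns a hard-coded string where a real system would call an LLM, word counts stand in for embeddings, and the documents are invented.

```python
import re
from collections import Counter
from math import sqrt


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts instead of a real vector model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def fake_llm_draft(query: str) -> str:
    """Placeholder for an LLM call that writes a hypothetical answer."""
    return "Solid-state batteries use a ceramic electrolyte to improve energy density and safety."


def hyde_retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Embed the drafted answer (not the raw query) and search with that."""
    h = embed(fake_llm_draft(query))
    return sorted(docs, key=lambda d: cosine(h, embed(d)), reverse=True)[:k]


DOCS = [
    "Ceramic electrolyte materials raise energy density in solid-state cells.",
    "Quarterly marketing budget review for the retail team.",
]
query = "what's next for EV power?"

# Baseline: searching with the raw query matches only on the word "for".
baseline = sorted(DOCS, key=lambda d: cosine(embed(query), embed(d)), reverse=True)[:1]
result = hyde_retrieve(query, DOCS)
print("raw query :", baseline[0])
print("HyDe      :", result[0])
```

The cost is one extra generation call per query, so it is worth measuring whether retrieval quality actually improves on your own data before adopting it.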

Corrective RAG (CRAG) for high-stakes accuracy

Corrective RAG adds a scoring and filtering step to refine retrieved documents, ensuring only the most relevant information is used.

  • Use cases: High-stakes question-answering, compliance, legal document review.
  • When to use: When accuracy is critical and irrelevant or incorrect retrievals must be minimized.

Indian fintechs and healthcare startups handling sensitive data benefit from this to avoid costly errors.
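
The scoring-and-filtering step can be sketched as follows. The term-overlap grader, the 0.5 threshold, and the `"fallback"` action are illustrative assumptions; in practice the grader is usually a trained evaluator or an LLM judge, and the fallback might be a second search or a handoff to a human.

```python
import re


def relevance(query: str, doc: str) -> float:
    """Toy grader: fraction of query terms that appear in the document."""
    q = set(re.findall(r"[a-z0-9]+", query.lower()))
    d = set(re.findall(r"[a-z0-9]+", doc.lower()))
    return len(q & d) / len(q) if q else 0.0


def corrective_filter(query: str, retrieved: list[str], threshold: float = 0.5):
    """Keep only documents that pass the grader; signal a fallback if none do."""
    kept = [d for d in retrieved if relevance(query, d) >= threshold]
    action = "answer" if kept else "fallback"
    return kept, action


# Made-up retrieval results for illustration.
retrieved = [
    "KYC documents must be verified before account activation.",
    "Office cafeteria menu for the week.",
]

kept, action = corrective_filter("KYC documents verification", retrieved)
print(action, kept)
```

The design choice that matters is the empty case: refusing or escalating when nothing passes the grader is what protects against confident answers built on irrelevant context.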

Modular RAG for scalability and flexibility

Modular RAG separates retrieval and generation into modular, swappable components.

  • Use cases: Large enterprise systems, platforms needing easy customization or upgrades.
  • When to use: When you need to optimize, debug, or scale individual components independently.

Enterprises like Razorpay or Flipkart building AI platforms may adopt modular RAG to maintain flexibility as their data and models evolve.
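
In code, "swappable components" usually means the pipeline depends on small interfaces rather than concrete classes. The sketch below uses Python's `typing.Protocol` for that; the class names and the sample document are hypothetical, and the generator is a template where a real one would call a model.

```python
from typing import Protocol


class Retriever(Protocol):
    def retrieve(self, query: str) -> list[str]: ...


class Generator(Protocol):
    def generate(self, query: str, context: list[str]) -> str: ...


class KeywordRetriever:
    """One possible retriever; a vector-store retriever could replace it."""

    def __init__(self, docs: list[str]):
        self.docs = docs

    def retrieve(self, query: str) -> list[str]:
        words = set(query.lower().split())
        return [d for d in self.docs if words & set(d.lower().split())]


class TemplateGenerator:
    """Stand-in generator; a real one would call an LLM."""

    def generate(self, query: str, context: list[str]) -> str:
        return f"Q: {query} | Sources used: {len(context)}"


class RagPipeline:
    """Depends only on the two interfaces, so either side can be swapped."""

    def __init__(self, retriever: Retriever, generator: Generator):
        self.retriever = retriever
        self.generator = generator

    def answer(self, query: str) -> str:
        return self.generator.generate(query, self.retriever.retrieve(query))


pipeline = RagPipeline(
    KeywordRetriever(["payment gateway settlement takes two days"]),
    TemplateGenerator(),
)
reply = pipeline.answer("settlement time")
print(reply)
```

Because each component can be tested and replaced in isolation, upgrading the retriever does not require touching the generation code, and vice versa.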

Advanced RAG for real-time, production-grade applications

Advanced RAG incorporates re-ranking, fine-tuning, feedback loops, and dynamic retrieval.

  • Use cases: Real-time customer support, personalized learning, production-grade apps.
  • When to use: For complex, real-world tasks requiring high accuracy, adaptability, and performance.

Swiggy’s AI-driven customer support or Meesho’s personalized recommendations might leverage advanced RAG pipelines.
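
Two of those ingredients, re-ranking and a feedback loop, can be shown together in a small two-stage sketch. It is deliberately simplified and rests on assumptions: the first pass is plain word overlap, the re-ranker sorts by made-up "helpful" vote counts, and production systems typically use a learned re-ranking model instead.

```python
def first_pass(query: str, docs: list[str], k: int = 3) -> list[str]:
    """Cheap recall stage: rank by shared words and drop zero-overlap docs."""
    words = set(query.lower().split())
    scored = [(len(words & set(d.lower().split())), d) for d in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored[:k] if score > 0]


def rerank(candidates: list[str], helpful_votes: dict[str, int]) -> list[str]:
    """Precision stage: promote documents that users marked as helpful."""
    return sorted(candidates, key=lambda d: helpful_votes.get(d, 0), reverse=True)


# Hypothetical support articles and feedback counts.
DOCS = [
    "order delayed refund policy",
    "order tracking link",
    "restaurant onboarding guide",
]
votes = {"order tracking link": 5}

candidates = first_pass("order delayed", DOCS)
final = rerank(candidates, votes)
print(final)
```

Splitting the work this way keeps latency manageable: the expensive scoring runs only on the handful of candidates that survive the cheap first pass.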

Other specialized RAG types

RAG Type | Description | Indian Context Example
--- | --- | ---
GraphRAG | Uses knowledge graphs for structured retrieval | Scientific research at IISc
LongRAG | Handles long documents or large context windows | Legal document analysis in Mumbai
Self-RAG | Retrieves from its own outputs for iterative refinement | AI assistants improving answers
EfficientRAG | Focuses on computational efficiency | Edge deployments in low-resource settings
Golden Retriever | Prioritizes high recall to avoid missing relevant info | Compliance teams in banking
Adaptive RAG | Dynamically adjusts retrieval based on query or feedback | Personalized tutoring platforms
RankRAG | Uses advanced ranking to prioritize results | Search engines like ShareChat
Multi-Head RAG | Uses multiple retrieval strategies in parallel | Multimodal assistants in healthcare

Summary Table of RAG Types

RAG Type | Best For | Example Use Case
--- | --- | ---
Naive/Simple | Simplicity, speed | FAQ bots, small KBs
Simple w/ Memory | Contextual conversations | Customer service chatbots
Branched | Specialized sources | Legal research
HyDe | Vague/complex queries | R&D, creative writing
Corrective (CRAG) | High accuracy | Compliance, legal review
Modular | Scalability, flexibility | Enterprise platforms
Advanced | Complex, real-time, accurate | Production-grade apps
Graph | Structured, relationship-aware retrieval | Scientific research
LongRAG | Long documents | Legal, academic analysis
Self-RAG | Iterative/self-improving | Problem-solving agents
EfficientRAG | Low resource/cost | Edge/mobile deployments
Golden Retriever | High recall | E-discovery, literature review
Adaptive | Dynamic, personalized needs | Adaptive tutoring
RankRAG | Top-quality results | Search engines
Multi-Head | Multi-domain/modality queries | Multimodal assistants

How to choose the right RAG type for your product

  • For simple, static knowledge bases, use Naive/Simple RAG.
  • For ongoing conversations needing context, use Simple RAG with Memory.
  • For specialized or multi-domain queries, use Branched or Multi-Head RAG.
  • For long documents or large context windows, use LongRAG.
  • For high accuracy or compliance requirements, use Corrective RAG or Golden Retriever RAG.
  • For scalable, flexible systems, use Modular RAG.
  • For adaptive, personalized experiences, use Adaptive RAG.
  • For complex, real-time, production-grade applications, use Advanced RAG.

Your choice depends on your product’s complexity, accuracy needs, scalability, resource constraints, and the nature of your data and queries.
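
The checklist above can be encoded as a small lookup, which is handy when you want to compare several products side by side. The `ProductNeeds` fields and the `suggest_rag_types` function are hypothetical names chosen for this sketch; the rules are taken directly from the bullets, and real decisions will weigh trade-offs that a set of boolean flags cannot capture.

```python
from dataclasses import dataclass


@dataclass
class ProductNeeds:
    """One flag per bullet in the checklist; all default to False."""
    needs_memory: bool = False
    multi_domain: bool = False
    long_documents: bool = False
    high_accuracy: bool = False
    needs_scalability: bool = False
    personalized: bool = False
    real_time_production: bool = False


def suggest_rag_types(n: ProductNeeds) -> list[str]:
    """Return every RAG type whose condition is met, in checklist order."""
    rules = [
        (n.needs_memory, "Simple RAG with Memory"),
        (n.multi_domain, "Branched or Multi-Head RAG"),
        (n.long_documents, "LongRAG"),
        (n.high_accuracy, "Corrective RAG or Golden Retriever RAG"),
        (n.needs_scalability, "Modular RAG"),
        (n.personalized, "Adaptive RAG"),
        (n.real_time_production, "Advanced RAG"),
    ]
    picks = [name for flag, name in rules if flag]
    # With no special requirements, the simplest architecture is the default.
    return picks or ["Naive/Simple RAG"]


# Example: a support bot spanning several product lines under compliance rules.
support_bot = ProductNeeds(multi_domain=True, high_accuracy=True)
suggestions = suggest_rag_types(support_bot)
print(suggestions)
```

When several types come back, treat the list as a sequence rather than a bundle: start with the one that addresses the biggest risk and layer the others in later.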

// thread: #product-ai — Discussion on selecting RAG architecture for an Indian fintech
Priya (PM): Our customer support bot needs to handle multiple product lines with different knowledge bases.
Rahul (Engineer): Branched RAG fits well here. We query the right database per product.
Meera (Data Scientist): We should also consider Corrective RAG to filter irrelevant documents for regulatory compliance.
Priya (PM): Let’s prototype with Branched RAG and layer in corrective filtering as we mature.

The Indian context: cost, data quality, and talent

Three realities shape RAG implementation in India:

  • Cost sensitivity: Many Indian startups cannot absorb large-scale compute costs. EfficientRAG or hybrid approaches often make more sense than heavy fine-tuning or custom models.
  • Messy data: Enterprises have inconsistent, multilingual, and incomplete data. Preprocessing and data cleaning become first-class concerns.
  • Talent scarcity: While ML talent is growing, building and maintaining complex RAG pipelines demands small, sharp teams who understand foundation models and retrieval deeply.

Indian companies like Razorpay and Postman focus on modular, API-driven RAG pipelines that balance cost and performance.

Field Exercise: Map your product to a RAG type (20 min)

Pick your current or target AI product. For each of these questions, write a short answer:

  1. What core user problem should the RAG system solve?
  2. What is the nature of your data? Static or dynamic? Single or multiple sources? Short or long documents?
  3. What is your accuracy requirement? Is a small error rate acceptable or do you need near-perfect precision?
  4. What are your infrastructure constraints? Can you afford complex pipelines or do you need lightweight solutions?
  5. How important is context from previous interactions?
  6. Do your queries span multiple domains or specializations?
  7. What is your expected user volume and latency requirement?

Use your answers to pick one or two RAG types from the summary table above that best fit your product.

Test yourself: Choosing the right RAG for your startup

// learn the judgment

You are the PM at a Series A Indian legaltech startup building a research assistant for lawyers. The product must handle queries across contract law, intellectual property, and labor law. Your knowledge base is updated weekly with new regulations. Accuracy is critical due to legal risks. You have a small engineering team and limited budget.

The call: Which RAG architecture(s) would you recommend and why? How would you balance accuracy, complexity, and cost?

Your reasoning:
