RAG Technics

Uploaded by

tumikosha

Progression of RAG Systems

Since its introduction in mid-2020, RAG has followed a progression of approaches
aimed at redressing the hallucination problem in LLMs.

Naive RAG
At its most basic, Retrieval Augmented Generation can be summarized in three
steps -
1. Indexing of the documents
2. Retrieval of the context with respect to an input query
3. Generation of the response using the input query and retrieved context
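
The three steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the bag-of-words `embed` function is a stand-in for a real embedding model, `llm` is any callable supplied by the reader, and all function names are hypothetical.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: term counts (an assumption standing in for a real model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def build_index(documents):
    # Step 1: indexing. Store each document next to its vector.
    return [(doc, embed(doc)) for doc in documents]

def retrieve(index, query, k=2):
    # Step 2: retrieval. Rank documents by similarity to the query.
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

def build_prompt(query, context):
    # Augment the query with the retrieved context.
    joined = "\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"

def generate(prompt, llm):
    # Step 3: generation. `llm` is any callable mapping a prompt to text.
    return llm(prompt)
```

Swapping `embed` for a real embedding model and `llm` for a real model call leaves the shape of the pipeline unchanged.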

[Figure: Naive RAG pipeline. Documents -> Indexing; User Query -> Retrieval -> Prompt -> LLM -> Response]

This basic RAG approach can also be termed “Naive RAG”.

Challenges in Naive RAG

Retrieval Quality: low precision, leading to hallucinations and mid-air drops; low recall, resulting in missing relevant info; outdated information.
Augmentation: redundancy and repetition when multiple retrieved documents have similar information; context length challenges.
Generation Quality: generations are not grounded in the context; potential of toxicity and bias in the response; excessive dependence on augmented context.

Abhinav Kimothi
Advanced RAG
To address the inefficiencies of the Naive RAG approach, Advanced RAG
approaches implement strategies focussed on three processes -

Pre-Retrieval, Retrieval, Post Retrieval

Pre-Retrieval: Chunk Optimisation, Metadata Integration, Indexing Structure, Alignment
Retrieval: Fine-tuned Embeddings, Dynamic Embeddings, Adapters, Iterative Retrieval, Hybrid Search, HyDE, Query Rewriting, Sub Queries, Query Routing
Post Retrieval: Information Compression, Re-ranking
(Indicative, non-exhaustive list)

[Figure: Advanced RAG pipeline. Documents -> Indexing; User Query -> Retrieval -> Post Retrieval -> Prompt -> LLM -> Response]

Advanced RAG Concepts
Pre-retrieval/Retrieval Stage
Chunk Optimization
When managing external documents, it's important to break them into the right-
sized chunks for accurate results. The choice of how to do this depends on
factors like content type, user queries, and application needs. No one-size-fits-all
strategy exists, so flexibility is crucial. Current research explores techniques like
sliding windows and "small2big" methods.
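
As one illustration of the sliding-window idea mentioned above, the sketch below splits text into word-based chunks in which consecutive chunks share a fixed number of words. The function name, the word-level granularity and the default sizes are assumptions made for the example; a real system might count tokens or sentences instead.

```python
def sliding_window_chunks(text, chunk_size=50, overlap=10):
    """Split text into chunks of `chunk_size` words; neighbours share `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks
```

The overlap keeps a sentence that straddles a chunk boundary visible in both neighbouring chunks, at the cost of some duplicated text in the index.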

Metadata Integration
Information like dates, purpose, chapter summaries, etc. can be embedded into
chunks. This improves the retriever efficiency by not only searching the
documents but also by assessing the similarity to the metadata.
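
A minimal sketch of this idea, under the assumption that metadata is stored as a dictionary next to each chunk: the metadata can be prefixed to the chunk text before embedding, so that similarity also reflects the metadata, and it can be used as a filter before the similarity search. The `Chunk` class and both helper names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Chunk:
    text: str
    metadata: dict = field(default_factory=dict)

def text_for_embedding(chunk):
    """Prefix the chunk with its metadata so both are embedded together."""
    header = "; ".join(f"{key}: {value}" for key, value in sorted(chunk.metadata.items()))
    return f"[{header}]\n{chunk.text}" if header else chunk.text

def filter_by_metadata(chunks, **required):
    """Keep only chunks whose metadata matches every required key/value pair."""
    return [
        chunk for chunk in chunks
        if all(chunk.metadata.get(key) == value for key, value in required.items())
    ]
```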

Indexing Structure
Introduction of graph structures can greatly enhance retrieval by leveraging
nodes and their relationships. Multi-index paths can be created to increase
efficiency.

Alignment
Understanding complex data, like tables, can be tricky for RAG. One way to
improve the indexing is by using counterfactual training, where we create
hypothetical (what-if) questions. This increases the alignment and reduces
disparity between documents.

Query Rewriting
To bring better alignment between the user query and documents, several
rewriting approaches exist. LLMs are sometimes used to create pseudo-documents
from the query for better matching with existing documents.
Sometimes, LLMs perform abstract reasoning over the query. Multi-querying is
employed to solve complex user queries.

Hybrid Search Exploration

The RAG system employs different types of searches like keyword, semantic and
vector search, depending upon the user query and the type of data available.
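
One common way to combine search types is a weighted sum of normalised scores. The sketch below assumes that a keyword search and a vector search have already produced per-document scores; the weighting parameter `alpha` and both function names are assumptions for illustration, not something the source prescribes.

```python
def min_max(scores):
    """Rescale scores to the range [0, 1] so different scorers are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (score - lo) / (hi - lo) for doc, score in scores.items()}

def hybrid_scores(keyword_scores, vector_scores, alpha=0.5):
    """Blend keyword and vector scores; alpha=1.0 means keyword search only."""
    kw, vec = min_max(keyword_scores), min_max(vector_scores)
    docs = set(kw) | set(vec)
    return {doc: alpha * kw.get(doc, 0.0) + (1 - alpha) * vec.get(doc, 0.0) for doc in docs}
```

A query router could set `alpha` per query, for example leaning on keyword search for exact identifiers and on vector search for paraphrased questions.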

Sub Queries
Sub-querying involves breaking down a complex query into sub-questions for
each relevant data source, then gathering all the intermediate responses and
synthesizing a final response.
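
The flow described above can be sketched as a small orchestration function. Everything here is an assumption for illustration: `decompose` and `synthesize` would typically be LLM calls, and `sources` maps a hypothetical source name to a callable that answers a sub-question from that source.

```python
def answer_with_sub_queries(query, decompose, sources, synthesize):
    """Decompose a query, answer each sub-question from its source, then synthesize.

    decompose(query) is expected to return (source_name, sub_question) pairs.
    """
    intermediate = []
    for source_name, sub_question in decompose(query):
        answer = sources[source_name](sub_question)
        intermediate.append((sub_question, answer))
    # The final response is built from the original query and all partial answers.
    return synthesize(query, intermediate)
```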

Query Routing
A query router identifies a downstream task and decides the subsequent action
that the RAG system should take. During retrieval, the query router also identifies
the most appropriate data source for resolving the query.
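
A deliberately simple router is sketched below. It picks a data source by keyword overlap; production routers often use an LLM or a classifier instead. The route names, keyword sets and default are all hypothetical.

```python
ROUTES = {
    "sql_database": {"revenue", "sales", "count", "total"},
    "vector_store": {"policy", "explain", "summary"},
}

def route(query, routes=ROUTES, default="vector_store"):
    """Return the name of the data source whose keywords best match the query."""
    tokens = set(query.lower().split())
    best, best_overlap = default, 0
    for name, keywords in routes.items():
        overlap = len(tokens & keywords)
        if overlap > best_overlap:
            best, best_overlap = name, overlap
    return best
```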

Iterative Retrieval
Documents are collected repeatedly based on the query and the generated
response to create a more comprehensive knowledge base.

Recursive Retrieval
Recursive retrieval also iteratively retrieves documents. However, it also refines
the search queries depending on the results obtained from the previous retrieval.
It is like a continuous learning process.
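
The difference between the two can be seen in a short loop. In the sketch below, `retrieve` and `refine` are callables supplied by the reader (hypothetical names); with a `refine` that always returns the original query the loop behaves like plain iterative retrieval, while a `refine` that uses the collected documents makes it recursive.

```python
def recursive_retrieve(query, retrieve, refine, max_rounds=3):
    """Retrieve repeatedly, refining the search query from what was found so far."""
    collected, seen = [], set()
    current_query = query
    for _ in range(max_rounds):
        new_docs = []
        for doc in retrieve(current_query):
            if doc not in seen:
                seen.add(doc)
                new_docs.append(doc)
        if not new_docs:
            break  # nothing new was found, so further rounds would not help
        collected.extend(new_docs)
        # Refine the next search using the original query and the results so far.
        current_query = refine(query, collected)
    return collected
```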

Adaptive Retrieval
Adaptive retrieval enhances the RAG framework by enabling Large Language Models
(LLMs) to proactively identify the most suitable moments and content for
retrieval. Letting the model choose dynamically when and what to retrieve aims
to improve the efficiency and relevance of the information obtained, leading
to more precise and effective results.

Hypothetical Document Embeddings (HyDE)

Using an LLM, HyDE forms a hypothetical document (answer) in response to a
query, embeds it, and then retrieves real documents similar to this
hypothetical one. Instead of relying on the similarity between the query
embedding and document embeddings, it relies on the similarity between the
embedding of a generated answer and the embeddings of real documents.
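
A toy sketch of the HyDE flow follows. The word-set `embed` and Jaccard similarity are assumptions standing in for a real embedding model and vector search, `llm` is any callable supplied by the reader, and the prompt wording is illustrative only.

```python
def embed(text):
    """Toy embedding: the set of lowercase words (stand-in for a real model)."""
    return set(text.lower().split())

def jaccard(a, b):
    """Similarity between two word sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def hyde_retrieve(query, llm, documents, k=1):
    # 1. Ask the LLM for a hypothetical answer to the query.
    hypothetical = llm(f"Write a short passage that answers: {query}")
    # 2. Embed the hypothetical answer instead of the query itself.
    h_vec = embed(hypothetical)
    # 3. Retrieve the real documents closest to the hypothetical answer.
    ranked = sorted(documents, key=lambda doc: jaccard(h_vec, embed(doc)), reverse=True)
    return ranked[:k]
```

The hypothetical answer may be factually wrong; it is only used as a search key, and the final response is still generated from the real documents that were retrieved.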

Fine-tuned Embeddings
This process involves tailoring embedding models to improve retrieval accuracy,
particularly in specialized domains dealing with uncommon or evolving terms. The
fine-tuning process utilizes training data generated with language models where
questions grounded in document chunks are generated.
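
The data-generation step can be sketched as follows. `generate_questions` is a hypothetical callable, typically an LLM prompt, that returns questions answerable from a given chunk; the resulting query/positive pairs are the kind of input that contrastive fine-tuning of an embedding model usually expects. The training step itself is not shown.

```python
def build_training_pairs(chunks, generate_questions, per_chunk=2):
    """Create (query, positive passage) pairs for fine-tuning an embedding model."""
    pairs = []
    for chunk in chunks:
        # Each generated question is grounded in the chunk it was generated from.
        for question in generate_questions(chunk)[:per_chunk]:
            pairs.append({"query": question, "positive": chunk})
    return pairs
```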

Post Retrieval Stage

Information Compression
While the retriever is proficient in extracting relevant information from extensive
knowledge bases, managing the vast amount of information within the retrieved
documents poses a challenge. The retrieved information is compressed to extract
the most relevant points before passing it to the LLM.
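
A very simple form of compression is extractive: keep only the sentences that share words with the query. The sketch below assumes this word-overlap heuristic; real systems often use an LLM or a trained compressor instead, and the function name is hypothetical.

```python
import re

def compress(query, documents, max_sentences=3):
    """Keep the sentences from retrieved documents that overlap most with the query."""
    query_words = set(re.findall(r"\w+", query.lower()))
    scored = []
    for doc in documents:
        # Split on whitespace that follows sentence-ending punctuation.
        for sentence in re.split(r"(?<=[.!?])\s+", doc.strip()):
            if not sentence:
                continue
            overlap = len(query_words & set(re.findall(r"\w+", sentence.lower())))
            if overlap:
                scored.append((overlap, sentence))
    # Sorting is stable, so sentences with equal overlap keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [sentence for _, sentence in scored[:max_sentences]]
```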

Reranking
The re-ranking model plays a crucial role in optimizing the document set retrieved
by the retriever. The main idea is to rearrange document records to prioritize the
most relevant ones at the top, effectively managing the total number of
documents. This not only resolves challenges related to context window
expansion during retrieval but also improves efficiency and responsiveness.
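
Re-ranking can be sketched as a second scoring pass followed by truncation. The `overlap_score` function below is a toy assumption; in practice the scorer is usually a dedicated model such as a cross-encoder. Both function names are hypothetical.

```python
def overlap_score(query, document):
    """Toy relevance score: number of distinct words shared with the query."""
    return len(set(query.lower().split()) & set(document.lower().split()))

def rerank(query, documents, score=overlap_score, top_n=3):
    """Reorder retrieved documents by relevance and keep only the best `top_n`."""
    ranked = sorted(documents, key=lambda doc: score(query, doc), reverse=True)
    return ranked[:top_n]
```

Truncating to `top_n` after re-ranking is what keeps the prompt within the context window.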

Modular RAG
The state of the art (SOTA) in Retrieval Augmented Generation is a modular approach,
which allows components like search, memory, and reranking modules to be configured.

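[Figure: Modular RAG. Modules shown: Routing, Search, Predict, Retrieve, Rewrite, Rerank, Read, Demonstrate, Fusion, Memory. Naive RAG is drawn inside Advanced RAG, which is drawn inside Modular RAG.]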

Naive RAG is essentially a Retrieve -> Read approach which focusses on retrieving
information and comprehending it.
Advanced RAG extends the Retrieve -> Read approach with Rewrite and Rerank
components to improve relevance and groundedness.
Modular RAG takes everything a notch ahead by providing flexibility and adding
modules like Search, Routing, etc.

Naive, Advanced & Modular RAGs are not exclusive approaches but a
progression. Naive RAG is a special case of Advanced which, in turn, is a special
case of Modular RAG.
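
The "special case" relationship can be made concrete with a small pipeline builder in which every module other than Retrieve and Read is optional. This is a sketch under stated assumptions: `make_pipeline` and its parameters are hypothetical names, and each module is any callable supplied by the reader.

```python
def make_pipeline(retrieve, read, rewrite=None, rerank=None):
    """Compose RAG modules; omitting the optional ones yields Naive RAG."""
    def run(query):
        # Rewrite (optional): reformulate the query before searching.
        search_query = rewrite(query) if rewrite else query
        # Retrieve: always present.
        documents = retrieve(search_query)
        # Rerank (optional): reorder or trim the retrieved documents.
        if rerank:
            documents = rerank(search_query, documents)
        # Read: generate the response from the original query and the documents.
        return read(query, documents)
    return run
```

With only `retrieve` and `read` this is the Retrieve -> Read flow; adding `rewrite` and `rerank` gives the Advanced flow, and further modules could be added in the same way.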

Some RAG Modules
Search
The search module performs search across different data sources. It is
customised to each data source and aims to broaden the source data available
for better response generation.

Memory
This module leverages the parametric memory capabilities of the Large Language Model
(LLM) to guide retrieval. The module may use a retrieval-enhanced generator to
create an unbounded memory pool iteratively, combining the "original question"
and "dual question." By employing a retrieval-enhanced generative model that
improves itself using its own outputs, the text becomes more aligned with the
data distribution during the reasoning process.

Fusion
RAG-Fusion improves traditional search systems by overcoming their limitations
through a multi-query approach. It expands user queries into multiple diverse
perspectives using a Large Language Model (LLM). This strategy goes beyond capturing
explicit information and delves into uncovering deeper, transformative
knowledge. The fusion process involves conducting parallel vector searches for
both the original and expanded queries, intelligently re-ranking to optimize
results, and pairing the best outcomes with new queries.
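
The re-ranking step that merges the parallel result lists is commonly implemented with reciprocal rank fusion (RRF). The sketch below assumes RRF with the conventional constant k=60; the source does not name a specific fusion method, so this is one plausible choice, and the function name is hypothetical.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists; documents ranked high in many lists rise to the top."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            # Each appearance contributes 1 / (k + rank) to the document's score.
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

Here each inner list would be the search result for the original query or for one of the LLM-generated query variants.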

Extra Generation
Rather than directly fetching information from a data source, this module
employs the Large Language Model (LLM) to generate the required context. The content
produced by the LLM is more likely to contain pertinent information, addressing
issues related to repetition and irrelevant details in the retrieved content.

Task Adaptable Module

This module makes RAG adaptable to various downstream tasks allowing the
development of task-specific end-to-end retrievers with minimal examples,
demonstrating flexibility in handling different tasks.
