What is RAG?

RAG stands for Retrieval-Augmented Generation, a technique that pairs information retrieval with a generative model so that answers are grounded in the content of a source document rather than in the model's training data alone.

How RAG Works

In this application, RAG works through a 7-step process:

1️⃣ Document Loading

The PDF document is loaded and parsed using PyPDFLoader to extract all text content from the document pages.

2️⃣ Text Chunking

The extracted text is split into chunks of 12,000 characters, with a 1,500-character overlap between consecutive chunks so that context is preserved across chunk boundaries.
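As a rough illustration of the chunking step, here is a minimal sketch of a fixed-width sliding window using the sizes quoted above. The function name `chunk_text` is hypothetical, and this page does not say which splitter the application actually uses; a production splitter may also prefer to break on paragraph or sentence boundaries, which this sketch does not attempt.

```python
def chunk_text(text: str, chunk_size: int = 12_000, overlap: int = 1_500) -> list[str]:
    """Split text into fixed-width chunks that share `overlap` characters.

    Hypothetical helper for illustration only; not the application's code.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the window has reached the end of the text, so we do
        # not emit a trailing chunk made only of already-covered overlap.
        if start + chunk_size >= len(text):
            break
    return chunks
```

With these defaults the window advances 10,500 characters at a time, so a 30,000-character text yields three chunks starting at offsets 0, 10,500, and 21,000.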

3️⃣ Embedding Generation

Each text chunk is converted by Google's Gemini model into a dense vector representation (embedding) that captures its semantic meaning.

4️⃣ Vector Database Storage

These embeddings are stored in a vector database, allowing for efficient similarity search based on semantic content rather than exact keyword matching.

5️⃣ Query Processing

When you ask a question, your query is converted into an embedding and compared against all stored document embeddings to find the most relevant chunks.

6️⃣ Retrieval

The system retrieves the top-k most semantically similar chunks (top 4 in this implementation) that contain information related to your query.
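The query and retrieval steps can be sketched as a brute-force nearest-neighbour search. This is an illustration under stated assumptions: the page does not name the vector database or the similarity metric, so cosine similarity is assumed here, the names `cosine_similarity` and `top_k` are hypothetical, and the tiny two-dimensional vectors stand in for real embeddings. A real vector database would typically use an index instead of scanning every entry.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float],
          store: list[tuple[str, list[float]]],
          k: int = 4) -> list[tuple[float, str]]:
    """Return the k (score, chunk_text) pairs most similar to the query.

    `store` is a plain list of (chunk_text, embedding) pairs standing in
    for the vector database (an assumption for this sketch).
    """
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # best match first
    return scored[:k]


# Toy usage: made-up 2-D "embeddings" for five chunks.
toy_store = [
    ("chunk a", [1.0, 0.0]),
    ("chunk b", [0.0, 1.0]),
    ("chunk c", [1.0, 1.0]),
    ("chunk d", [-1.0, 0.0]),
    ("chunk e", [0.9, 0.1]),
]
results = top_k([1.0, 0.0], toy_store, k=4)
```

The default `k=4` mirrors the top-4 setting described above; the chunk pointing in the opposite direction to the query ("chunk d") is the one left out.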

7️⃣ Context-Aware Generation

The retrieved chunks are combined with your question and sent to the Gemini model, which generates an answer grounded in the document's context.
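One plausible way to combine the retrieved chunks with the question is a simple prompt template, sketched below. The function name `build_prompt` and the prompt wording are assumptions for illustration; the page does not show the application's actual prompt, and the call to the Gemini model itself is deliberately left out.

```python
def build_prompt(question: str, chunks: list[str]) -> str:
    """Assemble a single prompt string from retrieved chunks and a question.

    Hypothetical template for illustration; the real prompt may differ.
    """
    # Label each chunk so the model (and a human reader) can tell them apart.
    context = "\n\n".join(
        f"[Chunk {i}]\n{chunk.strip()}" for i, chunk in enumerate(chunks, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question.strip()}\n"
        "Answer:"
    )
```

Instructing the model to rely only on the supplied context, and to say when the answer is missing, is a common way to reduce unsupported answers, though it does not guarantee correctness.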

Benefits of RAG

Technology Stack

This application is built using the following technologies:

Implementation Details

In this application:

Meet the Team

Sachethan V

AIML, Global Academy of Technology

Led the development of the RAG model architecture and deployed the application on Google Cloud Run, ensuring scalable and efficient document processing.

Harshitha C

AIML, Global Academy of Technology

Designed and implemented the interactive user interface, seamlessly integrating frontend components with backend API calls for a smooth user experience.