Retrieval-Augmented Generation (RAG) is a powerful technique for enhancing the knowledge and capabilities of Large Language Models (LLMs).
LLMs can be crudely split into two parts:
- The model itself, which consumes and processes text intelligently (Natural Language Processing).
- The data it consumes:
  a. The user's question.
  b. The reference data the model uses to answer that question. OpenAI, for example, has built a large repository of such data by scraping the internet.
Now, coming to the role of RAG:
RAG is the process of adding one or more specific documents to the "data" part of the LLM, and configuring the system so that it uses those documents as the key reference when answering a question.
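A minimal sketch of what that "key reference" configuration looks like in practice: the retrieved document is spliced into the prompt along with instructions to rely on it. The function name and wording below are illustrative, not any particular library's API.

```python
def build_augmented_prompt(question: str, reference: str) -> str:
    """Wrap a user question with retrieved reference material and
    instructions telling the model to treat that material as primary."""
    return (
        "Answer the question using ONLY the reference material below.\n\n"
        f"Reference material:\n{reference}\n\n"
        f"Question: {question}"
    )

prompt = build_augmented_prompt(
    "What is our refund policy?",
    "Refunds are available within 30 days of purchase.",
)
```

The resulting string is what actually gets sent to the model in place of the bare question.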
Imagine an LLM as a student preparing for an exam. The LLM has a vast amount of general knowledge stored in its “brain,” similar to what the student has learned through textbooks and lectures. However, for specific questions or tasks, the LLM might need additional information, just like the student might need to consult reference materials during the exam. This is where RAG comes in. It acts as a tutor or research assistant for the LLM.
When a user asks a question, an LLM set up with RAG does the following:
- Retrieval: The system analyses the prompt and identifies relevant pieces of information from the reference material and the conversation history.
- Generation: The retrieved information is incorporated into the prompt, giving the LLM the necessary context. The model then consumes and processes this augmented prompt just like any other prompt. But because the prompt was augmented, it carries specific instructions on how the LLM should process the data and tells it to give significant attention to the newly added context.
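The retrieval step above can be sketched end to end. Real systems use learned vector embeddings and a vector database; this toy version uses bag-of-words vectors and cosine similarity from the standard library, purely to show the shape of the step. The documents and function names are made up for illustration.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; production RAG uses learned embeddings.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(question: str, documents: list[str], k: int = 1) -> list[str]:
    # Rank documents by similarity to the question; keep the top k.
    q = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

docs = [
    "The warranty covers manufacturing defects for two years.",
    "Our office is open Monday through Friday.",
    "Shipping takes five business days within the country.",
]
top = retrieve("How long does shipping take?", docs)
# top[0] is the shipping document, which would then be spliced
# into the augmented prompt for the generation step.
```

In a production pipeline, the document collection would be chunked and embedded ahead of time, and the top-k chunks would be inserted into the prompt as described above.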
This is why prompt engineering is becoming more and more important.
Benefits of using RAG:
- More accurate and informative responses: By providing the LLM with additional context, RAG helps it generate more precise and relevant outputs, especially for tasks involving domain-specific knowledge or real-time data.
- Flexibility and adaptability: RAG can be used with various LLMs and data sources.
- Staying current: The ability to pull in new information from external sources lets RAG-powered LLMs stay up-to-date without retraining, unlike traditional LLMs, which are limited to the data they were trained on.