RAG stands for Retrieval-Augmented Generation. An LLM system using this method works in two main steps:
First, relevant data is retrieved from a data source.
Then the LLM uses this information to generate an answer to a query.
Why does RAG exist?
A Large Language Model only knows the information on which it was trained. This means that data published after the training cutoff, or data that was never part of the training set - such as internal company documents - is unknown to the model.
RAG was developed precisely for such cases: it supplements an LLM with up-to-date information and thus gives it access to areas of knowledge on which it was not directly trained.
What is RAG good for?
It can access current or specialized data, e.g. company knowledge, documents, websites, etc.
It tends to give more accurate answers because they are grounded in text retrieved from real sources rather than only in what the model memorized during training.
It is more flexible: you don't have to retrain the model when you have new data - you just have to add it to the search system.
Example with a slightly technical background
If we want to provide a language model (LLM) with information from a book that it does not know, simply inserting the entire book text into the prompt would be inefficient - the text would usually be too long and would exceed the model's token limit.
Instead, a process is used in which only the most relevant parts of the book are selected. To do this, the book text is first broken down into small sections (often called chunks), and each of these sections is converted into a numeric vector (embedding) using an embedding model.
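The splitting and embedding step can be sketched roughly as follows. This is a minimal illustration, not a real setup: `split_into_chunks` and `toy_embed` are hypothetical helper names, and `toy_embed` is only a hashed word count that stands in for a real embedding model, which would capture meaning far better.

```python
import hashlib
import math


def split_into_chunks(text: str, max_words: int = 10) -> list[str]:
    """Break the text into sections of at most max_words words."""
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def toy_embed(text: str, dim: int = 64) -> list[float]:
    """ASSUMPTION: toy stand-in for an embedding model.

    Each word is hashed into one of dim buckets and counted,
    then the vector is scaled to length 1.
    """
    vec = [0.0] * dim
    for word in text.lower().split():
        digest = hashlib.md5(word.encode("utf-8")).hexdigest()
        vec[int(digest, 16) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


# Made-up sample text standing in for the book.
book_text = (
    "The whale surfaced near the ship at dawn. "
    "The crew lowered the boats and rowed toward it. "
    "By evening the storm had scattered the fleet."
)

chunks = split_into_chunks(book_text, max_words=10)
chunk_vectors = [toy_embed(chunk) for chunk in chunks]
```

In practice the sections are usually stored together with their vectors, so that the text can be looked up again once a matching vector has been found.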
When a user asks a question, this question is also converted into a vector with the same embedding model. This vector is then compared with the vectors of the book sections - the more similar two vectors are, the more likely it is that the section and the question are about the same content.
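One common way to measure how similar two vectors are is cosine similarity. The sketch below assumes this measure and uses tiny made-up three-dimensional vectors; real embeddings typically have hundreds or thousands of dimensions, and the section names are purely illustrative.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between a and b (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


# Made-up vectors for illustration only.
question_vector = [0.9, 0.1, 0.0]
section_vectors = {
    "section about whales": [0.8, 0.2, 0.1],
    "section about ships": [0.1, 0.9, 0.3],
}

# Rank the sections from most to least similar to the question.
ranked = sorted(
    section_vectors,
    key=lambda name: cosine_similarity(question_vector, section_vectors[name]),
    reverse=True,
)
```

Here `ranked` lists the section names with the closest match first, so taking the first few entries yields the best matching sections.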
The best matching sections are then passed to the language model in text form as context, together with the question. This allows the LLM to answer the question as if it had access to the book - without the book actually being included in the model training.
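How the selected sections reach the model can be sketched as plain string assembly. The function name `build_prompt`, the prompt wording, and the sample sections are assumptions made for illustration; the actual call to an LLM is left out because it depends on the provider you use.

```python
def build_prompt(question: str, sections: list[str]) -> str:
    """Combine the retrieved sections and the question into one prompt."""
    context = "\n\n".join(
        f"[Section {i}] {text}" for i, text in enumerate(sections, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Made-up retrieval result for illustration.
best_sections = [
    "The whale surfaced near the ship at dawn.",
    "The crew lowered the boats and rowed toward it.",
]
prompt = build_prompt("When did the whale surface?", best_sections)
```

The resulting `prompt` string is what would be sent to the language model. Telling the model to rely only on the given context is a common design choice, since it makes it easier to trace an answer back to its source sections.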
I hope I was able to explain RAG to you in a simple way. If you have any questions, please contact me.
Best regards
Thomas from the Quantyverse
P.S.: Visit my website Quantyverse.ai for products, bonus content, blog posts and more