
Rethinking RAG: A fresh approach to leveraging context for AI-powered solutions

Guest contribution by
Francisco Capa, Data Engineer at COCUS

By rethinking how we apply Retrieval-Augmented Generation (RAG), businesses can unlock new levels of AI integration that are both practical and powerful. RAG holds significant potential for simplicity, accuracy and cost efficiency in AI-powered solutions, and it maximizes the value of existing data. Here you can find out what attractive options RAG offers for a wide range of applications.

What is Retrieval-Augmented Generation (RAG)?

Retrieval-Augmented Generation is a concept that many of us have encountered without even realizing it. If you have ever used a Large Language Model (LLM) like ChatGPT or Gemini and provided it with specific context before asking a question, you’ve used a form of RAG. At its core, RAG involves supplying context or additional information to an LLM to guide and refine its responses.
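In code, this pattern can be as simple as prepending the supplied context to the prompt. Below is a minimal sketch assuming the OpenAI Python client; the model name and the way the context is passed are illustrative choices, not a fixed recipe.

```python
# Minimal RAG-as-context sketch, assuming the OpenAI Python client
# (pip install openai) with an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

def answer_with_context(question: str, context: str) -> str:
    """Supply context alongside the question so the model can adapt its answer."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # illustrative model choice
        messages=[
            {"role": "system", "content": f"Consider this context when answering:\n{context}"},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

# The same question yields different answers depending on the context:
print(answer_with_context("Write a small paragraph about quantum physics.",
                          "The audience are children."))
```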

As an example, consider the following simple prompt, using ChatGPT (GPT-3.5) as the LLM to test this concept: “Write a small paragraph about quantum physics.” For simplicity, the answers are limited to around 50 words. The model gives us a standard, informative response, as expected.

“Quantum physics is the branch of science that studies the smallest particles in the universe, like atoms and photons. It reveals how these particles behave in surprising ways, differing from the laws of classical physics. Quantum mechanics underlies the fundamental principles of how particles move, interact, and give rise to the properties of matter, highlighting phenomena such as superposition and entanglement.”

But what if the context changes? Let’s now ask the same question with the following context: “The audience are children”. The LLM’s response is now slightly different, adapting the answer to the new context while trying to keep the information accurate and relevant.

“Quantum physics is like a magical rulebook for the tiniest things in the universe, such as atoms and light particles. It tells us that these tiny bits can do surprising tricks, like being in two places at once or talking to each other instantly over long distances, making the world a mysterious place!”

This adaptability highlights the power of RAG, and it’s not limited to text. The same principle works with pretty much every type of LLM, from text-to-text to text-to-code and even text-to-image.

As an example, imagine using an AI image generator like DALL-E: you upload a picture of your dog and ask the model to place it in a different scenario. The resulting image may not be perfect, but the model will still recognize the breed and colour, because your photo provides context for the image it generates. Context, in other words, can be supplied in a multitude of ways.

[Images: the uploaded dog photo and the AI-generated version placing the dog in a new scenario]

The Real-World Benefits of RAG in Business

The concept of RAG extends beyond playful interactions with AI. Applied to AI in companies, it can unlock significant advantages in simplicity, accuracy and cost efficiency while maximizing the value of existing data.

Retraining or fine-tuning the base model remains a valid approach depending on the use case, and neither is mutually exclusive with RAG. In fact, combining these strategies, or choosing the most appropriate one for the use case at hand, can lead to optimal results.

A Practical Example for AI in Companies: Customer Service

To illustrate the practical application of RAG, let’s consider a customer service scenario. The example is deliberately simple, but the same approach works with much more complex data models. Imagine a customer asking a language model when their next payment is due (whether through a simple text chat or a more complex voice-to-text system). How would the model know this?

One option is to pre-train the model on every customer’s contract information. However, this would require frequent retraining and could quickly become expensive. A more efficient approach is to use RAG: provide the model with the relevant contract details stored in an external source (e.g., a SQL database) on the fly, enabling it to answer the question accurately. This approach presents its own challenge, however: how do we determine which specific data the model needs?
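Before turning to that question, here is what the on-the-fly retrieval itself might look like once the relevant record is known. This is a hedged sketch: the database file, table and column names are hypothetical examples, not a prescribed schema.

```python
# Sketch of on-the-fly retrieval from a SQL source; "crm.db" and the
# "contracts" table/columns are hypothetical examples.
import sqlite3
from openai import OpenAI

client = OpenAI()

def answer_payment_question(customer_id: int, question: str) -> str:
    # 1. Retrieve: pull only the rows relevant to this customer.
    conn = sqlite3.connect("crm.db")
    row = conn.execute(
        "SELECT next_payment_due, amount FROM contracts WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
    conn.close()

    # 2. Augment: turn the row into plain-text context for the model.
    context = f"Customer {customer_id}: next payment due {row[0]}, amount {row[1]}."

    # 3. Generate: the LLM phrases the answer using that context.
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": f"Answer using only this data:\n{context}"},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content
```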

A New Approach: Layered RAG

The typical solution to this problem is to use semantic search to filter the data before feeding it to the model. Both the data points and the query are converted into embeddings, and a semantic search then identifies the most relevant information. While effective, this method can become a bottleneck, especially when dealing with structured data that is already easy to filter because of its format.
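For reference, here is roughly what that classic embedding step looks like, assuming the OpenAI embeddings endpoint; the documents are toy data, and in practice a vector database would replace the brute-force search shown here.

```python
# Classic semantic-search retrieval: embed documents and the query,
# then rank documents by similarity to find the context to supply.
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(texts: list[str]) -> np.ndarray:
    result = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([item.embedding for item in result.data])

documents = [
    "Contracts are billed on the first of each month.",
    "Support is available Monday to Friday, 9:00-17:00.",
    "Payments can be made by card or bank transfer.",
]
doc_vectors = embed(documents)
query_vector = embed(["When is my next payment due?"])[0]

# OpenAI embeddings are unit-length, so a dot product ranks by cosine similarity.
scores = doc_vectors @ query_vector
best_snippet = documents[int(np.argmax(scores))]  # context to feed the LLM
```

The cost here is one embedding call per document plus a similarity search at query time, which is exactly the overhead the layered approach below avoids for structured data.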

Here’s where a different approach comes into play: a layered RAG strategy. Instead of relying on semantic search, we can eliminate that step by directly leveraging the structured schema of our data. By providing the schema as context to the LLM, we can ask it to identify the data points needed to answer the query. Once they are identified, we simply query our external data sources, supply the relevant context, and let the model generate the final response. In a way, we are using RAG to feed the next level of RAG.

This is only an option when the data structure is known before the question is asked; it would not work if the context consists of images or documents, for example, because the data structure is unknown at that point.

Revisiting the Customer Service Example: Benefits of RAG for AI in Companies

Let’s revisit the customer service example. Suppose a customer asks about their next payment due date. Instead of performing a semantic search on a vast amount of unstructured data, we could explain to the model that our database contains several structured tables.

The model could then determine that it needs information from the “contract” and “transaction” tables. We query these tables, provide the relevant data points, and the model responds accordingly. This approach is straightforward, cost-effective, and well-suited for business cases involving structured data.
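Putting the two layers together, a sketch might look like the following. The schema prompt, table names and database file are illustrative, and a production system would validate the model’s JSON output more defensively.

```python
# Layered RAG sketch: layer 1 uses the schema as context to pick tables,
# layer 2 feeds the queried rows back as context. All names are illustrative.
import json
import sqlite3
from openai import OpenAI

client = OpenAI()

SCHEMA = """Tables:
  contract(customer_id, start_date, next_payment_due, monthly_amount)
  transaction(customer_id, date, amount, status)"""
ALLOWED_TABLES = {"contract", "transaction"}

def pick_tables(question: str) -> list[str]:
    """Layer 1: ask the model which tables it needs, given only the schema."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system",
             "content": f"{SCHEMA}\nReply only with a JSON list of the table "
                        "names needed to answer the user's question."},
            {"role": "user", "content": question},
        ],
    )
    tables = json.loads(response.choices[0].message.content)
    return [t for t in tables if t in ALLOWED_TABLES]  # never trust raw output

def answer(question: str, customer_id: int) -> str:
    """Layer 2: query only the chosen tables and supply the rows as context."""
    conn = sqlite3.connect("crm.db")
    rows = {
        table: conn.execute(  # table name is whitelisted above
            f"SELECT * FROM {table} WHERE customer_id = ?", (customer_id,)
        ).fetchall()
        for table in pick_tables(question)
    }
    conn.close()
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": f"Customer data:\n{rows}"},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content
```

Note that no embedding model or vector store is involved: the only retrieval machinery is the schema prompt and two ordinary SQL queries.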

Successful RAG Application

RAG offers significant potential for integrating LLMs into business processes to improve efficiency and accuracy in AI-powered solutions. We support your organization in making the implementation efficient, secure and cost-effective, ensuring that the full potential of your data is realized.

Connecting Data – Empowering Innovation
