Real-time RAG with context and metadata

All LLMs share a common limitation: they are trained on data that may be outdated. As a result, when a user asks for information such as the current pricing of XYZ company, LLMs may either generate inaccurate (hallucinated) responses or state that this information is not available in their training data.

Langbase AI Memory agent is designed to overcome this limitation by providing real-time data to LLMs, enabling them to answer user questions accurately. Together with AI agent pipes, Langbase allows users to seamlessly build real-time Retrieval-Augmented Generation (RAG) systems, complete with contextual data and metadata.

In this guide, let's take a look at how we can build a real-time AI memory agent that uses your data to answer user queries.

Step #0

We will be building an AI memory agent using Langbase. So please go ahead and create an account on Langbase.

Step #1

Now let's set up a Node.js project. To do so, run the following command in your project terminal:

npm init -y

This will create a package.json file with basic information. We will use the dotenv package to read environment variables in our code, so let's install it:

npm install dotenv

Lastly, let's create an index.js file in our project directory. This file will contain all the code we write to build the AI memory agent.

Step #2

The next step is to generate a user API key, which you can do here. We will use this key to authenticate our Langbase API requests.

Now go ahead and create a .env file in your project directory and add your API key there.

LANGBASE_API_KEY=<REPLACE_WITH_LANGBASE_API_KEY>
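To make sure the key is being picked up, you can load dotenv and check the variable. A quick sanity check, assuming the .env file sits in your project root:

require('dotenv/config');

// Logs a confirmation without printing the key itself.
console.log(process.env.LANGBASE_API_KEY ? 'API key loaded' : 'API key missing');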

Step #3

Now we will create an AI memory agent on Langbase using the API. Let's copy the following Node.js code from the docs into our index.js file:

require('dotenv/config');

async function createNewMemory() {
  const url = 'https://api.langbase.com/v1/memory';

  const memory = {
    name: 'ai-memory-agent',
    description: 'This is an AI memory agent created by the Langbase API.',
  };

  const response = await fetch(url, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${process.env.LANGBASE_API_KEY}`,
    },
    body: JSON.stringify(memory),
  });

  const newMemory = await response.json();
  return newMemory;
}
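To try it out, you can call the function and log the API response. A minimal usage sketch:

createNewMemory()
  .then(memory => console.log('Memory created:', memory))
  .catch(error => console.error('Failed to create memory:', error));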

Step #4

AI memory agents can contain a wide range of data: text files, PDFs, code files, Markdown, and more. I will be uploading Markdown files to the AI memory we just created.

Let's upload two docs to our AI memory: the Gemini documentation pages on structured outputs and fine-tuning. I have already downloaded these pages as Markdown; you can download them from here.

I have placed the documents inside the docs directory of my project.
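For reference, here is one possible project layout. The first filename matches what the final code in Step #5 checks for; the second name is a placeholder for whatever you named the fine-tuning doc:

project/
├── docs/
│   ├── gemini-structured-outputs.md
│   └── <your-fine-tuning-doc>.md
├── .env
├── index.js
└── package.json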

Generate signed URL

To upload a document to memory, let's first generate a signed URL. We will later send a PUT request to this URL to upload the document.

async function getSignedUploadUrl({ name, href }) {
  const url = 'https://api.langbase.com/v1/memory/documents';

  const newDoc = {
    memoryName: 'ai-memory-agent',
    ownerLogin: 'saadirfan', // replace with your Langbase username
    fileName: name,
    meta: {
      href,
      source: 'Google',
    },
  };

  const response = await fetch(url, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${process.env.LANGBASE_API_KEY}`,
    },
    body: JSON.stringify(newDoc),
  });

  const res = await response.json();
  return res.signedUrl;
}
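For example, to get a signed URL for the structured outputs doc (a sketch, using the filename and href from the final code later in this guide):

(async () => {
  const signedUrl = await getSignedUploadUrl({
    name: 'gemini-structured-outputs.md',
    href: 'https://ai.google.dev/gemini-api/docs/structured-output',
  });
  console.log(signedUrl);
})();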

Upload docs

Now let's write a function that uploads a document to the AI memory via the generated signed URL.

const fs = require('fs');

async function uploadDocument(signedUrl, filePath) {
  const file = fs.readFileSync(filePath);

  const response = await fetch(signedUrl, {
    method: 'PUT',
    headers: {
      'Content-Type': 'text/markdown',
    },
    body: file,
  });

  return response;
}
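Putting the two together, uploading a single document looks like this (a sketch, assuming the file lives in the docs directory):

const path = require('path');

(async () => {
  const signedUrl = await getSignedUploadUrl({
    name: 'gemini-structured-outputs.md',
    href: 'https://ai.google.dev/gemini-api/docs/structured-output',
  });

  const response = await uploadDocument(
    signedUrl,
    path.join(__dirname, 'docs', 'gemini-structured-outputs.md')
  );
  console.log('Upload status:', response.status);
})();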

Step #5

Lastly, let's query our AI memory to retrieve chunks that may contain the answer.

const structuredOutputPrompt = `What kind of data gemini generates by default?`;
const fineTuningPrompt = `How does fine-tuning work in gemini?`;

async function retrieveSimilarChunks(query) {
  const url = 'https://api.langbase.com/v1/memory/retrieve';

  const response = await fetch(url, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${process.env.LANGBASE_API_KEY}`,
    },
    body: JSON.stringify({
      query,
      memory: [{ name: 'ai-memory-agent' }],
      topK: 2,
    }),
  });

  const result = await response.json();
  return result;
}
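On its own, retrieving chunks for one of the queries is a matter of calling the function inside an async wrapper:

(async () => {
  const result = await retrieveSimilarChunks(structuredOutputPrompt);
  console.log(JSON.stringify(result, null, 2));
})();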

Here is what the final code will look like:

const fs = require('fs');
const path = require('path');
require('dotenv/config');

async function createNewMemory() {
  const url = 'https://api.langbase.com/v1/memory';

  const memory = {
    name: 'ai-memory-agent',
    description: 'This is an AI memory agent created by the Langbase API.',
  };

  const response = await fetch(url, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${process.env.LANGBASE_API_KEY}`,
    },
    body: JSON.stringify(memory),
  });

  const newMemory = await response.json();
  return newMemory;
}

async function getSignedUploadUrl({ name, href }) {
  const url = 'https://api.langbase.com/v1/memory/documents';

  const newDoc = {
    memoryName: 'ai-memory-agent',
    ownerLogin: 'saadirfan', // replace with your Langbase username
    fileName: name,
    meta: {
      href,
      source: 'Google',
    },
  };

  const response = await fetch(url, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${process.env.LANGBASE_API_KEY}`,
    },
    body: JSON.stringify(newDoc),
  });

  const res = await response.json();
  return res.signedUrl;
}

async function uploadDocument(signedUrl, filePath) {
  const file = fs.readFileSync(filePath);

  const response = await fetch(signedUrl, {
    method: 'PUT',
    headers: {
      'Content-Type': 'text/markdown',
    },
    body: file,
  });

  return response;
}

async function retrieveSimilarChunks(query) {
  const url = 'https://api.langbase.com/v1/memory/retrieve';

  const response = await fetch(url, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${process.env.LANGBASE_API_KEY}`,
    },
    body: JSON.stringify({
      query,
      memory: [{ name: 'ai-memory-agent' }],
      topK: 2,
    }),
  });

  const result = await response.json();
  return result;
}

const structuredOutputPrompt = `What kind of data gemini generates by default?`;
const fineTuningPrompt = `How does fine-tuning work in gemini?`;

(async function () {
  // Create the memory, then upload every file in the docs directory.
  const newMemory = await createNewMemory();

  const src = path.join(__dirname, 'docs');
  const files = fs.readdirSync(src);

  for (const file of files) {
    const filePath = path.join(src, file);

    let href = '';
    if (file === 'gemini-structured-outputs.md') {
      href = 'https://ai.google.dev/gemini-api/docs/structured-output';
    } else {
      href = 'https://ai.google.dev/gemini-api/docs/model-tuning';
    }

    const signedUrl = await getSignedUploadUrl({ name: file, href });
    await uploadDocument(signedUrl, filePath);
  }

  // Retrieve chunks for both queries and print them.
  const structuredOutputResult = await retrieveSimilarChunks(structuredOutputPrompt);
  console.log(JSON.stringify(structuredOutputResult, null, 2));

  const fineTuningResult = await retrieveSimilarChunks(fineTuningPrompt);
  console.log(JSON.stringify(fineTuningResult, null, 2));
})();

Step #6

Finally, let's run our index.js file. It will create a memory, upload the Markdown documents along with their metadata, and then retrieve chunks from the memory for the two user queries.

node index.js

Here is what the response will look like for the user query "What kind of data gemini generates by default?":

[ { "text": "Title: Generate structured output with the Gemini API\n\nURL Source: https://ai.google.dev/gemini-api/docs/structured-output?lang=node\n\nMarkdown Content:\nGemini generates unstructured text by default, but some applications require structured text. For these use cases, you can constrain Gemini to respond with JSON, a structured data format suitable for automated processing. You can also constrain the model to respond with one of the options specified in an enum.\n\nHere are a few use cases that might require structured output from the model:\n\n* Build a database of companies by pulling company information out of newspaper articles.\n* Pull standardized information out of resumes.\n* Extract ingredients from recipes and display a link to a grocery website for each ingredient.", "similarity": 0.5605214238166809, "meta": { "docName": "gemini-structured-outputs.md", "href": "https://ai.google.dev/gemini-api/docs/structured-output", "source": "Google" } }, { "text": "* Build a database of companies by pulling company information out of newspaper articles.\n* Pull standardized information out of resumes.\n* Extract ingredients from recipes and display a link to a grocery website for each ingredient.\n\nIn your prompt, you can ask Gemini to produce JSON-formatted output, but note that the model is not guaranteed to produce JSON and nothing but JSON. For a more deterministic response, you can pass a specific JSON schema in a [`responseSchema`](https://ai.google.dev/api/rest/v1beta/GenerationConfig#FIELDS.response_schema) field so that Gemini always responds with an expected structure.", "similarity": 0.5243381261825562, "meta": { "docName": "gemini-structured-outputs.md", "href": "https://ai.google.dev/gemini-api/docs/structured-output", "source": "Google" } } ]

As you can see, we get text chunks along with their metadata and similarity scores. This data can then be sent to the LLM along with the user query, and the LLM will use it to generate the response.
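Here is a minimal sketch of that last step. The buildPrompt helper is hypothetical, not part of the Langbase API; it simply combines the retrieved chunks (whose shape matches the response above) with the user query into a single grounded prompt you can send to any LLM:

// Hypothetical helper (not part of the Langbase API): combine retrieved
// chunks and the user query into one grounded prompt.
function buildPrompt(chunks, query) {
  const context = chunks
    .map(chunk => `Source: ${chunk.meta.href}\n${chunk.text}`)
    .join('\n\n---\n\n');

  return `Answer the question using only the context below.\n\nContext:\n${context}\n\nQuestion: ${query}`;
}

// Usage: const prompt = buildPrompt(structuredOutputResult, structuredOutputPrompt);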

That's all for this guide. You can now use Langbase AI memory agents to build AI apps.

We have also written a guide on building multi-agent AI support, which you can find here.