What is Memory on ⌘ Langbase?

Memory is a managed search engine, offered as an API for developers. Our long-term memory solution can acquire, process, retain, and later retrieve information. It combines vector storage, Retrieval-Augmented Generation (RAG), and internet access to help you build powerful AI features and products.

Imagine a private version of Google search that can search your own data or the internet to help any LLM answer queries accurately.


Core functionality

  • Ingest: Upload documents, files, and web content into memory
  • Process: Automatically extract text, generate embeddings, and build a semantic index
  • Query: Recall and retrieve relevant context using natural language queries (see the sketch after this list)
  • Accuracy: Reduce hallucinations by grounding responses in accurate, context-aware information
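
To make the flow concrete, here is a minimal, self-contained sketch of the ingest, process, and query loop. The in-memory array and the toy `embed` helper are hypothetical stand-ins for Memory's managed vector storage and a real embedding model; this is illustrative only, not the Langbase API.

```ts
// Minimal sketch of the ingest -> process -> query flow.
// The in-memory array and toy `embed` helper are stand-ins, not the Langbase API.
type Chunk = { text: string; vector: number[] };

const index: Chunk[] = [];

// Hypothetical embedding helper; in practice this would call an embedding model.
async function embed(text: string): Promise<number[]> {
  return Array.from(text).map((char) => char.charCodeAt(0) / 255); // toy stand-in
}

// Cosine similarity between two vectors (compared over their shared length).
function cosine(a: number[], b: number[]): number {
  const n = Math.min(a.length, b.length);
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < n; i++) {
    dot += a[i] * b[i];
    normA += a[i] ** 2;
    normB += b[i] ** 2;
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom > 0 ? dot / denom : 0;
}

// Ingest + Process: split a document into chunks, embed each, add to the index.
async function ingest(document: string): Promise<void> {
  for (const text of document.split("\n\n")) {
    index.push({ text, vector: await embed(text) });
  }
}

// Query: embed the natural-language query and return the closest chunks.
async function query(q: string, topK = 3): Promise<string[]> {
  const queryVector = await embed(q);
  return [...index]
    .sort((a, b) => cosine(b.vector, queryVector) - cosine(a.vector, queryVector))
    .slice(0, topK)
    .map((chunk) => chunk.text);
}
```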

Key features

  • Semantic understanding: Go beyond keyword matching with context-aware search
  • Vector storage: Efficient hybrid similarity search for large-scale data (a minimal scoring sketch follows this list)
  • Semantic RAG: Enhance LLM outputs with retrieved information from memory
  • Internet access: Augment your private data with up-to-date web content
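
Hybrid similarity search blends a lexical (keyword) score with a semantic (vector) score so that both exact terms and meaning count. The sketch below is a hedged illustration; the equal 0.5/0.5 weighting and the overlap-based keyword score are assumptions for clarity, not Langbase's actual ranking formula.

```ts
// Hedged sketch of hybrid scoring: blend keyword overlap with vector similarity.
// The 0.5/0.5 weights are an illustrative assumption, not Langbase's formula.
function keywordScore(query: string, text: string): number {
  const q = new Set(query.toLowerCase().split(/\W+/).filter(Boolean));
  const t = new Set(text.toLowerCase().split(/\W+/).filter(Boolean));
  const overlap = [...q].filter((word) => t.has(word)).length;
  return q.size > 0 ? overlap / q.size : 0;
}

function hybridScore(semanticScore: number, query: string, text: string): number {
  // semanticScore: cosine similarity between the query and chunk embeddings (0..1).
  return 0.5 * semanticScore + 0.5 * keywordScore(query, text);
}
```
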
Why?

All Large Language Models (LLMs) share one limitation: hallucination.

LLMs don't know anything about your private data. They are trained on public data and they hallucinate when you ask about things they don't have the answers to.

This limitation makes it difficult for LLMs to provide accurate responses to your queries. ⌘ Langbase long-term memory solves this problem by providing a way to attach your private data to any LLM.


In a Retrieval-Augmented Generation (RAG) system, Memory is used with Pipe to retrieve relevant data for queries.

The process involves creating query embeddings, retrieving matching information from Memory, augmenting the query with this data, and using it to generate accurate, context-aware responses. This integration enables precise answers and supports use cases like document summarization, question answering, and more.


Semantic Retrieval-Augmented Generation (sRAG)

In a semantic RAG system, when an LLM is queried, it is also given information from memory that is relevant to the query. This additional context helps the LLM provide more accurate and relevant responses.

A RAG system performs the following steps (a minimal sketch follows the list):

  1. Query: User queries the LLM through Pipe. Embeddings are generated for the query.
  2. Retrieval: Pipe retrieves query-relevant information from the memory through similarity search.
  3. Augmentation: The query is augmented with the retrieved information.
  4. Generation: The augmented query is fed to the LLM to generate a response.
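
These four steps map directly onto code. The outline below is a hedged sketch: `retrieveFromMemory` and `runPipe` are hypothetical stand-ins for Memory retrieval and a Pipe call (see the API reference for the real endpoints and parameters), and the prompt template is an illustrative assumption.

```ts
// Hedged outline of the sRAG steps. `retrieveFromMemory` and `runPipe`
// are hypothetical stand-ins; consult the API reference for the real calls.
async function retrieveFromMemory(query: string, topK: number): Promise<string[]> {
  // 1–2. Query + Retrieval: embed the query and run a similarity search
  //      against Memory (stand-in implementation).
  return []; // placeholder
}

async function runPipe(prompt: string): Promise<string> {
  // 4. Generation: send the augmented prompt to the LLM through a Pipe
  //    (stand-in implementation).
  return ""; // placeholder
}

async function answer(userQuery: string): Promise<string> {
  // 2. Retrieval: fetch query-relevant chunks from Memory.
  const chunks = await retrieveFromMemory(userQuery, 3);

  // 3. Augmentation: combine the retrieved context with the user query.
  //    This prompt template is an illustrative assumption.
  const prompt = [
    "Answer using only the context below.",
    "Context:",
    ...chunks.map((chunk, i) => `${i + 1}. ${chunk}`),
    `Question: ${userQuery}`,
  ].join("\n");

  // 4. Generation: feed the augmented prompt to the LLM for a response.
  return runPipe(prompt);
}
```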

Next steps

Time to build. Check out the quickstart overview example or explore the API reference.