Embed

Embed, an AI Primitive by Langbase, allows you to convert text into vector embeddings. This is particularly useful for semantic search, text similarity comparisons, and other NLP tasks.

Embedding text into vectors enables you to perform complex queries and analyses that go beyond simple keyword matching. It allows you to capture the semantic meaning of the text, making it easier to find relevant information based on context rather than just keywords.
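To illustrate how semantic comparison works, two embedding vectors are typically compared with cosine similarity: vectors pointing in similar directions score close to 1, unrelated ones close to 0. Here is a minimal sketch in plain TypeScript (independent of the Langbase SDK; the toy 3-dimensional vectors stand in for real embeddings, which have hundreds or thousands of dimensions):

```typescript
// Cosine similarity between two vectors of equal length.
// Returns 1 for identical direction, 0 for orthogonal (unrelated) vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy 3-dimensional vectors; real embeddings are much higher-dimensional.
console.log(cosineSimilarity([1, 0, 0], [1, 0, 0])); // identical direction → 1
console.log(cosineSimilarity([1, 0, 0], [0, 1, 0])); // orthogonal → 0
```

Texts with related meaning produce embeddings with high cosine similarity even when they share no keywords, which is what makes context-based retrieval possible.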


Quickstart: Converting Text to Vector Embeddings


Let's get started

In this guide, we'll use the Langbase SDK to interact with the Embed API:


Step #1: Generate a Langbase API key

Every request you send to Langbase needs an API key. This guide assumes you already have one. If not, please check the instructions below.

Embedding Models API Keys

In your Langbase API key settings, add the LLM API keys for the embedding models you want to use.


Step #2: Set up your project

Create a new directory for your project and navigate to it.

Project setup

mkdir text-embedder && cd text-embedder

Initialize the project

Create a new Node.js project.

Initialize project

npm init -y

Install dependencies

You will use the Langbase SDK to work with Embed and dotenv to manage environment variables.

Install dependencies

npm i langbase dotenv

Create an env file

Create a .env file in the root of your project and add your Langbase API key:

.env

LANGBASE_API_KEY=your_api_key_here

Step #3: Create an embedding generator

Create a file named generate-embeddings.ts in your project directory. It demonstrates how to generate embeddings for a set of text chunks:

generate-embeddings.ts

import 'dotenv/config';
import { Langbase } from 'langbase';

const langbase = new Langbase({
  apiKey: process.env.LANGBASE_API_KEY!,
});

async function main() {
  // Define some text chunks to embed
  const textChunks = [
    "Artificial intelligence is transforming how we interact with technology",
    "Machine learning algorithms can identify patterns in large datasets",
    "Natural language processing helps computers understand human language",
    "Vector embeddings represent text as points in a high-dimensional space"
  ];

  try {
    // Generate embeddings
    const embeddings = await langbase.embed({
      chunks: textChunks,
      // Optional: specify the embedding model
      // embeddingModel: "openai:text-embedding-3-large"
    });

    console.log("Number of embeddings generated:", embeddings.length);
    console.log("First embedding (showing first 5 dimensions):", embeddings[0].slice(0, 5));
    console.log("Embedding dimensions:", embeddings[0].length);

    // Log the full first embedding vector
    console.log("\nComplete first embedding vector:");
    console.log(embeddings[0]);
  } catch (error) {
    console.error("Error generating embeddings:", error);
  }
}

main();

Step #4: Run the script

Run the script to generate embeddings for your text chunks:

Run the script

npx tsx generate-embeddings.ts

You should see output showing the number of embeddings generated, the first five dimensions of the first embedding, its dimensionality, and the complete first embedding vector:

Number of embeddings generated: 4
First embedding (showing first 5 dimensions): [-0.023, 0.128, -0.194, 0.067, -0.022]
Embedding dimensions: 1536

Complete first embedding vector:
[-0.023, 0.128, -0.194, 0.067, -0.022, ... ]

Next Steps

  • Build a semantic search system using your embeddings
  • Combine with other Langbase primitives like Chunk to process documents before embedding
  • Create a RAG (Retrieval-Augmented Generation) system using your embedded documents
  • Join our Discord community for feedback, requests, and support
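As a starting point for the semantic-search idea above, you can rank embedded chunks against an embedded query with cosine similarity. The sketch below assumes you already have the chunk embeddings from the script in this guide and a query embedding from a separate langbase.embed call; rankBySimilarity is a hypothetical helper for illustration, not part of the SDK:

```typescript
// Rank text chunks by cosine similarity between their embeddings
// and a query embedding. All vectors must have the same dimension.
function rankBySimilarity(
  queryEmbedding: number[],
  chunks: { text: string; embedding: number[] }[]
): { text: string; score: number }[] {
  const dot = (a: number[], b: number[]) =>
    a.reduce((sum, v, i) => sum + v * b[i], 0);
  const norm = (a: number[]) => Math.sqrt(dot(a, a));

  return chunks
    .map(({ text, embedding }) => ({
      text,
      score:
        dot(queryEmbedding, embedding) /
        (norm(queryEmbedding) * norm(embedding)),
    }))
    .sort((a, b) => b.score - a.score); // highest similarity first
}

// Toy example with 2-dimensional vectors standing in for real embeddings:
const results = rankBySimilarity([1, 0], [
  { text: 'about AI', embedding: [0.9, 0.1] },
  { text: 'about cooking', embedding: [0.1, 0.9] },
]);
console.log(results[0].text); // the most similar chunk comes first
```

In a real system you would embed the query with the same embedding model used for the chunks, since vectors from different models are not comparable.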