Embed
Embed, an AI Primitive by Langbase, allows you to convert text into vector embeddings. This is particularly useful for semantic search, text similarity comparisons, and other NLP tasks.
Embedding text as vectors enables complex queries and analyses that go beyond simple keyword matching: the vectors capture the semantic meaning of the text, making it easier to find relevant information based on context rather than exact keywords.
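To make "semantic meaning" concrete: similarity between two embedding vectors is commonly measured with cosine similarity. Here is a minimal sketch in TypeScript; the three-dimensional vectors are made-up toy values for illustration, not real embeddings.

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|), ranging from -1 to 1.
// Higher values mean the two texts are semantically closer.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy 3-dimensional "embeddings" (real ones have hundreds or thousands of dimensions)
const cat = [0.9, 0.1, 0.0];
const kitten = [0.85, 0.15, 0.05];
const car = [0.0, 0.2, 0.95];

console.log(cosineSimilarity(cat, kitten)); // close to 1: similar meaning
console.log(cosineSimilarity(cat, car));    // close to 0: unrelated
```

Because two texts with similar meaning produce nearby vectors, comparing embeddings like this is what powers semantic search.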
Quickstart: Converting Text to Vector Embeddings
Let's get started
In this guide, we'll use the Langbase SDK to interact with the Embed API:
Step #1: Generate a Langbase API key
Every request you send to Langbase needs an API key. This guide assumes you already have one. If not, please check the instructions below.
Step #2: Set up your project
Create a new directory for your project and navigate to it.
Project setup
mkdir text-embedder && cd text-embedder
Initialize the project
Create a new Node.js project.
Initialize project
npm init -y
Install dependencies
You will use the Langbase SDK to work with Embed, and dotenv to manage environment variables.
Install dependencies
npm i langbase dotenv
Create an env file
Create a .env file in the root of your project and add your Langbase API key:
.env
LANGBASE_API_KEY=your_api_key_here
Step #3: Create an embedding generator
Let's create a file named generate-embeddings.ts in your project directory that demonstrates how to generate embeddings for text chunks:
generate-embeddings.ts
import 'dotenv/config';
import { Langbase } from 'langbase';

const langbase = new Langbase({
  apiKey: process.env.LANGBASE_API_KEY!,
});

async function main() {
  // Define some text chunks to embed
  const textChunks = [
    "Artificial intelligence is transforming how we interact with technology",
    "Machine learning algorithms can identify patterns in large datasets",
    "Natural language processing helps computers understand human language",
    "Vector embeddings represent text as points in a high-dimensional space"
  ];

  try {
    // Generate embeddings
    const embeddings = await langbase.embed({
      chunks: textChunks,
      // Optional: specify the embedding model
      // embeddingModel: "openai:text-embedding-3-large"
    });

    console.log("Number of embeddings generated:", embeddings.length);
    console.log("First embedding (showing first 5 dimensions):", embeddings[0].slice(0, 5));
    console.log("Embedding dimensions:", embeddings[0].length);

    // Log the full first embedding vector
    console.log("\nComplete first embedding vector:");
    console.log(embeddings[0]);
  } catch (error) {
    console.error("Error generating embeddings:", error);
  }
}

main();
Step #4: Run the script
Run the script to generate embeddings for your text chunks:
Run the script
npx tsx generate-embeddings.ts
You should see output showing the number of embeddings generated, the first five dimensions of the first embedding vector, its dimensionality, and the complete vector:
Number of embeddings generated: 4
First embedding (showing first 5 dimensions): [-0.023, 0.128, -0.194, 0.067, -0.022]
Embedding dimensions: 1536
Complete first embedding vector:
[-0.023, 0.128, -0.194, 0.067, -0.022, ... ]
Next Steps
- Build a semantic search system using your embeddings
- Combine with other Langbase primitives like Chunk to process documents before embedding
- Create a RAG (Retrieval-Augmented Generation) system using your embedded documents
- Join our Discord community for feedback, requests, and support
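As a first step toward semantic search, you can embed a query with the same embed call and rank your stored chunk embeddings by cosine similarity. A rough sketch follows; the rankBySimilarity helper and the commented-out wiring are illustrative, not part of the Langbase SDK.

```typescript
// Rank stored chunks by cosine similarity to a query embedding.
// In practice, queryEmbedding and each chunk's embedding would come
// from langbase.embed() as shown in generate-embeddings.ts.
function rankBySimilarity(
  queryEmbedding: number[],
  chunks: { text: string; embedding: number[] }[]
): { text: string; score: number }[] {
  const cosine = (a: number[], b: number[]): number => {
    let dot = 0, na = 0, nb = 0;
    for (let i = 0; i < a.length; i++) {
      dot += a[i] * b[i];
      na += a[i] * a[i];
      nb += b[i] * b[i];
    }
    return dot / (Math.sqrt(na) * Math.sqrt(nb));
  };

  return chunks
    .map(c => ({ text: c.text, score: cosine(queryEmbedding, c.embedding) }))
    .sort((x, y) => y.score - x.score); // highest similarity first
}

// Hypothetical wiring with the script above:
// const [queryEmbedding] = await langbase.embed({
//   chunks: ["How do computers understand language?"],
// });
// const ranked = rankBySimilarity(queryEmbedding, storedChunks);
// console.log(ranked[0].text); // best-matching chunk
```

The same ranking step is the retrieval half of a RAG system: the top-scoring chunks are passed to a model as context for generation.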