Parser

Parser, an AI Primitive by Langbase, allows you to extract text content from various document formats. This is particularly useful when you need to process documents before using them in your AI applications.

Parser can handle a variety of formats, including PDFs, CSVs, and more. By converting these documents into plain text, you can easily analyze, search, or manipulate the content as needed.
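As a rough illustration of the "search" use case, the sketch below looks for a term in a plain-text string and returns each match with some surrounding context. It assumes the parsed text is already available as a string (the Quickstart later reads it from result.content). The helper names normalizeWhitespace and findSnippets are hypothetical and are not part of the Langbase SDK.

```typescript
// Hypothetical helpers for searching parsed text; not part of the Langbase SDK.

// Collapse runs of whitespace so matches are not split by line wraps.
function normalizeWhitespace(text: string): string {
  return text.replace(/\s+/g, ' ').trim();
}

// Return every case-insensitive occurrence of `term`, with up to `radius`
// characters of context on each side.
// Assumes lowercasing does not change string length (true for ASCII text).
function findSnippets(content: string, term: string, radius = 40): string[] {
  const text = normalizeWhitespace(content);
  const haystack = text.toLowerCase();
  const needle = term.toLowerCase();
  const snippets: string[] = [];
  if (needle.length === 0) return snippets;

  let index = haystack.indexOf(needle);
  while (index !== -1) {
    const start = Math.max(0, index - radius);
    const end = Math.min(text.length, index + needle.length + radius);
    snippets.push(text.slice(start, end));
    index = haystack.indexOf(needle, index + needle.length);
  }
  return snippets;
}

// Example input taken from the sample PDF text shown later in this guide.
const sample =
  'Langbase extends this concept to AI infrastructure\nwith our Composable AI stack using Pipes and Memory.';
console.log(findSnippets(sample, 'composable', 10));
```

Normalizing whitespace first is a design choice: parsed PDF text often contains line breaks in the middle of sentences, and collapsing them keeps the snippets readable.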


Quickstart: Extracting Text from Documents


Let's get started

In this guide, we'll use the Langbase SDK to interact with the Parser API:


Step #1: Generate Langbase API key

Every request you send to Langbase needs an API key. This guide assumes you already have one. If you don't, generate one in your Langbase account before continuing.


Step #2: Set up your project

Create a new directory for your project and navigate to it.

Project setup

mkdir document-parser && cd document-parser

Initialize the project

Create a new Node.js project.

Initialize project

npm init -y

Install dependencies

You will use the Langbase SDK to work with Parser and dotenv to manage environment variables.

Install dependencies

npm i langbase dotenv

Create an env file

Create a .env file in the root of your project and add your Langbase API key:

.env

LANGBASE_API_KEY=your_api_key_here
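A missing or placeholder key usually only surfaces as a failed request later on, so it can help to check for it at startup. The sketch below is one way to do that. The requireEnv helper is hypothetical (not part of the Langbase SDK or dotenv), and 'example-key' is a made-up value used only to show the call.

```typescript
// Hypothetical helper: read a required environment variable or fail early.
// `env` defaults to process.env, which dotenv populates from the .env file.
function requireEnv(
  name: string,
  env: Record<string, string | undefined> = process.env,
): string {
  const value = env[name]?.trim();
  // Treat the placeholder from the .env example as "not set".
  if (!value || value === 'your_api_key_here') {
    throw new Error(`${name} is missing or still set to the placeholder value`);
  }
  return value;
}

// Demonstration with an explicit object instead of the real environment.
console.log(requireEnv('LANGBASE_API_KEY', { LANGBASE_API_KEY: 'example-key' }));
```

In a real script you would call requireEnv('LANGBASE_API_KEY') after importing 'dotenv/config' and pass the result as the apiKey option.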

Step #3: Parse a PDF document

Now create a file named parse-pdf.ts that parses a PDF document. You can download a sample PDF from the link below.

Download Composable AI PDF

Move the downloaded PDF document to your project directory.

parse-pdf.ts

import 'dotenv/config';
import { Langbase } from 'langbase';
import { readFile } from 'fs/promises';

const langbase = new Langbase({
  apiKey: process.env.LANGBASE_API_KEY!,
});

async function main() {
  try {
    // Read the PDF document
    const buffer = await readFile('composable-ai.pdf');

    // Parse the PDF document
    const result = await langbase.parser({
      document: buffer,
      documentName: 'composable-ai.pdf',
      contentType: 'application/pdf',
    });

    console.log('Parsed document name:', result.documentName);
    console.log('Parse document content:', result.content);
  } catch (error) {
    console.error('Error parsing PDF:', error);
  }
}

main();

Run the script to parse your document:

Run the script

npx tsx parse-pdf.ts

You should see output similar to this:

Parsed document name: composable-ai.pdf
Parse document content: Composable AI In software engineering, composition is a powerful concept. It allows for building complex systems from simple, interchangeable parts. Think Legos, Docker containers, React components. Langbase extends this concept to AI infrastructure with our Composable AI stack using Pipes and Memory. Composable and personalized AI : With Langbase, you can compose multiple models together into pipelines. It's easier to think about, easier to develop for, and each pipe lets you choose which model to use for each task. You can see cost of every step. And allow your customers to hyper-personalize. Effortlessly zero-config AI infra : Maybe you want to use a smaller, domain-specific model for one task, and a larger general-purpose model for another task. Langbase makes it easy to use the right primitives and tools for each part of the job and provides developers with a zero-config composable AI infrastructure. That's a nice way of saying, you get a unicorn-scale API in minutes, not months . The most common problem I hear about in Gen AI space is that my AI agents are too complex and I can't scale them, too much AI talking to AI. I don't have control, I don't understand the cost, and the impact of this change vs that. Time from new model to prod is too long. Feels static, my customers can't personalize The Developer Friendly Future of AI Infrastructure Why Composable AI? Command.new Launching soon in limited beta Join the waitlist Langbase it. Langbase fixes all this. AA I have built an AI email agent that can read my emails, understand the sentiment, summarize, and respond to them. Let's break it down to how it works, hint several pipes working together to make smart personalized decisions. 1. I created a pipe: email-sentiment this one reads my emails to understand the sentiment 2. 
email-summarizer pipe it summarizes my emails so I can quickly understand Example: Composable AI Email Agent Langbase Email Agent reference architecture them 3. email-decision-maker pipe should I respond? is it urgent? is it a newsletter? 4. If email-decision-maker pipe says yes , then I need to respond. This invokes the final pipe 5. email-writer pipe writes a draft response to my emails with one of the eight formats I have Ah, the power of composition. I can swap out any of these pipes with a new one. Flexibility : Swap components without rewriting everything Reusability : Build complex systems from simple, tested parts Scalability : Optimize at the component level for better performance Observability : Monitor and debug each step of your AI pipeline Control flow Maybe I want to use a different sentiment analysis model Or maybe I want to use a different summarizer when I'm on vacation I can chose a different LLM (small or large) based on the task BTW I definitely use a different decision-maker pipe on a busy day. Extensibility Add more when needed : I can also add more pipes to this pipeline. Maybe I want to add a pipe that checks my calendar or the weather before I respond to an email. You get the idea. Always bet on composition. Eight Formats to write emails : And I have several formats. Because Pipes are composable, I have eight different versions of email-writer pipe. I have a pipe email-pick-writer that picks the correct pipe to draft a response with. Why? I talk to my friends differently than my investors, reports, managers, vendors you name it. Why Composable AI is powerful? Long-term memory and context awareness By the way, I have all my emails in an emails-store memory, which any of these pipes can refer to if needed. That's managed semantic RAG over all the emails I have ever received. And yes, my emails-smart-spam memory knows all the pesky smart spam emails that I don't want to see in my inbox. 
Cost & Observability Because each intent and action is mapped out Pipe which is an excellent primitive for using LLMs, I can see everything related to cost, usage, and effectiveness of each pipe. I can see how many emails were processed, how many were responded to, how many were marked as spam, etc. I can switch LLMs for any of these actions, fork a pipe, and see how it performs. I can version my pipes and see how the new version performs against the old one. And we're just getting started Why Developers Love It Modular : Build, test, and deploy pipes x memorysets independently Extensible : API-first no dependency on a single language Version Control Friendly : Track changes at the pipe level Cost-Effective : Optimize resource usage for each AI task Stakeholder Friendly : Collaborate with your team on each pipe and memory. All your R&D team, engineering, product, GTM (marketing, sales), and even stakeholders can collaborate on the same pipe. It's like a Google Doc x GitHub for AI. That's what makes it so powerful. Each pipe and memory are like a docker container. You can have any number of pipes and memorysets. Can't wait to share more exciting examples of composable AI. We're cookin!! We'll share more on this soon. Follow us on Twitter and LinkedIn for updates. Previous Introduction Next API Reference Langbase, Inc. © Copyright 2025. All rights reserved.

Next Steps

  • Try parsing different file formats using the Langbase Parser API
  • Integrate the parsed content with other Langbase features
  • Build something cool with the Langbase SDK and APIs
  • Join our Discord community for feedback, requests, and support
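If you try the first item above, each file needs a matching contentType in the parser call. The sketch below picks one from the file extension. Only 'application/pdf' comes from this guide. The other entries are standard MIME types, and whether Parser accepts each of them is an assumption to confirm against the API reference. The contentTypeFor helper is hypothetical and not part of the Langbase SDK.

```typescript
// Extension-to-MIME-type lookup.
// Assumption: only 'application/pdf' is confirmed by this guide; check the
// Parser API reference before relying on the other entries.
const CONTENT_TYPES: Record<string, string> = {
  pdf: 'application/pdf',
  csv: 'text/csv',
  txt: 'text/plain',
  md: 'text/markdown',
};

// Hypothetical helper: derive the content type from a file name.
function contentTypeFor(fileName: string): string {
  const dot = fileName.lastIndexOf('.');
  const ext = dot === -1 ? '' : fileName.slice(dot + 1).toLowerCase();
  const type = CONTENT_TYPES[ext];
  if (!type) {
    // Fail loudly rather than sending a guessed content type.
    throw new Error(`No content type known for "${fileName}"`);
  }
  return type;
}

console.log(contentTypeFor('composable-ai.pdf'));
console.log(contentTypeFor('REPORT.CSV'));
```

The returned string can be passed as the contentType field alongside document and documentName, in the same shape as the parse-pdf.ts call.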