Build an AI Video Wisdom Extraction Tool

A step-by-step guide to build an AI Video Wisdom Extraction Tool using Langbase SDK.


In this guide, we will build an AI Video Wisdom Extraction Tool using the Langbase SDK. This tool will:

  • Extract wisdom from a YouTube video
  • Answer questions related to the video content
  • Generate a summary of the video
  • List main ideas and key points
  • Extract quotes and key phrases
  • Provide a list of references and resources
  • Highlight the wow moments in the video
  • Write Tweets from the video content


We will create a basic Next.js application that uses the Langbase SDK to connect to the AI Pipes and stream the final response back to the user.

Let's get started!

Step #0: Create a Next.js Application

To build the tool, we need a Next.js starter application. If you don't have one, you can create a new Next.js application using the following command:

npx create-next-app@latest video-wisdom

# or with pnpm
pnpm create next-app@latest video-wisdom

This will create a new Next.js application in the video-wisdom directory. Navigate to the directory and start the development server:

cd video-wisdom
npm run dev

# or with pnpm
pnpm run dev

Step #1: Install the Langbase SDK

Install the Langbase SDK in this project using npm or pnpm.

npm install langbase

# or with pnpm
pnpm add langbase

Step #2: Fork the AI Pipes

Fork the following AI Pipes on the ⌘ Langbase dashboard. These Pipes will power the Video Wisdom Extraction Tool:

  • Summary Pipe
  • YouTube Videos Q/A Pipe
  • Main Ideas Extractor Pipe
  • List Interesting Facts Pipe
  • Wow Moments Extractor Pipe
  • Video Tweets Extractor Pipe
  • Video Recommendations Extractor Pipe
  • List Quotes from Video Pipe

When you fork a Pipe, navigate to the API tab in the Pipe's navbar. There you'll find the API key for that Pipe, which is required for calling it with the Langbase SDK.

Create a .env.local file in the root directory of your project and add the following environment variables:

# Add the API key of your forked Summary Pipe
LB_SUMMARIZE_PIPE_KEY=""

# Add the API key of your forked YouTube Videos Q/A Pipe
LB_GENERATE_PIPE_KEY=""

# Add the API key of your forked Main Ideas Extractor Pipe
LB_MAIN_IDEAS_PIPE_KEY=""

# Add the API key of your forked List Interesting Facts Pipe
LB_FACTS_PIPE_KEY=""

# Add the API key of your forked Wow Moments Extractor Pipe
LB_WOW_PIPE_KEY=""

# Add the API key of your forked Video Tweets Extractor Pipe
LB_TWEETS_PIPE_KEY=""

# Add the API key of your forked Video Recommendations Extractor Pipe
LB_RECOMMENDATION_PIPE_KEY=""

# Add the API key of your forked List Quotes from Video Pipe
LB_QUOTES_PIPE_KEY=""

Step #3: Create the Wisdom Extraction API Route

Create a new file app/api/langbase/wisdom/route.ts. This API route will call the Langbase AI Pipes to extract wisdom from the YouTube video.

First, we define the GenerationType enum and a getEnvVar function that returns the Pipe API key from the environment variables based on the type of Pipe we want to call. The UI will send this type in the request body.

// Enum for type.
enum GenerationType {
	Generate = 'generate',
	Summarize = 'summarize',
	Quotes = 'quotes',
	Recommendation = 'recommendation',
	MainIdeas = 'mainIdeas',
	Facts = 'facts',
	Wow = 'wow',
	Tweets = 'tweets'
}

// Get the environment variable based on type.
const getEnvVar = (type: GenerationType) => {
	switch (type) {
		case GenerationType.Generate:
			return process.env.LB_GENERATE_PIPE_KEY;
		case GenerationType.Summarize:
			return process.env.LB_SUMMARIZE_PIPE_KEY;
		case GenerationType.Quotes:
			return process.env.LB_QUOTES_PIPE_KEY;
		case GenerationType.Recommendation:
			return process.env.LB_RECOMMENDATION_PIPE_KEY;
		case GenerationType.MainIdeas:
			return process.env.LB_MAIN_IDEAS_PIPE_KEY;
		case GenerationType.Facts:
			return process.env.LB_FACTS_PIPE_KEY;
		case GenerationType.Wow:
			return process.env.LB_WOW_PIPE_KEY;
		case GenerationType.Tweets:
			return process.env.LB_TWEETS_PIPE_KEY;
		default:
			return null;
	}
};
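
For example (illustrative only), a request with the type set to summarize resolves to the Summary Pipe key:

// Illustrative usage: resolve the Pipe API key for a summarize request.
const pipeKey = getEnvVar(GenerationType.Summarize);
// pipeKey now holds process.env.LB_SUMMARIZE_PIPE_KEY (or undefined if unset).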

Next, we define the requestBodySchema schema to validate the request body of the API route, and infer the RequestBody type from it.

import z from 'zod'; // For schema validation

// Schema for request body
const requestBodySchema = z.object({
	prompt: z.string(),
	transcript: z.string().trim().min(1),
	type: z.enum([
		GenerationType.Generate,
		GenerationType.Summarize,
		GenerationType.Quotes,
		GenerationType.Recommendation,
		GenerationType.MainIdeas,
		GenerationType.Facts,
		GenerationType.Wow,
		GenerationType.Tweets
	])
});

// Type for request body
type RequestBody = z.infer<typeof requestBodySchema>;
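
For reference, here is what a valid request body could look like (values are illustrative); the prompt can be an empty string when the Pipe only needs the transcript:

// Illustrative request body the UI sends to the route for a summary.
const exampleBody: RequestBody = {
	prompt: '', // empty: the transcript is sent as the user message
	transcript: 'Full transcript of the YouTube video goes here...',
	type: GenerationType.Summarize
};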

Add the route code to the app/api/langbase/wisdom/route.ts file:

// Imports for the route. `Pipe` and its `StreamOptions` type are assumed to be
// exported by the Langbase SDK version used in this guide.
import { NextRequest } from 'next/server';
import { Pipe, StreamOptions } from 'langbase';

/**
 * This API route calls the Langbase AI Pipes to extract wisdom from the YouTube video.
 * 
 * @param {NextRequest} req - The request object.
 * @returns {Response} The response object streaming the final response back to the frontend.
 */
export async function POST(req: NextRequest) {
	try {
		// Parse and validate the request body.
		const reqBody: RequestBody = await req.json();
		const parsedReqBody = requestBodySchema.safeParse(reqBody);

		// If the request body is not valid, throw an error.
		if (!parsedReqBody.success) {
			throw new Error(parsedReqBody.error.message);
		}

		// Extract the prompt, transcript, and type from the request body.
		const { prompt, transcript, type } = parsedReqBody.data;

		// Get the environment variable based on type.
		const pipeKey = getEnvVar(type);

		// If the Pipe API key is not found, throw an error.
		if (!pipeKey) {
			throw new Error('Pipe API key not found');
		}

		// Generate the response and stream from Langbase Pipe.
		return await generateResponse({ prompt, transcript, pipeKey });
	} catch (error: any) {
		return new Response(error.message, { status: 500 });
	}
}

/**
 * Generates a response by initiating a Pipe, constructing the input for the stream,
 * generating a stream by asking a question, and returning the stream in a readable stream format.
 * @param {Object} options - The options for generating the response.
 * @param {string} options.transcript - The transcript to be used as user input or variable value.
 * @param {string} options.prompt - The prompt to be used as user input or variable value.
 * @param {string} options.pipeKey - The API key for the Pipe.
 * @returns {Response} The response stream in a readable stream format.
 */
async function generateResponse({
	transcript,
	prompt,
	pipeKey
}: {
	transcript: string;
	prompt: string;
	pipeKey: string;
}) {
	// 1. Initiate the Pipe.
	const pipe = new Pipe({
		apiKey: pipeKey
	});

	// 2. Construct the input for the stream.
	// 2a. If we have a prompt, we pass the 'transcript' as a variable.
	//     This is useful when we want to use the transcript as a variable in the
	//     prompt, e.g. with the YouTube Videos Q/A Pipe.
	// 2b. Otherwise, we pass the 'transcript' as the user input.
	let streamInput: StreamOptions;
	if (!prompt) {
		streamInput = {
			messages: [{ role: 'user', content: transcript }]
		};
	} else {
		streamInput = {
			messages: [{ role: 'user', content: prompt }],
			variables: [{ name: 'transcript', value: transcript }]
		};
	}

	// 3. Generate a stream by asking a question.
	const { stream } = await pipe.streamText(streamInput);

	// 4. Done, return the stream in a readable stream format.
	return new Response(stream.toReadableStream());
}

Here is a quick explanation of what's happening in the code above:

  • We extract the prompt, transcript, and type from the request body.
  • We get the environment variable based on the type of the Pipe we want to call.
  • We initiate the Pipe with the API key using the Langbase SDK.
  • We construct the input for the stream. If we have a prompt, we pass the transcript as a variable. Otherwise, we pass the transcript as user input.
  • We generate a stream by asking a question using the Langbase SDK.
  • We return the stream in a readable stream format.

That's it! You have successfully created an AI Video Wisdom Extraction Tool using the Langbase SDK. You can connect the API routes to the frontend and start extracting wisdom from YouTube videos.
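
To wire this up from the frontend, here is a minimal sketch of a helper you could call from a client component. The extractWisdom name and the onChunk callback are hypothetical; the sketch reads the raw response stream with a TextDecoder, and depending on how the SDK serializes the stream (for example, as newline-delimited JSON chunks), you may need to parse each chunk before rendering it.

// Hypothetical frontend helper: call the wisdom API route and stream the
// response text into a callback as it arrives.
export async function extractWisdom({
	transcript,
	type,
	prompt = '',
	onChunk
}: {
	transcript: string;
	type: string;
	prompt?: string;
	onChunk: (text: string) => void;
}) {
	const response = await fetch('/api/langbase/wisdom', {
		method: 'POST',
		headers: { 'Content-Type': 'application/json' },
		body: JSON.stringify({ prompt, transcript, type })
	});

	if (!response.ok || !response.body) {
		throw new Error(await response.text());
	}

	// Read the stream returned by the API route chunk by chunk.
	const reader = response.body.getReader();
	const decoder = new TextDecoder();

	while (true) {
		const { done, value } = await reader.read();
		if (done) break;

		// Chunks are forwarded as decoded text; parse them here if the SDK
		// serializes the stream as structured (e.g. JSON) chunks.
		onChunk(decoder.decode(value, { stream: true }));
	}
}

From a client component, you could then call extractWisdom with the transcript, the desired type (e.g. 'summarize'), and an onChunk callback that appends each chunk to your component state to render the output as it streams in.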

Complete code

You can find the complete code for the VideoWisdom app in the GitHub repository.


Live demo

You can try out the live demo of VideoWisdom here.