Google / Gemini-2.5-flash-lite - Langbase · Serverless AI Developer Platform

Gemini 2.5 Flash Lite is built for high-volume, latency-sensitive, cost-sensitive tasks such as translation and classification.

Key Features

Optimized for Speed & Cost
Designed for high-volume, latency-sensitive tasks like translation, classification, and chat—with lower compute and faster responses.
Thinking Mode Support
Supports optional step-by-step reasoning, controlled by a thinking budget, for better output quality when needed.
Improved Tool Use
Includes search and code execution tools, making it better suited to agentic use cases.
Enhanced Reasoning & Coding
Outperforms 2.0 Flash-Lite across reasoning, math, science, and code benchmarks (e.g., SWE-bench, AIME, Aider Polyglot).
Multimodal Input Support
Handles text, image, video, audio, and PDF inputs, with up to 1M input tokens.
Large Output Window
Supports up to 64K output tokens—ideal for long responses and rich code generation.
Cost-Efficient Inference
Most affordable Gemini 2.5 variant, with additional savings from prompt caching and batch processing.
Latest Knowledge
Updated with a January 2025 knowledge cutoff, improving performance on current topics and tasks.
Available Everywhere You Build
Deployable via Google AI Studio, Gemini API, and Vertex AI.
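
As a rough illustration of the thinking budget and output limit described above, the following Python sketch builds a JSON request body of the kind a generateContent-style endpoint might accept. The helper build_request is hypothetical, the field names (generationConfig, maxOutputTokens, thinkingConfig, thinkingBudget) are assumed to match the Gemini API request format and should be checked against the current documentation, and reading "64K" as 64 * 1024 is also an assumption. The sketch only constructs the payload and does not send it.

```python
import json

# Assumption: "64K output tokens" is read as 64 * 1024.
# Check the provider's documentation for the exact limit.
MAX_OUTPUT_TOKENS = 64 * 1024


def build_request(prompt: str, thinking_budget: int, max_output_tokens: int = 1024) -> dict:
    """Build a generateContent-style JSON body.

    Hypothetical helper for illustration only, not an official client.
    """
    if max_output_tokens > MAX_OUTPUT_TOKENS:
        raise ValueError(f"max_output_tokens exceeds {MAX_OUTPUT_TOKENS}")
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "maxOutputTokens": max_output_tokens,
            # Assumed field names for the thinking budget setting.
            "thinkingConfig": {"thinkingBudget": thinking_budget},
        },
    }


if __name__ == "__main__":
    # A simple classification prompt with no thinking budget requested.
    body = build_request("Classify the sentiment: 'Great product!'", thinking_budget=0)
    print(json.dumps(body, indent=2))
```

For simple, latency-sensitive calls such as classification, a small or zero budget keeps responses fast. A larger budget trades latency and cost for more reasoning on harder prompts.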
