Google launches Gemini 3.1 Flash-Lite, its “fastest and most cost-efficient” AI model

Google has launched Gemini 3.1 Flash-Lite, its “fastest and most cost-efficient Gemini 3 series model,” available in preview via AI Studio and Vertex AI. Priced lower than Gemini 3.1 Pro, it offers faster response times, flexible ‘thinking levels,...

ETtech
Google has introduced Gemini 3.1 Flash-Lite, which it says is its “fastest and most cost-efficient Gemini 3 series model.”

“Starting today, 3.1 Flash-Lite is rolling out in a preview to developers via the Gemini API in Google AI Studio and for enterprises via Vertex AI,” the company said in a blog post.

Priced at $0.25 per million input tokens and $1.50 per million output tokens, Flash-Lite is significantly cheaper than flagship models such as Gemini 3.1 Pro ($2.00 per million input tokens and $12.00 per million output tokens).
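To put the Flash-Lite prices in concrete terms, here is a minimal sketch of the arithmetic in Python. It uses only the two Flash-Lite rates quoted above. The function name `estimate_cost` and the workload figures (10,000 requests of 2,000 input and 500 output tokens each) are illustrative assumptions, not figures from Google.

```python
# Flash-Lite preview prices quoted in the article, in USD per one million tokens.
INPUT_PRICE_PER_MILLION = 0.25
OUTPUT_PRICE_PER_MILLION = 1.50


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for the given token volumes."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION
    return input_cost + output_cost


# Hypothetical workload (an assumption for illustration only):
# 10,000 requests, each with 2,000 input tokens and 500 output tokens.
requests = 10_000
total = estimate_cost(requests * 2_000, requests * 500)
print(f"Estimated cost: ${total:.2f}")  # prints "Estimated cost: $12.50"
```

Under these assumptions the workload consumes 20 million input tokens ($5.00) and 5 million output tokens ($7.50). Actual bills would depend on real token counts and any pricing tiers not covered in the article.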


Google claims it “outperforms 2.5 Flash with a 2.5X faster Time to First Answer Token and 45% increase in output speed, according to the Artificial Analysis benchmark, while maintaining similar or better quality.”

What Gemini 3.1 Flash-Lite can do

The model comes with ‘thinking levels’ in AI Studio and Vertex AI, giving developers the ability to control how much the model “thinks” for each task — important for managing high-frequency workloads.

“3.1 Flash-Lite can tackle tasks at scale, like high-volume translation and content moderation, where cost is a priority. And it can also handle more complex workloads where more in-depth reasoning is needed, like generating user interfaces and dashboards, creating simulations or following instructions,” the blog post said.
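One way a developer might act on this split between high-volume and reasoning-heavy work is to choose a thinking level per task type. The sketch below is plain Python and makes no API calls. The level names `"low"` and `"high"`, the task labels, and the helper `pick_thinking_level` are all hypothetical, since the article does not list the actual level values.

```python
# Hypothetical mapping from task type to thinking level. The level names
# "low" and "high" are assumptions, not values confirmed by the article.
THINKING_LEVEL_BY_TASK = {
    "translation": "low",          # high-volume, cost-sensitive
    "content_moderation": "low",   # high-volume, cost-sensitive
    "ui_generation": "high",       # needs more in-depth reasoning
    "simulation": "high",          # needs more in-depth reasoning
}


def pick_thinking_level(task_type: str, default: str = "low") -> str:
    """Return the assumed thinking level for a task, falling back to a default."""
    return THINKING_LEVEL_BY_TASK.get(task_type, default)


for task in ("translation", "ui_generation", "summarization"):
    print(task, "->", pick_thinking_level(task))
```

Defaulting unknown tasks to the cheaper level is a design choice made here for cost control. A team that prioritizes answer quality could reverse that default.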

Early-access developers and companies, including Latitude, Cartwheel, and Whering, are already testing Flash-Lite for large-scale problem solving. Early testers highlighted 3.1 Flash-Lite’s efficiency and reasoning capabilities, saying it can “handle complex inputs with the precision of a larger-tier model, plus follow instructions and maintain adherence,” according to the blog post.

Benchmarks and performance

Gemini 3.1 Flash-Lite achieved an Elo score of 1432 on the Arena.ai leaderboard, outperforming other models in its tier for reasoning and multimodal understanding. It scored 86.9% on GPQA Diamond and 76.8% on MMMU Pro, surpassing even larger Gemini models from previous generations, such as Gemini 2.5 Flash.

The model combines speed, cost efficiency, and flexible reasoning, making it suitable for both high-volume routine tasks and more complex AI workloads.