CMOtech UK - Technology news for CMOs & marketing decision-makers

Google launches Gemini 2.5 models with pricing & speed updates

Today

Google has released updates to its Gemini 2.5 suite of artificial intelligence models, detailing stable releases, new offerings, and pricing changes.

Model releases

The company announced that Gemini 2.5 Pro and Gemini 2.5 Flash are now generally available and deemed stable, maintaining the same versions that had previously been available for preview. In addition, Google introduced Gemini 2.5 Flash-Lite in preview, providing an option focused on cost-effectiveness and latency within the Gemini 2.5 product line.

Gemini 2.5 models are described as "thinking models" capable of reasoning through a problem before generating a response, which is expected to improve their performance and accuracy. The models let developers set a so-called "thinking budget", giving them control over the depth and speed of reasoning according to the needs of each application.

Gemini 2.5 Flash-Lite

Gemini 2.5 Flash-Lite is intended as an upgrade for customers currently using previous iterations such as Gemini 1.5 and 2.0 Flash models. According to the company, the new model improves performance across several evaluation measures, reduces the time to first token, and increases decoding speed in terms of tokens per second. Flash-Lite is targeted at high-volume use cases like classification and summarisation at scale, where throughput and cost are key considerations.

This model provides API-level control for dynamic management of the "thinking budget." It is set apart from other Gemini 2.5 models in that its "thinking" function is deactivated by default, reflecting its focus on cost and speed. Gemini 2.5 Flash-Lite includes existing features such as grounding with Google Search, code execution, URL context, and support for function calling.
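As a rough illustration of what API-level control over the thinking budget might look like, the sketch below builds a request body as a plain Python dictionary. The helper name `build_request` is hypothetical, and the field names (`generationConfig`, `thinkingConfig`, `thinkingBudget`) are assumptions that should be checked against the current Gemini API reference before use. The default of 0 mirrors the article's point that thinking is off by default in Flash-Lite.

```python
import json


def build_request(prompt: str, thinking_budget: int = 0) -> dict:
    """Build a JSON-style request body (hypothetical helper, not an official API).

    thinking_budget is the number of tokens the model may spend on reasoning.
    Field names below are assumed and should be verified against the API docs.
    """
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "thinkingConfig": {"thinkingBudget": thinking_budget},
        },
    }


if __name__ == "__main__":
    # High-volume classification: leave thinking off for speed and cost.
    fast = build_request("Classify this review as positive or negative.")
    # Harder task: allow some reasoning tokens (1024 is an arbitrary example value).
    deeper = build_request("Summarise the key risks in this contract.", 1024)
    print(json.dumps(fast, indent=2))
    print(json.dumps(deeper, indent=2))
```

The point of the design is that the same model can be tuned per request: a bulk classification job can keep the budget at zero, while an occasional harder query can be given room to reason.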

Updates and pricing

Google also clarified changes to the Gemini 2.5 Flash model and its associated pricing structure. The price of 2.5 Flash is now US$0.30 per 1 million input tokens (up from US$0.15) and US$2.50 per 1 million output tokens (down from US$3.50). The company removed the distinction between "thinking" and "non-thinking" pricing and established a single price tier, irrespective of input token size.
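Whether the change raises or lowers a bill depends on the mix of input and output tokens. The sketch below works through the arithmetic using only the four prices quoted above; the workload of 10 million input tokens and 2 million output tokens is a made-up example, and the function names are illustrative. Because preview pricing distinguished thinking from non-thinking usage and only one previous output price is quoted here, the comparison applies only to workloads that were billed at that quoted rate.

```python
from decimal import Decimal

# Prices in USD per 1 million tokens, as quoted in the article.
OLD_INPUT, NEW_INPUT = Decimal("0.15"), Decimal("0.30")
OLD_OUTPUT, NEW_OUTPUT = Decimal("3.50"), Decimal("2.50")
MILLION = Decimal(1_000_000)


def cost(input_tokens: int, output_tokens: int,
         input_price: Decimal, output_price: Decimal) -> Decimal:
    """Return the USD cost of a workload at the given per-million-token prices."""
    total = Decimal(input_tokens) * input_price + Decimal(output_tokens) * output_price
    return total / MILLION


def break_even_ratio() -> Decimal:
    """Output-to-input token ratio at which old and new prices cost the same.

    Above this ratio the new pricing is cheaper; below it, more expensive.
    """
    return (NEW_INPUT - OLD_INPUT) / (OLD_OUTPUT - NEW_OUTPUT)


if __name__ == "__main__":
    # Hypothetical monthly workload: 10M input tokens, 2M output tokens.
    old = cost(10_000_000, 2_000_000, OLD_INPUT, OLD_OUTPUT)
    new = cost(10_000_000, 2_000_000, NEW_INPUT, NEW_OUTPUT)
    print(f"old: ${old}, new: ${new}, break-even ratio: {break_even_ratio()}")
```

For this example workload the bill falls from $8.50 to $8.00. At the quoted rates, the two price lists cost the same when output tokens are 15% of input tokens, so input-heavy jobs such as bulk classification pay more under the new prices, while output-heavy jobs pay less.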

In a joint statement, Shrestha Basu Mallick and Logan Kilpatrick, both Group Product Managers, said: "While we strive to maintain consistent pricing between preview and stable releases to minimize disruption, this is a specific adjustment reflecting Flash's exceptional value, still offering the best cost-per-intelligence available. And with Gemini 2.5 Flash-Lite, we now have an even lower cost option (with or without thinking) for cost and latency sensitive use cases that require less model intelligence."

Customers using Gemini 2.5 Flash Preview from April will retain their existing pricing until the model's planned deprecation on July 15, 2025, after which they will be required to transition to the updated stable version or move to Flash-Lite Preview.

Continued growth for Gemini 2.5 Pro

Google reported that demand for Gemini 2.5 Pro is "the steepest of any of our models we have ever seen." The stable release of the 06-05 version is intended to increase capacity for customers using Gemini 2.5 Pro in production environments, maintaining the existing price point. The company indicated that the model is particularly well-suited for tasks requiring significant intelligence and advanced capabilities, such as coding and agentic tasks, and noted its adoption in a range of developer tools.

The statement added: "We expect that cases where you need the highest intelligence and most capabilities are where you will see Pro shine, like coding and agentic tasks. Gemini 2.5 Pro is at the heart of many of the most loved developer tools."

Google highlighted a range of tools built on Gemini 2.5 Pro, including offerings from Cursor, Bolt, Cline, Cognition, Windsurf, GitHub, Lovable, Replit, and Zed Industries.

The company advised that users of the 2.5 Pro Preview 05-06 model will be able to access it until June 19, 2025, when it will be discontinued. Those using the 06-05 preview version are directed to update to the now-stable "gemini-2.5-pro" model.
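Teams tracking these cut-off dates could keep them in a small lookup, as in the sketch below. The dictionary keys are descriptive labels rather than official model identifiers, the helper `still_available` is hypothetical, and treating a preview as unavailable from the stated date onward is an assumption, since the announcement does not say what happens on the day itself.

```python
from datetime import date

# Cut-off dates for preview models, as reported in the article.
# Keys are descriptive labels, not official API model identifiers.
SHUTOFF = {
    "Gemini 2.5 Flash Preview (April)": date(2025, 7, 15),
    "Gemini 2.5 Pro Preview 05-06": date(2025, 6, 19),
}


def still_available(label: str, today: date) -> bool:
    """Return True if the preview is assumed usable on the given day.

    Assumption: the model is treated as unavailable from the cut-off date onward.
    """
    return today < SHUTOFF[label]


if __name__ == "__main__":
    check_day = date(2025, 6, 20)  # arbitrary example date
    for label in SHUTOFF:
        status = "available" if still_available(label, check_day) else "discontinued"
        print(f"{label}: {status} on {check_day}")
```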

The statement concluded: "We can't wait to see even more domains benefit from the intelligence of 2.5 Pro and look forward to sharing more about scaling beyond Pro in the near future."
