[Image: robotic arms painting an abstract, colourful advertisement]

Study finds AI creativity tools offer similar results in ads

Thu, 23rd Oct 2025

A new study has found that widely used artificial intelligence tools perform more similarly on creative advertising tasks than is widely believed.

The Creativity Benchmark study, conducted by Springboards in partnership with industry groups including the 4As, ACA, APG, D&AD, IAA, IPA, and The One Club for Creativity, compared the creative outputs of 16 different AI systems on real-world marketing challenges set by 100 notable brands.

According to the research, over 600 creative professionals from advertising agencies, marketing teams and strategy consultancies undertook more than 11,000 side-by-side comparisons of AI-generated ideas, assessing each one for originality, insight and impact. The findings indicated that the gap between the best- and worst-performing AI tools was smaller than anticipated.

Unexpected similarity

"Everyone assumes some AI tools are way better than others for creative work," said Pip Bingemann, CEO and co-founder of Springboards. "But our tests showed the results were pretty close. Why? Because these models are machines designed to recognise patterns and give you the most probable answer-and 'probable' has never been called 'creative.' Keeping humans in the loop and optimizing for a wider range of varied ideas is crucial."

The assessment looked at three main types of creative challenges: identifying unexpected insights about consumers, proposing large-scale campaign concepts, and generating bold, attention-catching ideas. The objective was to provide a practical measure of creativity as judged by professionals in the field.

Key findings

One main outcome was that no single AI system stood out as superior across all creative tasks. The research demonstrated that some tools were more effective at strategic, planning-oriented tasks, while others excelled at generating unconventional or wild ideas. This variety suggests agencies may benefit from selecting different tools for distinct assignments rather than relying on a single provider.

The study also found that the diversity of ideas produced by an AI system is an important consideration. Some models repeatedly suggested similar concepts, while others produced a wider range of responses for the same creative brief. Researchers concluded that, for effective idea generation, the variety of outputs matters as much as their quality.

Another notable result was the limited effectiveness of using AI to judge the creativity of its own outputs. AI systems did not agree with human experts when asked to evaluate creative ideas; differences in scoring indicated that agencies must apply human judgment when assessing concepts. Traditional creativity tests, commonly used in psychological research, were likewise found unreliable for predicting AI performance in advertising, as the requirements for marketing tasks differ significantly from academic creativity assessments.

Regional preferences emerged as an additional insight. Creative professionals in different countries tended to favour different AI systems, reflecting cultural differences in how creativity and originality are valued in advertising.

Industry response

"LLMs aren't a one-size-fits-all solution-they're general purpose tools that require human creativity to unlock breakthrough outcomes," said Jeremy Lockhorn, SVP, Creative Technologies & Innovation, 4As. "These findings suggest agencies and brands should continue to evaluate which models are best suited for creative work - and that a multi-model approach may well be the best path forward."

Industry leaders emphasised the continuing necessity of human involvement in the creative process, especially in interpreting and refining AI-generated material.

"This study highlights that creativity isn't about which AI you use, it's about how you use it," remarked Tony Hale, CEO, Advertising Council Australia. "The results reinforce what we see across the industry: the human spark remains essential to transforming good ideas into great ones. For agencies, the real opportunity is learning how to collaborate with these systems to expand, not replace, creative thinking."

Study methodology

The benchmark involved 678 advertising professionals of varying backgrounds, who participated in blind A/B idea reviews. The review process was designed to minimise bias, as participants did not know which AI had generated a given idea. The group made 11,012 head-to-head human comparisons during a four-week period in June and July 2025. Data analysis used statistical techniques including Bradley-Terry modelling, which infers a ranking from pairwise comparison outcomes, and cosine distance to score the diversity of suggestions.
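To illustrate the two techniques named above, here is a minimal Python sketch: a Bradley-Terry model fitted to pairwise win counts with the standard MM update, and mean pairwise cosine distance as a diversity score. The study's own code and data are not public, so the win counts and idea embeddings below are hypothetical placeholders, not figures from the benchmark.

```python
# Sketch of the two analysis techniques named in the methodology.
# All inputs here are invented placeholders, not study data.
import numpy as np

def bradley_terry(wins: np.ndarray, iters: int = 200) -> np.ndarray:
    """Fit Bradley-Terry strengths via the standard MM update.
    wins[i, j] = number of times tool i's idea beat tool j's."""
    n = wins.shape[0]
    totals = wins + wins.T              # comparisons played between each pair
    w = wins.sum(axis=1)                # total wins per tool
    p = np.ones(n)                      # initial strengths
    for _ in range(iters):
        denom = np.array([
            sum(totals[i, j] / (p[i] + p[j]) for j in range(n) if j != i)
            for i in range(n)
        ])
        p = w / denom
        p = p / p.sum()                 # normalise so strengths sum to 1
    return p

def diversity(embeddings: np.ndarray) -> float:
    """Mean pairwise cosine distance across one tool's idea embeddings."""
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = unit @ unit.T
    iu = np.triu_indices(len(unit), k=1)   # each unordered pair once
    return float(np.mean(1.0 - sims[iu]))

# Toy example: 3 tools with hypothetical head-to-head win counts.
wins = np.array([[0, 12, 9],
                 [8, 0, 11],
                 [11, 9, 0]], dtype=float)
print("Bradley-Terry strengths:", bradley_terry(wins).round(3))

rng = np.random.default_rng(0)
ideas = rng.normal(size=(20, 384))      # stand-in idea embeddings
print("Diversity score:", round(diversity(ideas), 3))
```

In this framing, a higher strength means a tool's ideas won blind head-to-head comparisons more often, while a higher diversity score means its ideas for the same brief were more spread out in embedding space.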

Four methods were used to test AI creativity: evaluation of AI responses by industry professionals; analysis of the diversity of ideas AI systems could produce; comparison of human and AI-based judgments of creativity; and application of psychology-based creativity tests adapted for AI.

All tests were conducted using standard settings across current AI platforms from OpenAI, Google, Anthropic, Meta, DeepSeek, and Alibaba, among others.
