l: Professor Dinesh Manocha; r: Professor Amrit Bedi


If you've ever used an AI chatbot, you may have noticed something strange. Sometimes, when you ask it to 'try again' or 'think harder,' the answers don't get better; they just get weirder. Now, a team of researchers has discovered why: Generative AI systems can 'overthink' themselves into making mistakes.

A collaborative project between the UMD GAMMA Lab (Geometric Algorithms for Modeling, Motion, and Animation), former GAMMA Lab associates, and external collaborators recently released a paper on reasoning in large language models. The main results were developed by Soumya Suvra Ghosal and Souradip Chakraborty, PhD students in the GAMMA group at UMD. The project was led by Professor Amrit Bedi of the University of Central Florida, a former research scientist at the Institute for Systems Research (ISR), and Distinguished University Professor Dinesh Manocha. The group also included Furong Huang from Computer Science and researchers from Amazon and Princeton University.

The paper focuses on a topic directly relevant to today's reasoning models, such as those from OpenAI, Google (Gemini Thinking), and DeepSeek, in which sophisticated reasoning is a key part of generating a response to a query. Titled “Does Thinking More Always Help? Understanding Test-Time Scaling in Reasoning Models,” it addresses the question: “Does thinking more at test time lead to better reasoning and improved performance?” The work examines the popular belief that extending the thinking process at test time with prompts such as “wait” or “let me rethink” should improve performance.

"We were intrigued by the popular notion that you could just tell a model to 'wait and think more' and it would magically get smarter," said Dr. Bedi. "Our findings show that it's not that simple. In fact, it can be counterproductive."

The Problem: Generative AI "Overthinking"

The researchers discovered that when you push current Generative AI systems to think longer about a single problem, performance might get worse, not better. For a little while, the answers might improve, but they then become more random and less accurate. The system gets stuck in a loop of its own complex thoughts, a phenomenon the team calls “overthinking.” The researchers found that this extended thinking can increase the variability of the model's outputs, creating an illusion of improved reasoning while actually harming accuracy. This observation has been overlooked in many recent works.

A Possible Solution: Parallel Thinking or Exploration

As an alternative to current methods, the group proposed a solution called Parallel Thinking. It is an intuitive approach motivated by human behavior. Imagine a child solving a jigsaw puzzle. If the child tries to force one piece after another in a single, rigid order, they can easily get stuck on a path that doesn't work. Instead, they could make three small piles of pieces: edges, corners, and center pieces, work on all three sections at once, and quickly see which section starts to click and make progress there.

That's what the new Generative AI method does: it tries several different approaches at once and picks whichever is going well, rather than “overthinking” a single path until it derails. Generating multiple independent reasoning paths in this way has been shown to achieve up to 20% higher accuracy than simply extending the thinking time. The takeaway is clear: in the world of Generative AI, a chorus of diverse thoughts may be more powerful than a single, prolonged monologue.
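In implementation terms, this style of Parallel Thinking resembles the self-consistency idea: sample several independent reasoning paths for the same question, then keep the answer that most paths agree on. The sketch below illustrates that voting step only; `toy_model` is a hypothetical stand-in for a real model call, not the authors' actual code.

```python
from collections import Counter

def parallel_thinking(generate_answer, n_paths=5):
    """Sample n independent reasoning paths, then majority-vote the answers."""
    answers = [generate_answer(path_id) for path_id in range(n_paths)]
    # The answer shared by the most paths wins.
    return Counter(answers).most_common(1)[0][0]

# Hypothetical stand-in for a model call: path 2 "overthinks" and derails,
# while the other independent paths converge on the same answer.
def toy_model(path_id):
    return "17" if path_id == 2 else "42"

print(parallel_thinking(toy_model, n_paths=5))  # → 42
```

A single long reasoning chain that drifts off course has no way to recover, whereas the vote here is robust to a minority of derailed paths, which is the intuition behind the reported accuracy gains.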

Beyond Text: Generative AI for Images and Videos

This isn't just about chatbots. The discovery could make the Generative AI in your everyday life more reliable. Think of an app on your phone that identifies plants. An "overthinking" system might get obsessed with the pattern on one leaf and misidentify the entire plant. With Parallel Thinking, the AI could look at the leaf, the stem, and the flower all at once and make a much more reliable guess. This could lead to better and safer Generative AI everywhere, from the apps you use for online shopping to the systems that help self-driving cars understand the world around them.



July 9, 2025

