Anthropic, an artificial intelligence firm founded by former OpenAI employees, has unveiled Claude 3.7, an AI model that offers something new: the ability to either produce conventional output or apply a user-specified degree of “reasoning” to demanding problems.
Anthropic says the new hybrid model will let users and developers more easily tackle problems that require a blend of instinctive output and thoughtful, step-by-step analysis. “The [user] has a lot of control over the behavior—how long it thinks, and can trade reasoning and intelligence with time and budget,” explained Michael Gerstenhaber, product lead for the AI platform at Anthropic.
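In Anthropic's API, that control surfaces as a thinking budget attached to an ordinary request. The sketch below shows roughly what that looks like; the model ID and token figures are illustrative, and the exact parameter names reflect Anthropic's published messages API at launch rather than anything guaranteed to stay fixed:

```python
# Rough sketch: dialing Claude 3.7's reasoning up or down via the
# Anthropic messages API. The "thinking" parameter enables extended
# reasoning, capped at a token budget the caller chooses.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-3-7-sonnet-20250219",   # illustrative model ID
    max_tokens=4096,                      # must exceed the thinking budget
    thinking={"type": "enabled", "budget_tokens": 2048},
    messages=[{"role": "user", "content": "How many primes are there below 100?"}],
)

# The final answer arrives in a text block, after any thinking blocks.
print(next(b.text for b in response.content if b.type == "text"))
```

Raising the budget buys deeper deliberation at the cost of latency and spend; omitting the thinking parameter entirely returns the instant, conventional behavior.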
Claude 3.7 also includes a new “scratchpad” feature that reveals the model’s reasoning process; a similar feature helped drive the popularity of R1, a model from the Chinese AI company DeepSeek. The scratchpad lets users see how the model is working through a problem, insight they can use to refine their prompts. According to Dianne Penn, research product lead at Anthropic, the scratchpad is particularly useful in combination with the ability to adjust the model’s “reasoning.” If the model struggles to break a problem down correctly, for example, the user can direct it to spend more time on it.
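In API terms, the scratchpad shows up as “thinking” blocks interleaved with the answer text. A minimal sketch, under the same naming assumptions as above:

```python
# Sketch: reading the "scratchpad". With extended thinking enabled, the
# response mixes thinking blocks (the model's working) with the text
# blocks that carry the final answer.
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-3-7-sonnet-20250219",   # illustrative model ID
    max_tokens=4096,
    thinking={"type": "enabled", "budget_tokens": 2048},
    messages=[{"role": "user",
               "content": "Plan a three-course dinner for eight on a $60 budget."}],
)

for block in response.content:
    if block.type == "thinking":
        print("[scratchpad]", block.thinking)  # how the model broke the problem down
    elif block.type == "text":
        print("[answer]", block.text)
```

If the scratchpad shows the model decomposing the problem badly, that is the cue to rewrite the prompt or raise the budget.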
Frontier AI companies are increasingly focused on equipping their models with improved “reasoning” abilities as a way to enhance their overall capabilities and broaden their potential applications. OpenAI, the company that kicked off the current AI boom with ChatGPT, was the first to introduce a reasoning model, called o1, in September 2024, and has since released a more powerful version called o3. Google has launched a similar offering for its Gemini model, called Flash Thinking. A key difference between Claude 3.7 and its competitors is that users do not have to switch between separate models to access its reasoning abilities.
The distinction between a conventional model and a reasoning one mirrors the two modes of thought described by the psychologist and Nobel laureate Daniel Kahneman in his 2011 book Thinking, Fast and Slow: fast, instinctive System 1 thinking versus slower, more deliberative System 2 thinking.
The kind of model that enabled ChatGPT, known as a large language model or LLM, generates immediate responses to prompts by querying a large neural network. These outputs can be remarkably clever and coherent, but may be unable to answer questions that require step-by-step reasoning, including basic arithmetic. An LLM can be compelled to simulate deliberate reasoning by instructing it to create a plan that it must then follow. However, this trick is not always reliable, as models typically struggle to solve problems that demand comprehensive and careful planning.
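That plan-first trick can be approximated with two ordinary API calls, one to draft a plan and one to carry it out. A minimal sketch, with the same caveat that the model ID is illustrative:

```python
# Sketch of the plan-then-follow trick: a first call asks a standard
# (non-reasoning) model for a plan, a second call asks it to execute
# that plan. The model ID and task are illustrative.
import anthropic

client = anthropic.Anthropic()
MODEL = "claude-3-7-sonnet-20250219"
task = "Using each of the digits 2, 3, 5, and 7 exactly once, make 24 with + - * /."

plan = client.messages.create(
    model=MODEL,
    max_tokens=1024,
    messages=[{"role": "user",
               "content": f"Write a numbered, step-by-step plan for this task. "
                          f"Do not solve it yet.\n\n{task}"}],
).content[0].text

answer = client.messages.create(
    model=MODEL,
    max_tokens=1024,
    messages=[{"role": "user",
               "content": f"Follow this plan exactly to solve the task.\n\n"
                          f"Plan:\n{plan}\n\nTask:\n{task}"}],
).content[0].text

print(answer)
```

The second call can still wander off the plan, which is the unreliability the paragraph above describes and the gap that trained reasoning models aim to close.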
OpenAI, Google, and now Anthropic are all using a machine learning method known as reinforcement learning to train their newest models to generate reasoning that guides output toward correct answers. This approach requires gathering additional training data from humans on solving specific problems. According to Penn, Claude’s reasoning mode was provided with additional data on business applications, including writing and fixing code, computer use, and complex legal question answering. “The things that we made improvements on are … technical subjects or subjects which require long reasoning,” Penn said. “What we have from our customers is a lot of interest in deploying our models into their actual workloads.”
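None of these companies publish their training code, but the core reinforcement-learning loop can be caricatured in a few lines: sample behavior, reward whatever leads to a correct answer, and nudge the policy toward it. The toy below is invented entirely for illustration and is not Anthropic's method; a tiny policy learns how many “reasoning steps” to spend on a problem, using a plain REINFORCE update:

```python
# Toy caricature of RL for reasoning. A "policy" picks how many
# reasoning steps to spend; the stand-in environment rewards a correct
# final answer; REINFORCE shifts probability toward step counts that
# pay off. All numbers here are made up for illustration.
import math
import random

random.seed(0)

P_CORRECT = [0.1, 0.4, 0.7, 0.9]  # stand-in: more steps, better odds
logits = [0.0, 0.0, 0.0, 0.0]     # learnable policy parameters
LR = 0.1

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

for episode in range(2000):
    probs = softmax(logits)
    action = random.choices(range(len(probs)), weights=probs)[0]
    reward = 1.0 if random.random() < P_CORRECT[action] else 0.0
    # REINFORCE: d(log pi(action))/d(logit_i) = 1[i == action] - probs[i]
    for i in range(len(logits)):
        grad = (1.0 if i == action else 0.0) - probs[i]
        logits[i] += LR * reward * grad

print("learned preference over step counts:",
      [round(p, 2) for p in softmax(logits)])
```

Real systems replace the four-way choice with a full language model and the toy environment with verifiable tasks such as unit tests or math answers, but the feedback loop has the same shape.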
Anthropic says that Claude 3.7 excels at solving coding problems that require step-by-step reasoning, outperforming OpenAI’s o1 on some benchmarks, such as SWE-bench. The company is releasing a new tool today, called Claude Code, specifically designed for AI-assisted coding.
“The model is already good at coding,” Penn explained. But “additional thinking would be good for cases that might require very complex planning—say you’re looking at an extremely large code base for a company.”