OpenAI will release an AI model that it claims is capable of reasoning, allowing it to solve complex problems in math, coding and science, in what would be a major step toward achieving human-like cognition in machines.
The models, known as o1, are touted as evidence of how far technological capabilities have advanced in recent years, as companies race to build ever more sophisticated AI systems. In particular, tech groups including Google DeepMind, OpenAI and Anthropic are scrambling to create software that can act independently as so-called agents: personalized bots intended to help people work, create or communicate better and to interact with the digital world.
According to OpenAI, the models will be integrated into ChatGPT Plus starting Thursday. They are designed to be useful to scientists and developers rather than casual users. The company said the o1 models significantly outperformed existing models such as GPT-4o on a qualifying exam for the International Mathematical Olympiad, scoring 83 percent compared with GPT-4o’s 13 percent.
Teaching computer software to reason step by step and plan ahead is a key step toward creating artificial general intelligence: machines with human-like cognitive abilities, according to experts in the field.
If AI systems were to demonstrate genuine reasoning, it would ensure “consistency between the facts, arguments and conclusions made by the AI [and] advances in AI agency and autonomy, which are probably the biggest hurdle to achieving AGI,” said Yoshua Bengio, a computer scientist at the University of Montreal and a winner of the prestigious Turing Award.
There has been steady progress in the field, with models such as GPT, Gemini and Claude showing nascent reasoning abilities, Bengio said. However, the scientific consensus is that AI systems have not yet achieved true general-purpose reasoning.
“The right way to assess such claims is through independent evaluations by scientists and academics without conflicts of interest,” he added.
Gary Marcus, a professor of cognitive science at New York University and author of Taming Silicon Valley, warned: “We see time and time again claims about reasoning that fall apart under careful and patient scrutiny by the scientific community, so I would treat any new claims with skepticism.”
Bengio also noted that software with more advanced capabilities poses a higher risk of misuse by bad actors. OpenAI said it had “strengthened” its safety testing to match the models’ advances, including giving independent AI safety institutes in the UK and US early access to a research version of the model.
Technologists believe that advances in reasoning will drive AI progress in the coming years.
Training models to solve problems has yielded “significant” improvements in their abilities, according to Aidan Gomez, CEO of AI startup Cohere and one of the Google researchers who helped create the transformer technology behind chatbots like ChatGPT.
Speaking at a Financial Times event on Saturday, he said: “It's also significantly more expensive because you spend a lot of computation planning, thinking and reasoning before you actually give an answer. So the models become more expensive on that dimension, but significantly better at solving problems.”