OpenAI’s new o1 models push AI to PhD-level intelligence

On Thursday, OpenAI introduced OpenAI o1, a new series of large language models that the company says are designed for solving difficult problems and working through complex tasks.

The models were trained to take longer to perform tasks than other AI models, thinking through problems in ways a human might. They can “refine their thinking process, try different strategies, and recognize their mistakes,” OpenAI says in a press release. According to the company, the models perform similarly to PhD students when working on physics, chemistry, and biology problems.

The o1 models scored 83% on a qualifying exam for the International Mathematical Olympiad, OpenAI says, while its earlier GPT-4o model correctly solved only 13% of problems.

OpenAI provided some specific use case examples. The o1 models could be used by healthcare researchers to annotate cell sequencing data, by physicists to generate complicated mathematical formulas needed for quantum optics, and by developers to build and execute multi-step workflows. They also perform well in math and coding. 

Within OpenAI the o1 models were first codenamed “Q*” (pronounced “Q-star”), then “Strawberry.”

OpenAI says it’s taking a slow and cautious approach to releasing the new models, starting with “early previews” of two models in the series. People with ChatGPT Plus or Team accounts can access “o1-preview” by choosing it in a drop-down menu within the chatbot. They can also choose “o1-mini,” which is faster and good at STEM questions, OpenAI says.

Developers and researchers can access the models within ChatGPT and via an application programming interface. 

OpenAI says the new models won’t initially be able to access the internet. Users won’t be able to upload images or files to the models. OpenAI says it’s beefed up the safety features around the models, and has informed federal authorities about the more capable models.

https://www.fastcompany.com/91189817/openais-new-o1-models-push-ai-to-phd-level-intelligence?partner=rss&utm_source=rss&utm_medium=feed&utm_campaign=rss+fastcompany&utm_content=rss

Created Sep 12, 2024, 8:30:04 PM


