How OpenAI’s Jerry Tworek found a new way forward for large language models

The impressive intelligence gains in OpenAI’s models have come mainly from training them on progressively more data, for longer periods, and with massive computing power. But in 2024 new training data has become scarce and scaling up computing power further has become very expensive, so AI labs have sought new ways to keep pushing models toward artificial general intelligence (AGI), or AI that’s generally smarter than human beings.

“I think that the scaling hypothesis landscape is much more multidimensional and we can scale multiple different things,” says OpenAI researcher Jerry Tworek, whose research in recent years has focused on AI models that can “think” about different approaches to solving complex problems, rather than relying mostly on what they learned in their pre-training to generate an answer. 

Tworek led the effort at OpenAI to develop the first major model to prove that the new approach works—“o1.” At the end of August OpenAI’s “o1-preview” model rose to the top of the LiveBench leaderboard, which ranks the intelligence of large frontier models. The o1 model takes longer to return answers, because it’s designed to emphasize complex reasoning and accuracy. Access to the model also costs considerably more than OpenAI’s earlier models.

Large language models borrow their design and behaviors from the neurons in the human brain, but Tworek and his team hoped to put more inspiration from the human brain into the o1 models—in this case humans’ approach to problem solving. “What we managed to train our models to do is this very natural way of reasoning,” Tworek says. “It looks a little bit more human. It is the model trying things in a very fluid, intelligent fashion.”

The model, for example, might play out one problem-solving strategy to see if it leads to a solution, and switch to another approach if it doesn’t. Or, if it tries a particular tactic or branch in its reasoning that doesn’t bear fruit, it might backtrack and try another way forward.

“There’s that pondering and deliberation and a lot of exploration when solving a problem,” he says. “That’s something that the [earlier] models were probably doing a little bit, but not that much, before and we really tried to double down on that.”

Tworek’s contribution to the evolution of OpenAI’s models is considerable, and growing. He has been at the company through its most important years. He arrived almost six years ago after spending a few years developing quantitative investment strategies at a hedge fund in Amsterdam.

“I joined OpenAI when it was still a nonprofit,” Tworek says. “It was a small research lab, like a few cool people in San Francisco.” He was struck, however, by the young company’s big ambitions. “I was living in Europe before and you don’t often meet people who will say ‘Oh, Jerry, we are going to build AGI, are you in or not?’”

And OpenAI had good reason to be ambitious. Tworek arrived just as the startup was finishing up GPT-2, the first model that showed that supersizing training data and computing power could yield surprising intelligence gains. The company’s goal of building AGI was beginning to seem possible.

Six years later, some AI researchers, including OpenAI cofounder Ilya Sutskever, say the “supersizing” approach isn’t yielding the intelligence returns it once did. That’s why o1’s new approach of scaling computing power at inference time is so important. It may open a new avenue that lets researchers maintain their momentum toward AGI.

This story is part of AI 20, our monthlong series of profiles spotlighting the most interesting technologists, entrepreneurs, corporate leaders, and creative thinkers shaping the world of artificial intelligence.

https://www.fastcompany.com/91246222/openai-researcher-jerry-tworek-human-brain-o1-models?partner=rss&utm_source=rss&utm_medium=feed&utm_campaign=rss+fastcompany&utm_content=rss

Posted Dec. 19, 2024, 12:20:05

