DeepSeek has called into question Big AI’s trillion-dollar assumption

Recently, Chinese startup DeepSeek created state-of-the-art AI models using far less computing power and capital than anyone thought possible. It then showed its work, both in published research papers and by letting its models lay out the reasoning that led them to a given answer. Its models also scored at or near the top in a range of benchmark tests, besting OpenAI models in several skill areas. The surprising work seems to have let some of the air out of the AI industry’s main assumption: that the best way to make models smarter is to give them more computing power, so that the AI lab with the most Nvidia chips will have the best models and the shortest route to artificial general intelligence (AGI, which refers to AI that’s better than humans at most tasks).

No wonder some Nvidia investors are questioning their faith that demand for the most powerful AI chips will keep growing without limit. And no wonder some in AI circles are questioning the worldview and business strategy of OpenAI CEO Sam Altman, the biggest evangelist for the “brute force” approach to ever-smarter models.

“The assumption behind all this investment is theoretical . . . the so-called scaling laws where when you double compute, the quality of your models increases in kind of the same way—it’s kind of a new Moore’s Law,” says Abhishek Nagaraj, a professor at the University of California, Berkeley’s Haas School of Business. (Moore’s Law held that chips would become predictably more powerful as manufacturers packed ever more transistors onto them, so software developers could count on steady gains in computing power.)
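To make the scaling-law idea concrete, the published scaling-law research (such as OpenAI’s 2020 scaling-laws paper and DeepMind’s later Chinchilla work) fits a power law in which a model’s error falls smoothly as training compute rises. The form below is an illustrative sketch of that relationship, not a formula cited in this article:

L(C) \approx L_{\infty} + \left(\frac{C_0}{C}\right)^{\alpha}

Here L is the model’s test loss (lower is better), C is the training compute, and L_{\infty}, C_0, and \alpha are empirically fitted constants. Because \alpha is small, each doubling of compute buys a real but steadily shrinking improvement, which is why the “brute force” camp concludes that whoever can afford the most compute wins.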

“And so if that holds, it effectively means that whoever controls the infrastructure will control a lot of the market,” adds Nagaraj. That’s why companies like OpenAI, Anthropic, and xAI are building data centers as fast as they can. Altman last year reportedly sought as much as $7 trillion for the chips and data centers he believes are needed to reach AGI. OpenAI, Microsoft, SoftBank, and Oracle said recently they’ll spend up to $500 billion over the next four years to build new AI data centers, starting in Texas.

Attracting the money to do that, however, is something only “closed-source” companies like OpenAI can do, Nagaraj points out. OpenAI’s venture backers (such as Andreessen Horowitz) and big tech backers (such as Microsoft) are willing to bankroll the AI infrastructure (chips, software, data centers, electricity) that OpenAI says it needs, as long as the company keeps the recipes of its models secret. That secrecy is the “moat” around their investment, after all. Establishing such a moat was the main reason OpenAI stopped being an “open” AI company back in 2019.

DeepSeek shares the weights of its models (the numerical parameters at each connection in their neural networks) and allows any developer to build with them. Having essentially given away its research and eschewed a moat, DeepSeek was never going to attract the private funding needed to bankroll hundreds of thousands of Nvidia chips. Adding to its challenge were the U.S. export bans that keep the most powerful AI chips out of Chinese companies’ hands. So DeepSeek found ways to build state-of-the-art models using far less computing power. In doing so, it appears to have punctured Altman’s assumption that massive computing power is the only route to AGI.
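Because the weights are public, anyone can download them and run or fine-tune the models on their own hardware. As a rough sketch of what that looks like in practice (the Hugging Face repository name and the choice of a small distilled checkpoint below are assumptions for illustration, not details from this article), loading an open-weights model typically takes a few lines of Python with the transformers library:

# Illustrative sketch: loading an open-weights model with Hugging Face transformers.
# The repo ID is an assumption for the example; check the model card on
# huggingface.co for the exact name, size, and license before relying on it.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # assumed repo ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

prompt = "In one paragraph, explain what open model weights let developers do."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

That low barrier to entry is precisely what closed-weight labs, by design, do not offer.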

Not everybody thinks so, of course. Particularly in OpenAI circles. “I would never bet against compute as the upper bound for achievable intelligence in the long run,” says Andrej Karpathy, one of the original founders of OpenAI, in an X post. “Not just for an individual final training run, but also for the entire innovation/experimentation engine that silently underlies all the algorithmic innovations.”

Altman, too, seemed undeterred. “We will obviously deliver much better models and also it’s legit invigorating to have a new competitor! We will pull up some releases . . . ,” he posted breezily on X. “But mostly we are excited to continue to execute on our research roadmap and believe more compute is more important now than ever before to succeed at our mission.” OpenAI’s “mission” is AGI. 

Lots of powerful chips will still be needed, Altman added, if only because general demand for AI services is going to grow exponentially; more data centers will be needed just to handle the calls coming from millions of AI-infused apps built on OpenAI’s APIs.

Some have suggested that DeepSeek’s discovery of ways to build more compute-efficient advanced AI models could lower the barrier to entry and allow far more developers to build such models of their own, thereby pushing up overall demand for AI chips.

For example, DeepSeek’s most recent model, DeepSeek-R1, gives the open-source world a reasoning model that appears comparable to OpenAI’s state-of-the-art o1 series, which applies more computing power at inference time, when the model is reasoning through various routes to a good answer. In a statement Monday, Nvidia gave DeepSeek props for creating reasoning models using “widely available” Nvidia GPUs, and added that such models still require “significant numbers” of GPUs as well as fast chip-to-chip networking.

The latest DeepSeek models have been available to developers for only a short time. Just as when Meta introduced its open-source Llama models, it will take a while to understand the real economics of building new models and apps on top of DeepSeek’s. It’s possible that more widely distributing the ability to build cutting-edge models will put more brains to work on finding novel routes to AGI and, later, superintelligence. That’s the good news. The bad news may be that powerful models, and the means to build them, will become more available to people who might use them maliciously, or who may not be fastidious about accepted safety guardrails.

But DeepSeek is not perfect. The DeepSeek chatbot has in anecdotal cases emphatically misidentified itself as the creation of OpenAI or Microsoft. Nor can the chatbot speak freely on all subjects. “Like all Chinese AI companies, DeepSeek operates within the People’s Republic of China’s regulatory framework, which includes restrictions on how language models handle politically sensitive topics,” says David Bader, a professor at the New Jersey Institute of Technology. “These constraints are evident in how their models respond to queries about historical events and government policies.” If you ask the chatbot about the Tiananmen Square protests, for example, it responds with, “Let’s talk about something else.”  

https://www.fastcompany.com/91268664/deepseek-called-into-question-big-ai-trillion-dollar-assumption-openai
