Google DeepMind's Genie 2 can generate interactive 3D worlds

World models, AI systems capable of generating a simulated environment in real time, represent one of the more impressive applications of machine learning. The field has moved quickly over the past year, and on Wednesday Google DeepMind announced Genie 2. Where its predecessor was limited to generating 2D worlds, the new model can create 3D ones and sustain them for significantly longer.

Genie 2 isn’t a game engine; instead, it’s a diffusion model that generates images as the player (either a human being or another AI agent) moves through the world the software is simulating. As it generates frames, Genie 2 can infer properties of the environment, which lets it model water, smoke and physics effects, though some of those interactions can look distinctly game-like. The model is also not limited to rendering scenes from a third-person perspective; it can handle first-person and isometric viewpoints as well. All it needs to start is a single image prompt, either one generated by Google’s own Imagen 3 model or a picture of something from the real world.

Introducing Genie 2: our AI model that can create an endless variety of playable 3D worlds - all from a single image. 🖼️

These types of large-scale foundation world models could enable future agents to be trained and evaluated in an endless number of virtual environments. →… pic.twitter.com/qHCT6jqb1W

— Google DeepMind (@GoogleDeepMind) December 4, 2024

Notably, Genie 2 can remember parts of a simulated scene even after they leave the player’s field of view and can accurately reconstruct those elements once they become visible again. That’s in contrast to other world models like Oasis, which, at least in the version Decart showed to the public in October, had trouble remembering the layout of the Minecraft levels it was generating in real time.

However, even this capability has limits. DeepMind says the model can generate “consistent” worlds for up to 60 seconds, but most of the examples the company shared on Wednesday run for significantly less time, at about 10 to 20 seconds each. Moreover, the longer Genie 2 has to maintain the illusion of a consistent world, the more artifacts appear and the softer the image quality becomes.

DeepMind didn’t detail how it trained Genie 2 other than to state it relied “on a large-scale video dataset.” Don’t expect DeepMind to release Genie 2 to the public anytime soon, either. For the moment, the company primarily sees the model as a tool for training and evaluating other AI agents, including its own SIMA algorithm, and something artists and designers could use to prototype and try out ideas rapidly. In the future, DeepMind suggests world models like Genie 2 are likely to play an important part on the road to artificial general intelligence.

“Training more general embodied agents has been traditionally bottlenecked by the availability of sufficiently rich and diverse training environments,” DeepMind said. “As we show, Genie 2 could enable future agents to be trained and evaluated in a limitless curriculum of novel worlds.”

This article originally appeared on Engadget at https://www.engadget.com/ai/google-deepminds-genie-2-can-generate-interactive-3d-worlds-200708207.html?src=rss
Posted Dec 5, 2024, 21:50:24

