Google DeepMind's new AI tech will generate soundtracks for videos

Google's DeepMind artificial intelligence laboratory is working on a new technology that can generate soundtracks, and even dialogue, to accompany videos. The lab has shared its progress on the video-to-audio (V2A) project, which can be paired with Google Veo and other video creation tools such as OpenAI's Sora. In its blog post, the DeepMind team explains that the system can understand raw pixels and combine that information with text prompts to create sound effects for what's happening onscreen. The tool can also be used to make soundtracks for traditional footage, such as silent films and any other video without sound.

DeepMind's researchers trained the technology on video, audio and AI-generated annotations that contain detailed descriptions of sounds and dialogue transcripts. They said that by doing so, the technology learned to associate specific sounds with visual scenes. As TechCrunch notes, DeepMind's team isn't the first to release an AI tool that can generate sound effects (ElevenLabs released one recently, as well), and it won't be the last. "Our research stands out from existing video-to-audio solutions because it can understand raw pixels and adding a text prompt is optional," the team writes.

While the text prompt is optional, it can be used to shape and refine the final product so that it's as accurate and as realistic as possible. You can enter positive prompts to steer the output toward the sounds you want, for instance, or negative prompts to steer it away from the sounds you don't want. For one of its samples, the team used the prompt: "Cinematic, thriller, horror film, music, tension, ambience, footsteps on concrete."

The researchers admit that they're still trying to address their V2A technology's existing limitations, such as the drop in output audio quality that can happen if there are distortions in the source video. They're also still working on improving lip synchronization for generated dialogue. In addition, they vow to put the technology through "rigorous safety assessments and testing" before releasing it to the world.

This article originally appeared on Engadget at https://www.engadget.com/google-deepminds-new-ai-tech-will-generate-soundtracks-for-videos-113100908.html?src=rss
Created 3mo ago | 18.06.2024, 13:10:10


