Google launches Gemini 2.0 AI models, and showcases their powers in new agents

Google on Wednesday gave the public and developers a taste of the second generation of its Gemini frontier models, and a preview of some of the agents it will power. 

The new Gemini 2.0 family of models is designed to power AI agents that understand more than just text, and that reason and complete tasks with more autonomy. Google described how the new models will improve an experimental agent called Project Astra, which lets AI process information seen through a camera. It also previewed another experimental agent, now called Project Mariner, that's designed to perform web tasks on behalf of the user.

“[O]ur next era of models [is] built for this new agentic era,” said Google CEO Sundar Pichai in a blog post Wednesday. “With new advances in multimodality–like native image and audio output–and native tool use, it will enable us to build new AI agents that bring us closer to our vision of a universal assistant.” The term “universal assistant” implies an AI agent with artificial general intelligence (AGI), or the ability to do most tasks as well as or better than humans. Experts say the industry is anywhere from two to 10 years away from realizing that aspiration.

Google isn’t yet unveiling the largest and most capable of its Gemini 2.0 models. That may come in another announcement in January. For now it’s releasing to developers an experimental version of a smaller and faster variant called Gemini 2.0 Flash. “It’s our workhorse model with low latency and enhanced performance at the cutting edge of our technology, at scale,” Google DeepMind CEO Demis Hassabis says in a blog post.

Gemini 2.0 Flash, Hassabis says, is twice as fast as its predecessor model, 1.5 Flash, and significantly smarter. He says the new model is multimodal, meaning it can process and output text, images, and audio, although the “experimental” version supports multimodal input but only text output. Gemini 2.0 Flash is also capable of calling on external tools like Google Search, or tools made by other companies, as well as executing computer code.

Consumer users can get in on the fun, too. Gemini chatbot users can now choose to have the chatbot powered by the Gemini 2.0 Flash (experimental) model. Google says it’ll put Gemini 2.0 models under the hood of more of its apps and services next year.

Gemini’s second generation is focused on powering AI agents capable of taking steps on their own and calling on resources they need. The models can take a very large set of instructions and (multimodal) file inputs from the user, then use planning, reasoning, and function-calling (such as conducting a web search) to produce an answer. 

The wider skill set is showcased in a couple of experimental agents, one for a mobile device and one for a web browser.

At the company’s developer event earlier this year Google demonstrated a multimodal agent called Project Astra that can react and reason on real-time video seen through a phone camera, as well as audio (including language) it hears through the device’s microphones. Gemini 2.0, Google says, will give the agent better conversational skills and the ability to call on Google Search and Maps. Astra is nowhere near being released to the public, however.

Gemini 2.0 will enable another experiment called Project Mariner, an agent that understands the images, text, code, and other elements within a browser window, then performs tasks based on that input via a Chrome browser extension. Google says the agent, which is available only to a group of “trusted testers,” is often slow and inaccurate today, but will improve rapidly. 

“If Gemini 1.0 was about organizing and understanding information,” Pichai said in his blog post, “Gemini 2.0 is about making it much more useful.”

https://www.fastcompany.com/91245005/google-launches-gemini-2-0-ai-models-and-showcases-their-powers-in-new-agents?partner=rss&utm_source=rss&utm_medium=feed&utm_campaign=rss+fastcompany&utm_content=rss

Created 4mo | Dec 11, 2024, 16:30:05
