Google’s new AI tool Whisk uses images as prompts

Google has yet another AI tool to add to the pile. Whisk is a Google Labs image generator that lets you use an existing image as your prompt. But its output only captures your starter image’s “essence” rather than recreating it with new details. So, it’s better for brainstorming and rapid-fire visualizations than edits of the source image.

The company describes Whisk as “a new type of creative tool.” It opens with a bare-bones interface that has inputs for style and subject. This simple introductory interface only lets you choose from three predefined styles: sticker, enamel pin and plushie. I suspect Google found those three allowed for the kind of rough-outline outputs the experimental tool is best suited to in its current form.

As you can see in the image above, it produced a solid image of a Wilford Brimley plushie. (Google’s terms forbid pictures of celebrities, but Wilford slipped through the gates, Quaker Oats in tow, without alerting the guards.)

Whisk also includes a more advanced editor (found by clicking “Start from scratch” from the main screen). In this mode, you can use text or a source image in three categories: subject, scene and style. There’s also an input bar to add more text for finishing touches. However, in its current form, the advanced controls didn’t produce results that looked anything like my queries.

For example, check out my attempt to generate the late Mr. Brimley in a lightbox scene in the style of a walrus plushie image I found online:

Screenshot of an AI generation tool producing images of a man who looks a bit like Wilford Brimley.
Google / Screenshot by Will Shanklin for Engadget

Whisk spit out what looks like a vaguely Wilford Brimley-esque actor eating oatmeal inside a lightbox frame. As far as I can tell, that dude is not a plushie. So, it’s clear why Google recommends using the tool more for “rapid visual exploration” and less for production-ready content.

Google acknowledges that Whisk will only draw from “a few key characteristics” of your source image. “For example, the generated subject might have a different height, weight, hairstyle or skin tone,” the company warns.

To understand why, look no further than Google’s description of how Whisk works under the hood. It uses the Gemini language model to write a detailed caption of the source image you upload. It then feeds that description into the Imagen 3 image generator. So, the result is an image based on Gemini’s words about your image — not the source image itself.
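
To make that two-step flow concrete, here is a minimal Python sketch of a caption-then-generate pipeline. It is an illustration only, not Google's code: `SourceImage`, `caption_image`, `generate_image` and `whisk_like_pipeline` are hypothetical stand-ins, and the real Gemini and Imagen 3 calls are replaced with stubs. The point it shows is that the generator only ever receives text, so anything the caption leaves out cannot appear in the result.

```python
from dataclasses import dataclass


@dataclass
class SourceImage:
    """Hypothetical stand-in for an uploaded image and its describable traits."""
    pixels: bytes
    traits: dict


def caption_image(image: SourceImage, max_traits: int = 3) -> str:
    """Stub for the captioning step (assumed): keep only a few key traits as text."""
    kept = list(image.traits.items())[:max_traits]
    return ", ".join(f"{name}: {value}" for name, value in kept)


def generate_image(prompt: str) -> dict:
    """Stub for the text-to-image step (assumed): it sees the prompt, never the pixels."""
    return {"prompt": prompt}


def whisk_like_pipeline(image: SourceImage, style: str) -> dict:
    """Caption the source image, then build the generator prompt from that caption."""
    caption = caption_image(image)
    prompt = f"{caption}, in the style of a {style}"
    return generate_image(prompt)


if __name__ == "__main__":
    photo = SourceImage(
        pixels=b"",  # raw image data is never passed to generate_image
        traits={
            "subject": "older man",
            "facial hair": "white mustache",
            "prop": "bowl of oatmeal",
            "height": "tall",  # fourth trait, dropped by the caption step
        },
    )
    print(whisk_like_pipeline(photo, "plushie")["prompt"])
```

In this toy version, the subject's height never reaches the generator because the caption step kept only three traits, which mirrors Google's warning that height, weight, hairstyle or skin tone may change.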

Whisk is only available in the US, at least for now. You can try it at the project’s Google Labs site.

This article originally appeared on Engadget at https://www.engadget.com/ai/googles-new-ai-tool-whisk-uses-images-as-prompts-210105371.html?src=rss
Created 16 Dec 2024, 22:10:17


