The new Google Labs experiment is Whisk, which generates images with AI but lets you use images instead of text as the starting point. A twist that could democratize synthetic creation a little further.
Why it matters. Visual AI is dominating the conversation, especially with the improvements in image generators and the gradual arrival of video generators. This is where Google steps in to simplify the art of prompting.
Behind the scenes. Whisk uses two AI engines. First, Gemini translates the images the user uploads into very detailed descriptions. Then, Imagen 3 turns those descriptions into new creations, combining subject, background, style…
Whisk does not try to replicate the original image exactly; rather, it seeks to capture its essence and use it to inspire the new creations.
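To get a feel for the pattern, here is a minimal sketch of that two-stage pipeline using Google's public SDKs. It is not Whisk's actual implementation: the model names, prompts and file names are illustrative assumptions.

```python
# Sketch of Whisk's two-stage idea: a multimodal model captions each reference
# image, and an image generator turns the combined captions into a new picture.
# Model names, prompts and file paths below are assumptions for illustration.
import google.generativeai as genai
import PIL.Image
import vertexai
from vertexai.preview.vision_models import ImageGenerationModel

genai.configure(api_key="YOUR_API_KEY")  # placeholder credentials
vertexai.init(project="your-project", location="us-central1")  # placeholder project


def describe_image(path: str, role: str) -> str:
    """Stage 1: ask Gemini for a rich description of a reference image."""
    model = genai.GenerativeModel("gemini-1.5-flash")
    image = PIL.Image.open(path)
    prompt = f"Describe the {role} of this image in rich visual detail."
    return model.generate_content([prompt, image]).text


# Capture the "essence" of each reference and compose one generation prompt.
subject = describe_image("subject.png", "main subject")
scene = describe_image("scene.png", "background and setting")
style = describe_image("style.png", "artistic style")
final_prompt = f"{subject} Set in: {scene} Rendered in this style: {style}"

# Stage 2: hand the composed prompt to an image generator (Imagen 3 on Vertex AI
# here, purely as an example of the pattern).
generator = ImageGenerationModel.from_pretrained("imagen-3.0-generate-001")
result = generator.generate_images(prompt=final_prompt, number_of_images=1)
result[0].save(location="whisk_style_result.png")
```

The point of the design is that the user never writes the long prompt themselves: the descriptions act as an intermediate text layer that can then be tweaked or combined.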
In detail. The process is simple:
- Drag images into Whisk to define subject, scene and style.
- The AI generates variations.
- The results can then be refined with text instructions.
For now it is only available in the United States and, according to Google, it generates results in “seconds.” It also lets you use several images as references and starts from three predefined styles: sticker, shiny pin and plush.
Go deeper. Whisk does not aim to be a typical image editor; its strength lies in quickly generating visual ideas. It is geared more toward conceptualization and early tests than toward editing or final designs.
Above all, it is well suited to quickly iterating on concepts from a first creation inspired by other works. Of course, it is far from perfect, and Google itself acknowledges limitations, starting with the gap between the initial expectation and the final result.
In techopiniones | Practical guide to writing the best prompts in Midjourney and creating amazing images
Featured image | techopiniones with Whisk