At a time when AI is again the focus of the tech world, Google has introduced its own text-to-image AI generator, which produces images from a text prompt. Called Imagen, the system was created by the Google Brain team, and if Google and its batch of sample images are to be believed, it offers “photorealistic images and deep level of language understanding.” Here’s a look at the details.

As the text-to-image label suggests, using it is simple. You type a description of what you want to see, and Imagen, drawing on what it has learned from large amounts of training data, generates an image to match.

The outputs appear quite accurate and give tough competition to other text-to-image AI models such as OpenAI’s popular DALL-E (which already has a successor), VQ-GAN+CLIP, and Latent Diffusion Models. Google backs this up with data: it has introduced a benchmark called DrawBench for comparing such models, and its results rate Imagen as the better one.

Google also reveals that on the COCO dataset, Imagen achieved an FID (Fréchet Inception Distance, where lower is better) score of 7.27, and that human raters found the results “on par with the reference images.”
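For readers unfamiliar with the metric, here is a minimal sketch of the calculation behind an FID score. It measures the Fréchet distance between two Gaussians fitted to feature vectors from real and generated images. In actual FID evaluations those features come from an Inception-v3 network; in this sketch, random vectors stand in for them, and the function names `frechet_distance` and `fid_from_features` are illustrative rather than taken from Google's code.

```python
import numpy as np


def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Squared Frechet distance between N(mu1, sigma1) and N(mu2, sigma2)."""
    diff = mu1 - mu2
    # Tr(sqrt(sigma1 @ sigma2)) equals the sum of the square roots of the
    # eigenvalues of sigma1 @ sigma2 when both matrices are positive semidefinite.
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    # Drop tiny imaginary parts and negative values caused by rounding.
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * tr_sqrt)


def fid_from_features(feats_a, feats_b):
    """Fit a Gaussian to each feature set (rows = images) and compare them."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    sigma_a = np.cov(feats_a, rowvar=False)
    sigma_b = np.cov(feats_b, rowvar=False)
    return frechet_distance(mu_a, sigma_a, mu_b, sigma_b)


if __name__ == "__main__":
    # Assumption: random vectors stand in for Inception-v3 image features.
    rng = np.random.default_rng(0)
    real = rng.normal(0.0, 1.0, size=(2000, 8))
    close = rng.normal(0.1, 1.0, size=(2000, 8))  # similar distribution
    far = rng.normal(1.0, 1.0, size=(2000, 8))    # clearly shifted distribution
    print("similar set:", fid_from_features(real, close))
    print("shifted set:", fid_from_features(real, far))
```

The closer the generated images' feature statistics are to those of the real images, the smaller the score, which is why a low number such as 7.27 is reported as a good result.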

Keep in mind, though, that the sample images shown for such AI systems are usually the best ones, while the results that go awry stay behind the curtain. So it may be too early to call Google’s AI model the best.