How does text-to-image generation work?
A powerful new form of artificial intelligence has burst onto the scene and captured the public’s imagination in recent months: text-to-image AI.
Text-to-image AI models generate original images based solely on written inputs. Users can input any text prompt they like—say, “a cute corgi lives in a house made out of sushi”—and, as if by magic, the AI will produce a corresponding image. (See above for this example; scroll down for some more.)
These models produce images that have never existed in the world or in anyone’s imagination. They are not simple manipulations of existing images on the Internet; they are novel creations, breathtaking in their originality and sophistication.
The most well-known text-to-image model is OpenAI’s DALL-E. OpenAI debuted the original DALL-E model in January 2021. DALL-E 2, its successor, was announced in April 2022. DALL-E 2 has attracted widespread public attention, catapulting text-to-image technology into the mainstream.
In the wake of the excitement around DALL-E 2, it hasn’t taken long for competitors to emerge. Within weeks, a lightweight open-source version dubbed “DALL-E Mini” went viral. Unaffiliated with OpenAI or DALL-E, DALL-E Mini has since been rebranded as Craiyon following pressure from OpenAI.
In May 2022, Google unveiled its own text-to-image model, named Imagen. (All the images included in this article come from Imagen.)