14 AI models that can generate images. Includes dedicated image generators and multimodal text models with native image creation capabilities from 9 providers.
Purpose-built models for creating images from text prompts
| # | Model | Provider |
|---|---|---|
| 1 | Nano Banana 2 (Gemini 3.1 Flash Image Preview) | Google |
| 2 | Nano Banana Pro (Gemini 3 Pro Image Preview) | Google |
| 3 | GPT-5 Image Mini | OpenAI |
| 4 | GPT-5 Image | OpenAI |
| 5 | Nano Banana (Gemini 2.5 Flash Image) | Google |
| 6 | Midjourney v6.1 | Midjourney |
| 7 | DALL-E 3 | OpenAI |
| 8 | Stable Diffusion 3.5 | Stability AI |
| 9 | FLUX.1 Pro | Black Forest Labs |
| 10 | Ideogram 2.0 | Ideogram |
| 11 | Recraft V3 | Recraft |
| 12 | Imagen 3 | Google |
| 13 | Adobe Firefly 3 | Adobe |
| 14 | Leonardo Phoenix | Leonardo AI |
Multimodal models like GPT-4o generate images alongside text conversations. Dedicated models like DALL-E and Stable Diffusion specialize in image quality and control.
Modern image generation AI can produce marketing materials, product mockups, social media content, concept art, UI/UX prototypes, and creative illustrations.
Many models support inpainting, outpainting, and style transfer. Vision-capable multimodal models can also analyze and modify existing images based on text instructions.
Open-weight models such as Stable Diffusion and the openly released FLUX variants can be run locally for unlimited generation with no per-image fee, leaving only your own hardware and electricity costs, and they give you full control over outputs and fine-tuning.
Top image generation models include DALL-E 3, Stable Diffusion XL, FLUX, and Midjourney. Some multimodal models like GPT-4o can also generate images alongside text responses.
Image generation is typically priced per image rather than per token. Prices range from $0.01 to $0.12 per image depending on resolution and model quality. Some open-source models can be run locally for free.
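To make the per-image pricing concrete, here is a minimal sketch that turns the quoted $0.01 to $0.12 range into a batch cost estimate. The function names (`estimate_cost`, `cost_range`) are hypothetical helpers made up for this illustration, and the two price bounds are taken from the range above rather than from any specific provider's price list.

```python
from decimal import Decimal

# Price bounds quoted in the text above (USD per image).
# These are illustrative bounds, not any provider's actual price list.
LOW_PRICE = Decimal("0.01")
HIGH_PRICE = Decimal("0.12")


def estimate_cost(num_images: int, price_per_image: Decimal) -> Decimal:
    """Return the total cost in USD for a batch billed per image."""
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    return price_per_image * num_images


def cost_range(num_images: int) -> tuple[Decimal, Decimal]:
    """Return (cheapest, most expensive) totals for the quoted price range."""
    return (
        estimate_cost(num_images, LOW_PRICE),
        estimate_cost(num_images, HIGH_PRICE),
    )


if __name__ == "__main__":
    low, high = cost_range(1000)
    print(f"1,000 images: ${low:.2f} to ${high:.2f}")
```

Under these assumptions, a batch of 1,000 images lands between $10 and $120, which is often the number to compare against the hardware cost of running an open model locally.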
It depends on your needs. For photorealism, FLUX and DALL-E 3 lead. For artistic styles, Midjourney excels. For customization and fine-tuning, Stable Diffusion offers the most flexibility. Check our image generation leaderboard for rankings.