AGEofLLMs.com

OpenAI drops GPT‑4o’s New and Wild Image Generator


GPT‑4o integrates new image creation right into its model, making AI-generated visuals more accurate and realistic.

  • It nails text rendering in images, something AI struggled with before.

  • It's better at understanding detailed prompts and creating images that match what you ask for.

  • Users can upload images as inspiration or get completely original visuals from text descriptions.

  • It’s slowly rolling out to all ChatGPT users, but Sora is your best bet if you’re not seeing it yet.

OpenAI's new image generation model handling text, from OpenAI's X (Twitter)

OpenAI’s new GPT‑4o model is breaking new ground by making AI image creation part of its standard toolkit. Instead of handling text and images separately, it now does both in one go, producing more realistic visuals that actually match what you’re describing. It is also better at understanding your prompts: you give it details, and it actually listens. Plus, if you upload a photo for inspiration, it can tweak or recreate it based on your description.

An image taken with a phone of a glass whiteboard, in a room overlooking the Bay Bridge. The field of view shows a woman writing, sporting a tshirt wiith a large OpenAI logo. The handwriting looks natural and a bit messy, and we see the photographer's reflection.

The most mind-blowing part is how it handles text in images. It writes text that’s readable and makes sense, which was a huge problem for AI before. And it’s not just about slapping text onto an image; it’s about making it fit the scene. Previously, I think the leaders in readable text in images were Ideogram and Recraft. Just yesterday, a new model called Reve dropped, boasting strong capabilities for longer readable text. Even so, this latest release from OpenAI is mind-blowing.

You can prompt with just emojis

What is even more impressive is that it will have editing capabilities right in the chat, very similar to what Google has done with Gemini. I am excited about this, but also a bit sad that these groundbreaking shifts are once again being led by the largest players. It is always nice to see more competition and small businesses, and I hope they will not end up getting crushed by what the leaders of the game can offer.

The model can make all sorts of things, including step-by-step guides, memes and infographics:

Infographics example, sourced from top images on Sora

Prompt was just:

visualize an infographic explaining Newton's prism experiment in great detail, dark blue background

How Can You Try It?

GPT‑4o’s image generation is slowly being rolled out to all ChatGPT users, even on the free tier, but not everyone has it yet. Asking ChatGPT which image model it’s using might help you check: many accounts are still running DALL-E.

Mine isn't ready from the chat yet. I'm a Plus member.

So if you really wanna try the new thing right now, Sora’s the way to go. Or just wait; it'll arrive in the chat eventually.
