I don't get the hype. Tested it with the same prompts I used with Midjourney, and the results are worse than in Midjourney a year ago. What am I missing?
The hype is about image editing, not pure text-to-image. Upload an input image, say what you want changed, get the output. That's the idea. Much better preservation of characters and objects.
I tested it against Flux Pro Kontext (also image editing), and while it's a very different style and approach, I like Flux better overall. More focus on image consistency, adjusts the lighting correctly, fixes contradictions in the image.
I've been testing it against Flux Pro Kontext for several weeks. I would say it beats Flux in a majority of tests, but Flux still surprises from time to time. Banana definitely isn't the best 100% of the time -- it falls a bit short of that. Evolution, not revolution.
Agreed. I find myself alternating between Qwen Image Edit 20B, Kontext, and now Flash 2.5 depending on the situation and style. And of course, Flash isn't open-weights, so if you need more control / less censorship then you're SOL.
Great question. I really doubt it would be able to support any resolution. I'm sure that behind the scenes it scales the input down to somewhere around 1 MP before processing, even if they decide to upscale the result and return it at the original resolution.
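To put numbers on that guess, here's a rough sketch of what "scale down to about 1 MP" would mean for a given input size. This is only arithmetic under my own assumptions: the function name `fit_to_pixel_budget` and the 1,000,000-pixel budget are made up for illustration, and nothing here reflects how the model actually preprocesses images.

```python
import math


def fit_to_pixel_budget(width, height, budget=1_000_000):
    """Return (w, h) scaled down to fit within `budget` pixels.

    Hypothetical helper: the ~1 MP budget is a guess, not a documented limit.
    Aspect ratio is preserved up to integer rounding.
    """
    pixels = width * height
    if pixels <= budget:
        # Already within budget, so leave the size unchanged.
        return width, height
    # Scaling both sides by sqrt(budget / pixels) scales the area by budget / pixels.
    scale = math.sqrt(budget / pixels)
    # Round down so the result never exceeds the budget.
    new_w = max(1, math.floor(width * scale))
    new_h = max(1, math.floor(height * scale))
    return new_w, new_h


print(fit_to_pixel_budget(4000, 3000))  # a 12 MP photo
print(fit_to_pixel_budget(800, 600))    # already under the budget
```

So a 12 MP photo would lose roughly 71% of its linear resolution before the model ever saw it, if that guess is right.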
I don't know. All the testing I've done has output the standard 1024x1024 that all these models are set to output. You might be able to alter the output params on the API or AI Studio.
Midjourney hasn't been SOTA for over a year. Even the latest release, version 7, scores extremely low on prompt adherence, only managing to get 2 out of 12 prompts correct. Even Flux Dev running locally consistently outperforms it.
Here's a comparison of Flux Dev, MJ, Imagen, and Flash 2.5.
That being said, if image fidelity is absolutely paramount and/or your prompts are relatively simple, Midjourney can still be fun to experiment with, particularly if you crank up the weirdness / chaos parameters.
David Holz mentioned on Twitter that he was considering a Midjourney API. They're obviously providing it to Meta now, so it might become more broadly available after Midjourney becomes the default image gen for Meta products.
Midjourney wins on aesthetic for sure. Nothing else comes close. Midjourney images are just beautiful to behold.
David's ambition is to beat Google to building a world model you can play games in. He views the image and video business as a temporary intermediate to that end game.
It actually has impressive image generating ability, IMO. I think the two things go hand-in-hand. Its prompt adherence can be weaker than other models, though.