> The pipeline (bottom) shows how diverse OpenImages inputs are edited
using Nano-Banana and quality-filtered by Gemini-2.5-Pro, with failed attempts automatically retried.
Pretty interesting. I run a fairly comprehensive image-comparison site for SOTA generative AI in text-to-image and editing. Managing it manually got pretty tiring, so a while back I put together a small program that does something similar: it takes a starting prompt, a list of GenAI models, and a maximum number of retries.
It generates and evaluates images using a separate multimodal AI, then automatically rewrites failed prompts and tries again, repeating up to the set limit.
It's not perfect (the nine-pointed star example in particular), but oftentimes the recognition abilities of a multimodal model are superior to its generative capabilities, so you can run it in a sort of REPL until you get the desired outcome.
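The generate → evaluate → rewrite loop described above can be sketched roughly like this. A minimal sketch, assuming the loop shape only; `generate_image`, `evaluate_image`, and `rewrite_prompt` are hypothetical stand-ins for the actual model API calls:

```python
def refine(prompt, generate_image, evaluate_image, rewrite_prompt, max_retries=3):
    """Generate-evaluate-retry loop: regenerate with a rewritten prompt
    until the evaluator accepts the image or the retry budget runs out."""
    for attempt in range(max_retries + 1):
        image = generate_image(prompt)
        ok, feedback = evaluate_image(image, prompt)  # separate multimodal judge
        if ok:
            return image, prompt, attempt
        prompt = rewrite_prompt(prompt, feedback)  # fold the critique back into the prompt
    return None, prompt, max_retries  # gave up; return the last prompt for inspection
```

Returning the final prompt even on failure makes the REPL-style usage easy: you can hand-tweak where the automatic rewrites left off.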
That's a great website! Feature request: a button to toggle all the sliders left or right at the same time - it would make it easier to glance at the results without lots of finicky mouse moves.
Seconding this. Once you’ve seen the original image once, you don’t need to see it each time. The idea of syncing the sliders in the current group is a clever solution.
Thanks! It's probably the same site. It used to only be a showdown of text-to-image models (Flux, Imagen, Midjourney, etc.), but once there was a decent number of image-to-image models (Kontext, Seedream, Nano-Banana) I added a nav bar at the top so I could do similar comparisons for image editing.
Honestly it's kind of inconsistent. Model releases sometimes seem to come in flurries (it felt like Seedream and Nano-Banana landed within a few weeks of each other, for example), and then the site receives a pretty big update.
Recently I've found myself getting the evaluation simultaneously from OpenAI GPT-5, Gemini 2.5 Pro, and Qwen3 VL to give it a kind of "voting system". Purely anecdotal, but I do find that Gemini is the most consistent of the three.
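That kind of voting across judges can be sketched as a simple majority over pass/fail verdicts. A minimal sketch under that assumption (hypothetical helper; the real verdicts would come from each provider's API):

```python
from collections import Counter

def majority_verdict(verdicts):
    """Combine pass/fail verdicts from several multimodal judges into a
    single majority decision, returning the decision and the vote tally."""
    tally = Counter(verdicts)
    decision = tally.most_common(1)[0][0]
    return decision, dict(tally)

# e.g. GPT-5, Gemini 2.5 Pro, and Qwen3 VL each judged one edit:
decision, tally = majority_verdict(["pass", "fail", "pass"])
```

With an odd number of judges and binary verdicts, ties can't occur; with graded scores you'd want a median or mean instead.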
I found the opposite. GPT-5 is better at judging along a true gradient of scores, while Gemini loves to pick 100%, 20%, 10%, 5%, or 0%. You never get an 87% score.
I'm running a similar experiment, but so far changing the seed on the OpenAI side seems to give similar results. If that's confirmed, it's concerning to me how sensitive the evaluation could be.
https://genai-showdown.specr.net/image-editing