Reasoning was supposed to be that for "Open" AI; that's why they go to such lengths to hide the reasoning output. Look how that turned out.
Right now, in my opinion, OpenAI actually has a useful deep research feature that I've found nobody else matches. But there is no moat to be seen there.
If you've seen DeepSeek R1's <think> output, you'll understand why OpenAI hides their own. It can be pretty "unsafe" relative to their squeaky-clean public image.
I was looking at this the other day. I'm pretty sure OpenAI runs the internal reasoning through another model that sanitizes it, which also makes it less useful for training other models on.
I might be mistaken, but wasn't the reasoning originally fully hidden? Or maybe it was just far more aggressively sanitized. I agree that today's reasoning output seems higher quality than it did originally.
This is the commoditization of models. There's nothing special about the new models beyond performing better on the benchmarks.
They are all interchangeable. This is great for users, as it increases price pressure.