I know this is sarcasm, but if you've been using LLMs for more than two weeks you've probably noticed significant improvements in both the models and the tooling.
Less than a year ago I was generating somewhat silly and broken unit tests with Copilot. Now I'm generating entire feature sets while doing loads of laundry.
That's all true, yet the problem of hallucinations is as stark today as it was three years ago when GPT-3.5 was all the rage. Until that is solved, I don't think any amount of model "smartness" can truly compensate for it.