I feel like part of the mess of modern web development is the idea that most people actually need to "do something more complex than the basics". So much rehashing and effort spent trying to force MPAs into a SPA shape just because it's trendy.
I graduated before LLMs but after WolframAlpha and I essentially cheated my way through calculus I and II. Lenient grade weighting made 90s on the homework and 60s on the exams enough to slide through with a C. Funnily enough, now that it's over a decade later and I know more about myself and my neurodivergent patterns, I feel much more able and interested in actually learning calculus. I'm looking forward to seeing the pedagogical changes that result from LLMs enabling this sort of trickery for all subjects.
I saw a relatively new AI company already valued at over $1 bn publish a "new technique" recently, complete with whitepaper and all. I looked at the implementation and... it was just querying four different models, concatenating the results and asking a fifth model to merge the responses together.
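For concreteness, here's a minimal sketch of the pipeline as I understood it from the implementation. The model names and the OpenAI-compatible client are placeholders of mine, not what the company actually uses:

```python
# Hypothetical sketch of the approach described above: fan a prompt out to
# several "proposer" models, then hand all of their answers to one
# "aggregator" model that merges them into a single response.
# Model names and the OpenAI-compatible endpoint are placeholders.
from openai import OpenAI

client = OpenAI()  # assumes an OpenAI-compatible API

PROPOSERS = ["model-a", "model-b", "model-c", "model-d"]  # placeholder names
AGGREGATOR = "model-e"


def ask(model: str, prompt: str) -> str:
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content


def aggregate(prompt: str) -> str:
    # Step 1: collect one candidate answer per proposer model.
    candidates = [ask(m, prompt) for m in PROPOSERS]

    # Step 2: concatenate the candidates and ask the aggregator to merge them.
    merged_prompt = (
        "You are given several candidate answers to the same question.\n"
        "Synthesize them into one answer that is better than any single candidate.\n\n"
        f"Question: {prompt}\n\n"
        + "\n\n".join(f"Candidate {i + 1}:\n{c}" for i, c in enumerate(candidates))
    )
    return ask(AGGREGATOR, merged_prompt)
```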
Makes me wish I had spent some time in college learning how to sell rather than going so hard on the engineering side.
I took a startup-building course at university and was kinda shocked when I learnt that VCs don't actually have all the knowledge you have, so you have to know how to sell your ideas. Yep, knowing how to sell something to the right person (at the right moment) is just as important as having a good idea.
The novel part of that paper wasn't merely merging the responses. The point is that the last model can, from those inputs, synthesize a higher-quality overall response than any of the individual models produces on its own.
It's a bit surprising that this works, and your take on it is overly reductive, largely because you've misunderstood what it was doing.
I didn't just look at the implementation, I tried it as well. I was hoping it would work, but the aggregating model mostly either failed to properly synthesize a new response (merely dumping out the previous responses as separate functions) or erratically took bits from each without properly gluing them together. In every case, simply picking the best out of the four responses myself yielded better results.
Interesting, I've seen live demos working fairly well. I've also implemented something adjacent to the work and it works quite well too. I'm not sure why you had a hard time with it.
I am, however, working in a domain where verification isn't subjective, so I can tell a good response from a bad one fairly easily. Things like this also depend quite heavily on the model being used, in my experience.
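To illustrate the contrast with objective verification: a rough sketch of picking a candidate that passes a programmatic check, assuming an `ask` helper like the one sketched earlier and a domain-specific `verify` function (unit tests, a schema validator, whatever your domain offers). Both are placeholders, not anything from the paper:

```python
# Hypothetical sketch of selection with an objective verifier, as opposed to
# free-form synthesis: generate candidates from several models and keep the
# first one that passes a programmatic check.
from typing import Callable, Optional


def pick_verified(
    prompt: str,
    models: list[str],
    ask: Callable[[str, str], str],
    verify: Callable[[str], bool],
) -> Optional[str]:
    # Generate one candidate per model, then return the first that verifies.
    candidates = [ask(m, prompt) for m in models]
    for candidate in candidates:
        if verify(candidate):
            return candidate  # objective verification makes "best" unambiguous
    return None  # no candidate passed; caller decides how to retry
```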
As much as I'd like to do more with it, the "just use F#" idea floated in this thread is a distant pipe dream for the vast majority of teams.