Most people didn't think we were anywhere close to LLMs five years ago. The capabilities we have now were expected to be decades away, depending on who you talked to. [EDIT: sorry, I should have said 10 years ago... recent years get too compressed in my head and stuff from 2020 still feels like it was 2 years ago!]
So I think a lot of people now don't see what the path is to AGI, but also realize they hadn't seen the path to LLMs, and innovation is coming fast and furious. So the most honest answer seems to be: it's entirely plausible that AGI just depends on another couple of conceptual breakthroughs that are imminent... and it's also entirely plausible that AGI will require 20 different conceptual breakthroughs all working together that we'll only figure out decades from now.
True honesty requires acknowledging that we truly have no idea. Progress in AI is happening faster than ever before, but nobody has the slightest idea how much progress is needed to get to AGI.
What people thought about LLMs five years ago and how close we are to AGI right now are unrelated, and it's not logically sound to say "We were close to LLMs then, so we are close to AGI now."
It's also a misleading view of the history. It's true "most people" weren't thinking about LLMs five years ago, but a lot of the underpinnings had been studied since the 70s and 80s. The ideas had been worked out, but the hardware wasn't able to handle the processing.
> True honesty requires acknowledging that we truly have no idea. Progress in AI is happening faster than ever before, but nobody has the slightest idea how much progress is needed to get to AGI.
> Most people didn't think we were anywhere close to LLMs five years ago.
That's very ambiguous. "Most people" don't know most things. If we're talking about people who have been working in the industry, though, my understanding is that the concepts behind our modern-day LLMs aren't magical at all. In fact, the ideas have been around for quite a while; the breakthroughs in processing power and in networking (data) were the holdup. The result definitely feels magical to "most people," though, for sure. Right now we're "iterating," right?
I'm not sure anyone really sees a clear path to AGI if what we're actually talking about is the singularity. There are a lot of unknown unknowns, right?
AGI is a poorly defined concept because intelligence is a poorly defined concept. Everyone knows what intelligence is... until we attempt to agree on a common definition.
Not sure what history you're suggesting I check? I've been following NLP for decades. Sure, neural nets have been around for many decades, and deep learning for most of this century. But the explosive success of what LLMs can do now came as a huge surprise. Transformers date to just 2017, and the idea that they would be this successful just by throwing gargantuan amounts of data and processing at them -- this was not a common viewpoint. So I stand by the main point of my original comment, except I did just now edit it to say 10 years ago rather than 5... the point is, it really did seem to come out of nowhere.
GPT-3 existed 5 years ago, and the trajectory was set with the transformer paper. Everything from the transformer paper to GPT-3 was pretty much anticipated in that paper; it just took people spending the effort and compute to make it reality. The only real surprise was how fast OpenAI productized an LLM into a chat interface with ChatGPT; before then we had fine-tuned GPT-3 models doing specific tasks (translation, summarization, etc.).
At this point, AGI seems to be more of a marketing beacon than any sort of non-vague deterministic classification.
We all imagined a future where AI just woke up one day; realistically, what we got instead were philosophical debates over whether the ability to finally order a pizza constitutes true intelligence.
Notwithstanding the fact that AGI is a significantly higher bar than "LLM", this argument is illogical.
Nobody thought we were anywhere close to me jumping off the Empire State Building and flying across the globe 5 years ago, but I'm sure I will. Wish me luck as I take that literal leap of faith tomorrow.
what's super weird to me is how people seem to look at LLM output and see:
"oh look it can think! but then it fails sometimes! how strange, we need to fix the bug that makes the thinking no workie"
instead of:
"oh, this is really weird. Its like a crazy advanced pattern recognition and completion engine that works better than I ever imagined such a thing could. But, it also clearly isn't _thinking_, so it seems like we are perhaps exactly as far from thinking machines as we were before LLMs"
Well, the difference between those two statements is obvious: one looks and feels, the other processes and analyzes. Most people can process and analyze some things; they're not complete idiots most of the time. But most people cannot think through and analyze the most groundbreaking technological advancement they might've personally ever witnessed, one that requires college-level math and computer science to understand. It's how people have always been with new technology: electricity, the telephone, computers, even barcodes. People just don't understand new technologies. It would be much weirder if the populace suddenly knew exactly what was going on.
And to the "most groundbreaking blah blah blah", i could argue that the difference between no computer and computer requires you to actually understand the computer, which almost no one actually does. It just makes peoples work more confusing and frustrating most of the time. While the difference between computer that can't talk to you and "the voice of god answering directly all questions you can think of" is a sociological catastrophic change.
Why should LLM failures trump successes when determining whether they think/understand? Yes, they have a lot of inhuman failure modes. But so what? They aren't human. Their training regimes are very dissimilar to ours, so we should expect alien failure modes owing to this. This doesn't strike me as a good reason to think they don't understand anything in the face of examples that presumably demonstrate understanding.
Because there's no difference between a success and failure as far as an LLM is concerned. Nothing went wrong when the LLM produced a false statement. Nothing went right when the LLM produced a true statement.
It produced a statement. The lexical structure of the statement is highly congruent with its training data and the previous statements.
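To make that concrete, here's a minimal sketch of what the generation loop amounts to (plain Python; the toy context, vocabulary, and probabilities are made up, not taken from any real model). Nothing in the loop checks truth: a false continuation comes out of exactly the same mechanism as a true one.

```python
import random

# Toy next-token distribution keyed by the current context. In a real LLM
# these probabilities come from a transformer's softmax over ~100k tokens;
# here they are invented purely to illustrate the mechanism.
NEXT_TOKEN_PROBS = {
    "The capital of Australia is": {"Canberra": 0.6, "Sydney": 0.35, "Perth": 0.05},
}

def sample_next_token(context: str, temperature: float = 1.0) -> str:
    """Sample one continuation token. Note: no notion of truth anywhere."""
    probs = NEXT_TOKEN_PROBS[context]
    tokens = list(probs)
    weights = [p ** (1.0 / temperature) for p in probs.values()]
    return random.choices(tokens, weights=weights, k=1)[0]

# The same sampling step can yield a true statement ("Canberra") or a false
# one ("Sydney"); the loop itself cannot tell the difference.
context = "The capital of Australia is"
print(context, sample_next_token(context))
```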
This argument is vacuous. Truth is always external to the system. Nothing goes wrong inside the human when he makes an unintentionally false claim. He is simply reporting on what he believes to be true. There are failures leading up to the human making a false claim. But the same can be said for the LLM in terms of insufficient training data.
>The lexical structure of the statement is highly congruent with its training data and the previous statements.
This doesn't accurately capture how LLMs work. LLMs have an ability to generalize that undermines the claim of their responses being "highly congruent with training data".
By that logic, I can conclude humans don't think, because of all the numerous times our 'thinking' fails.
I don't know what else to tell you other than this infallible logic automaton you imagine must exist before it is 'real intelligence' does not exist and has never existed except in the realm of fiction.
> Once AGI is declared by OpenAI, that declaration will now be verified by an independent expert panel.
I always like the phrase "follow the money" in situations like this. Are OpenAI or Microsoft close to AGI? Who knows... Is there a monetary incentive to making you believe they are close to AGI? Absolutely. Note that this was the first bullet point in Microsoft's blog post.
If you use 'multimodal transformer' instead of LLM (which is what most SOTA models are), I don't think there's any reason why a transformer architecture couldn't be trained to drive a car; in fact I'm sure that's what Tesla and co. are using in their cars right now.
I'm sure self-driving will become good enough to be commercially viable in the next couple years (with some limitations), that doesn't mean it's AGI.
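On the narrower architectural point: a transformer that maps camera input to driving actions isn't exotic. Below is a toy PyTorch sketch (the names, sizes, and two-value action output are my own assumptions, nothing to do with Tesla's or anyone else's actual stack) showing the shape of the idea; whether something like this scales to a safe driving policy is, of course, the real question.

```python
import torch
import torch.nn as nn

class ToyDrivingTransformer(nn.Module):
    """Toy vision-to-action transformer: camera-patch embeddings in,
    a [steering, throttle] pair out. Illustrative only."""
    def __init__(self, patch_dim=768, d_model=256):
        super().__init__()
        self.embed = nn.Linear(patch_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=8,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.action_head = nn.Linear(d_model, 2)

    def forward(self, patches):                 # patches: (batch, n_patches, patch_dim)
        x = self.encoder(self.embed(patches))
        return self.action_head(x.mean(dim=1))  # pool over patches -> 2 actions

model = ToyDrivingTransformer()
frame = torch.randn(1, 64, 768)                 # stand-in for one camera frame's patches
print(model(frame))                             # tensor of shape (1, 2)
```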
There is a vast gulf between "GPT-5 can drive a car" and "a neural network using the transformer architecture can be trained to drive a car". And I see no proof whatsoever that we can, today, train a single model that can both write a play and drive a car. Even less so one that could do both at the same time, as a generally intelligent being should be able to.
If someone wants to claim that, say, GPT-5 is AGI, then it is on them to connect GPT-5 to a car control system and inputs and show that it can drive a car decently well. After all, it has consumed all of the literature on driving and physics ever produced, plus untold numbers of hours of video of people driving.
>There is a vast gulf between "GPT-5 can drive a car" and "a neural network using the transformer architecture can be trained to drive a car".
The only difference between the two is training data that the former lacks and the latter has, so it's not a 'vast gulf'.
>And I see no proof whatsoever that we can, today, train a single model that can both write a play and drive a car.
You are not making a lot of sense here. You can have a model that does both. It's not some Herculean task; it's literally just additional data in the training run. There are vision-language-action models tested on public roads.
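To make the "just additional data" claim concrete, here's a toy sketch of multi-task training (hypothetical model, fake random data, and made-up heads; not any real vision-language-action system): one shared transformer trunk with a text head and a driving head, updated on interleaved batches from both tasks. Whether such a naive trunk actually learns both tasks well is a separate question; the point is only that the training loop itself is unremarkable.

```python
import random
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedBackbone(nn.Module):
    """One trunk, two heads: next-token prediction for text, action regression
    for driving. A sketch of 'more tasks = more data + a head per task'."""
    def __init__(self, d_model=128, vocab=1000, patch_dim=768):
        super().__init__()
        self.token_embed = nn.Embedding(vocab, d_model)
        self.patch_embed = nn.Linear(patch_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.trunk = nn.TransformerEncoder(layer, num_layers=2)
        self.text_head = nn.Linear(d_model, vocab)   # next-token logits
        self.drive_head = nn.Linear(d_model, 2)      # steering, throttle

    def text_loss(self, tokens):                     # tokens: (B, T) token ids
        x = self.token_embed(tokens[:, :-1])
        mask = nn.Transformer.generate_square_subsequent_mask(x.size(1))
        logits = self.text_head(self.trunk(x, mask=mask))
        return F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                               tokens[:, 1:].reshape(-1))

    def drive_loss(self, patches, actions):          # (B, N, patch_dim), (B, 2)
        h = self.trunk(self.patch_embed(patches)).mean(dim=1)
        return F.mse_loss(self.drive_head(h), actions)

model = SharedBackbone()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

for step in range(10):                               # toy loop over fake batches
    if random.random() < 0.5:
        loss = model.text_loss(torch.randint(0, 1000, (4, 32)))
    else:
        loss = model.drive_loss(torch.randn(4, 16, 768), torch.randn(4, 2))
    opt.zero_grad(); loss.backward(); opt.step()
```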
> single model that can both write a play and drive a car.
It would be a really silly thing to do, and there are probably engineering subtleties as to why this would be a bad idea, but I don't see why you couldn't train a single model to do both.
It's not silly; it is in fact a clear necessity to have both of these for something to be even close to AGI. And you additionally need it trained on many other tasks. If you believe that each task requires additional parameters and additional training data, then it becomes very clear that we are nowhere near a general intelligence system, and it should also be pretty clear that this will not scale to 100 tasks with anything like the current hardware and training algorithms.
This is something I think about. State-of-the-art self-driving cars still make mistakes that humans wouldn't make, despite all the investment in this specific problem.
This bodes very poorly for AGI in the near term, IMO