
The grounding problem is an intelligence problem, not an artificial intelligence problem.

How would you envision a test based on one-shot learning working?



The question of grounding is a problem that arises in thinking about cognition in general, yes. In AI, it changes from a theoretical problem to a practical one, as this whole discussion proves.

As for one-shot learning, what I was driving at is that a truly intelligent system should not need to consume millions of documents in order to predict that, say, driving at night puts greater demands on one's vision than driving during the day. Or any other common-sense fact. These systems require ingesting the whole frickin' internet in order to maybe kinda sometimes correctly answer some simple questions. Even for questions restricted to the narrow range where the system is indeed grounded: the world of symbols and grammar.


Why do you believe that a system should not need to consume millions of documents in order to be able to make predictions?

For your example, the concepts of driving, night, and vision all need to be clearly understood, as well as how they relate to each other. 'Common sense' is a good example of something which takes years to develop in humans, and develops to varying extents (driving at night vs. during the day is one example; driving drunk vs. sober is another, where humans routinely make poor decisions or hold incorrect beliefs).

It's estimated that humans are exposed to around 11 million bits of information per second.

Assuming humans do not process any data while they sleep (which is almost certainly false): newborns are awake for 8 hours per day, so they 'consume' around 40GB of data per day. This ramps up to around 60GB by the time they're 6 months old. That means that in the first month alone, a newborn has processed 1TB of input.

By the age of six months, they're between 6 and 10TB, and they haven't even said their first word yet. Most babies have experienced more than 20TB of sensory input by the time they say their first word.

Often, children are unable to reason even at a very basic level until they have been exposed to more than 100TB of sensory input. GPT-3, by contrast, was trained on a corpus of around 570GB worth of text.
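For what it's worth, the arithmetic roughly checks out. A quick back-of-the-envelope sketch in Python, using the 11-million-bits-per-second figure and the waking-hours assumptions from above (the linear ramp from 8 to 12 waking hours over six months is my own simplification):

    # Rough check of the sensory-input figures quoted above.
    # Assumptions: ~11 million bits/second while awake; newborns awake ~8 h/day,
    # ramping to ~12 h/day by six months (the ramp is a simplification of mine).
    BITS_PER_SECOND = 11_000_000
    GB, TB = 1e9, 1e12

    def bytes_per_day(awake_hours):
        return BITS_PER_SECOND * awake_hours * 3600 / 8  # bits -> bytes

    print(f"newborn:          {bytes_per_day(8) / GB:.0f} GB/day")    # ~40 GB
    print(f"six months old:   {bytes_per_day(12) / GB:.0f} GB/day")   # ~59 GB
    print(f"first month:      {bytes_per_day(8) * 30 / TB:.1f} TB")   # ~1.2 TB
    total = sum(bytes_per_day(8 + 4 * day / 180) for day in range(180))
    print(f"first six months: {total / TB:.1f} TB")                   # ~9 TB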

We are simply orders of magnitude away from being able to make a meaningful comparison between GPT-3 and humans and determine conclusively that our 'intelligence' is of a different category to the 'intelligence' displayed by GPT-3.


I was thinking in terms of simple logic and semantics. The example I picked, though, muddied the waters by bringing in real-world phenomena. A better test would be anything that stays strictly within the symbolic world - the true umwelt of the language model. So, anything mathematical. After seeing countless examples of addition, documents discussing addition, and procedures for addition - many orders of magnitude more than a child ever gets to see when learning to add - LLMs still cannot do it properly. That, to me, is conclusive.


A child can 'see' maths, though: they can see that if you have one apple over here and one orange over there, then you have two pieces of fruit altogether.

If you only ever allowed a child to read about adding, without ever letting them physically experiment with putting pieces together and counting them, they likely would not be able to add either.

In fact, many teachers and schools teach children to add using blocks and physical manipulation of objects, not by giving countless examples and documents discussing addition and procedures of addition.

You may feel it's conclusive, and it's your right to think that. I am not sure.


Yet ChatGPT totally - apparently - gets 1 + 1. In fact it aces the addition table way beyond what a child or even your average adult can handle. It's only when you get to numbers in the billions that its weaknesses become apparent. One thing it starts messing up is carry-over operations, from what I can see. Btw. the threshold used to be significantly lower, yet that doesn't convince me in the least that it has made progress in its understanding of addition. It's still just as much in the fog. And it cannot introspect and tell me what it's doing so I can point out where it's going wrong.
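To make 'carry-over operations' concrete, here is a rough sketch (mine, not a rigorous benchmark) of how one might probe this: generate sums in the billions, compute the exact answer, and count how many carries the schoolbook method needs. The model-querying step is deliberately left out - you'd paste the prompts into ChatGPT and compare its answers against the exact sums, sorted by carry count:

    import random

    def count_carries(a, b):
        # Number of carry operations in schoolbook (digit-by-digit) addition.
        carries, carry = 0, 0
        while a or b:
            carry = 1 if (a % 10) + (b % 10) + carry >= 10 else 0
            carries += carry
            a //= 10
            b //= 10
        return carries

    random.seed(0)
    probes = [(random.randrange(10**9, 10**10), random.randrange(10**9, 10**10))
              for _ in range(20)]
    for a, b in sorted(probes, key=lambda p: count_carries(*p)):
        print(f"What is {a} + {b}?   exact: {a + b}   carries: {count_carries(a, b)}")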

But I think you are right in what you are saying. Basically, its not 'seeing' math as a child does is just another way of saying that it doesn't understand math. It doesn't have an intuitive understanding of numbers. It also can't really experiment. What would experimenting mean in this context? Just more training cycles. This being math, one could have it run random sums and give it the correct answer each time. That's one way to experiment, but it wouldn't solve the issue. At some point it would reach its capacity for absorbing statistical correlations once the numbers get large enough. It would need more neurons to progress beyond that stage.
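If anyone did want to run the 'random sums with the correct answer' experiment, the data side is trivial to generate - a sketch only, with a made-up file name and prompt/completion format; my expectation, per the above, is that it would help up to some digit length and then plateau:

    import json, random

    def write_addition_examples(n, max_digits=12, path="sums.jsonl"):
        # Random addition problems with their exact answers, as prompt/completion pairs.
        with open(path, "w") as f:
            for _ in range(n):
                d = random.randint(1, max_digits)
                a = random.randrange(10 ** (d - 1), 10 ** d)
                b = random.randrange(10 ** (d - 1), 10 ** d)
                f.write(json.dumps({"prompt": f"{a} + {b} =",
                                    "completion": f" {a + b}"}) + "\n")

    write_addition_examples(100_000)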

Btw. I found this relevant article: https://bdtechtalks.com/2022/06/27/large-language-models-log...


That’s an interesting read, thank you. But my question is a bit more fundamental than that.

Ultimately, my point is that although the argument is that an LLM doesn’t “know” anything, I am not sure there is something categorically different between what we “know” and what an LLM “knows”; we have just had more training on more, and more varied, kinds of data (and the ability to experiment for ourselves).



