
That's absurd. We know very well that human cognition involves a complex layer of deductive reasoning, goal-seeking, and planning. We know very well that GPT-3 does not.

We also know very well that human learning and GPT-3 learning are nothing alike. We don't know how humans learn exactly, but it's definitely not by hearing trillions and trillions of words.

GPT-3 is doing just that, and then trying to remember which of those trillions of words go together. This is so obviously different from human reasoning that I don't even understand the contortions some people go through not to notice it.
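To be concrete about the "which words go together" framing, here's a deliberately crude toy of my own (a word-bigram sampler over a made-up corpus - GPT-3 is of course a transformer over subword tokens, not anything this simple, but the training signal is the same kind of thing: predict the next token from the preceding ones):

    # Toy caricature of "remembering which words go together":
    # count word pairs in a tiny corpus, then sample each next word
    # proportionally to those counts.
    import random
    from collections import Counter, defaultdict

    corpus = "the cat sat on the mat and the dog sat on the rug".split()

    nexts = defaultdict(Counter)
    for prev, cur in zip(corpus, corpus[1:]):
        nexts[prev][cur] += 1

    def generate(start, n=8):
        out = [start]
        for _ in range(n):
            candidates = nexts.get(out[-1])
            if not candidates:
                break
            words, counts = zip(*candidates.items())
            out.append(random.choices(words, weights=counts)[0])
        return " ".join(out)

    print(generate("the"))  # e.g. "the cat sat on the dog sat on the"

The output is locally plausible and globally aimless, which is the property being contrasted with human reasoning here.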



It's the difference between textual mimicry and what we humans do, which is communication. We conceive of an idea, a concept, that we wish to communicate, and the brain then pulls together appropriate words/symbols, along with the syntactic/semantic rules we have internalized over decades of communicating, to create a statement that accurately represents the idea. If I want to communicate the fact that X and Y are disparate things, I know that "not" is a symbol that can be used to signify this relationship.

The core of the difference to me (admittedly not an AI researcher) is intentionality. Humans conceive of an intention, then a communication. This is not what models like GPT-3 do as there is no intentionality present. GPT-3 can create some truly freaky texts but most that I've seen longer than a few sentences suffer from a fairly pronounced uncanny valley effect due to that lack of intention. It's also why I (again, recognizing my lack of expertise) think expecting GPT-3 to do things like provide medical advice is a fool's errand.


I think you're right that it's mimicry, but I'd like to offer a more precise distinction about the difference between humans and GPT-3. Threads, network adapters, web browsers and primitive cells communicate too. What I think humans do uniquely is create thoughts, such as "X and Y are disparate things". Those thoughts might be for communication, or just thinking about something. But AI models are only trained on the thoughts that happen to be communicated. More accurately, they are only trained on the externalised side effects of the thought function, i.e. what gets written down or spoken.

It's like if you were building a physiological model of the human body using only skin and exterior features as training data. We would not expect the model to learn the structure and function of the spleen. By analogy we should not expect GPT-3 to learn the structure and function of thought.


One thing I actually liked about the Stanford workshop that accompanied this white paper was the emphasis on what in physics we often summed up as "More is different" [1]. Basically, it's the principle that drastic qualitative change and stable structures commonly appear as you scale up base units, in ways that are almost impossible to predict from a unit-level understanding alone. Qualitative structure that emerges when, say, a system reaches 100 million units does not appear linearly: watching the system scale from 1 unit to 1 million units would give you no evidence of the emergent behavior at 100 million.

It is beside the point when folks say "but human cognition isn't any different at its base than the machine": we can very clearly see a massive qualitative difference in behaviors, and there is such a wide gulf in architecture and development that there is no reason whatsoever to expect a behavior as (so far as we can tell) qualitatively unique as conscious language use to ever emerge in a computer model. It's pretty remarkable that the only other cluster of biological systems that can even physiologically mimic it are songbirds and parrots, which come from a very, very different part of the phylogenetic tree. Who could ever have predicted that aberrant convergence if I just gave you the 4 nucleotides?

More is different. You don't get complex structure by just crudely analogizing and reducing everything to base parts.

[1] Anderson, Philip W. "More is different." Science 177, no. 4047 (1972): 393-396.


What absolute piffle. We know nothing of the sort. Think of the human brain as an engine for compressing the world around it - this idea can be made mathematically precise; look up AIXI. An organism needs the ability to compress the world around it and use that compression to make choices about what to do. That is the sum total of human intelligence. In what significant way is GPT-3 different from this model? At most you can argue that GPT-3 is the compression model without the planning/decision model.

You have a GPT inside your brain - just start saying words, the first thing that comes into your head. These are the words of your statistical model of the universe, your internal GPT-3. Read back what you have written - it will make sense, because it is not just a parrot; it is your subconscious.
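And the prediction-compression link isn't hand-waving: any probabilistic next-symbol model can be turned into a compressor via arithmetic coding, paying roughly -log2 p(symbol) bits per symbol, so better prediction means shorter codes. A minimal sketch of my own (it just totals that ideal code length under a toy unigram model fit to the same text, rather than doing real arithmetic coding):

    # Toy demo of "a predictor is a compressor": a model that assigns
    # probability p to the symbol that actually occurs can, via
    # arithmetic coding, encode it in about -log2(p) bits.
    import math
    from collections import Counter

    text = "the cat sat on the mat and the dog sat on the rug".split()

    counts = Counter(text)
    total = sum(counts.values())

    def bits(model_prob):
        # ideal code length of the whole text under a given model
        return sum(-math.log2(model_prob(w)) for w in text)

    uniform = bits(lambda w: 1 / len(counts))    # knows nothing about the data
    unigram = bits(lambda w: counts[w] / total)  # learned word frequencies

    print(f"uniform: {uniform:.1f} bits, unigram: {unigram:.1f} bits")

The unigram model needs fewer bits because it predicts the text better; a GPT-style model pushes the same idea much further.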


If you take abstraction to the extreme, sure - our brain is the same as GPT-3; but only in the same sense in which our brain is equivalent to any function with discrete output - that is, our brain maps something (the space of all sensory inputs) to something else (the space of mental states). In this same sense, our brain is just like the whole universe, which is just a function from the entire world at time t to the world at time t+x.

If we look at anything more specific than 'mathematical function', our brain is nothing like GPT-3. GPT-3 is not trained on sensory data about the world; it is trained on fragments of text (subword tokens) which have nothing directly to do with the world. The brain has plenty of structure and knowledge that is not learned (except in a very roundabout way, through evolution)*. GPT-3 lacks any sort of survivalist motivation, which the brain obviously has. The implementation substrate is not even slightly similar. The brain has numerous specialized functions, most of which have nothing to do with language, while GPT-3 has a single kind of functionality and is entirely concerned with language.

And even if I start writing down random words, what I'm doing is not in any way similar to GPT-3, and my output won't be either. It will probably be quasi-random non-language (words strung together without grammar) vaguely related to various desires of my subconscious. What it will NOT be is a plausible-sounding block of text constructed to resemble, as closely as possible, sequences of tokens observed before, which is what GPT-3 outputs.

I do not have an inner GPT-3. The way I use language is by converting some inner thought structure that I have decided to communicate into a language I know, via some 1:1 mapping of internal concepts to language structures (words, phrases).

In particular, even the basics here are different: the sub-word units GPT-3 is trained on (letters, or tokens built from them) are largely irrelevant to human language use outside of writing. People express themselves in words and phrases, and learn language at that level. Decomposing words into sounds/letters is an artificial, approximate model that we have chosen to use for various reasons, but it is not intrinsic to language, and it doesn't come naturally to language users: you have to learn the canonical spelling of a word, and even its canonical pronunciation; significant semantic information is not captured in the letters/sounds of a word at all, but conveyed through inflection, tonality and accent; and in sign languages there is often no decomposition of words equivalent to letters.

* If you don't believe that the brain comes with much knowledge built in, you'll have to explain how most mammals learn to walk, run, and jump within minutes to hours of birth - what is the data set they are using to learn these extremely complex motor skills, including the perception skills necessary for them?
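To make the tokens-vs-words point above concrete: here is roughly what GPT-3 actually "sees", assuming the tiktoken package is available (its "gpt2" encoding is the byte-pair encoding that, as far as I know, the original GPT-3 also used). The units are neither letters nor words:

    # Show how the GPT-2/GPT-3 byte-pair encoding splits text into
    # subword tokens. Requires: pip install tiktoken
    import tiktoken

    enc = tiktoken.get_encoding("gpt2")

    for word in ["cat", "disparate", "intentionality"]:
        ids = enc.encode(word)
        pieces = [enc.decode([i]) for i in ids]
        print(word, "->", pieces)

    # Common words tend to stay whole; longer or rarer words get split
    # into subword chunks. The exact split is an artifact of corpus
    # statistics, not of how speakers actually use the word.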


Imagine intelligence as a thermodynamic process: an engine for compressing the world into a smaller representation. It's not about any particular configuration or any particular set of data. Just as the complex structure of the cell arises from the guiding principles of evolution, so intelligence arises from the as-yet not quite understood thermodynamics of open systems - see England's work in this area. We are constructing systems that play out this universal tendency of systems to compress. The structures that arise from megawatts of power flowing through graphics cards are the product of this same kind of flow of information, just as the structure of the human brain derives from eons of sunlight pouring down on the photosynthetic cells of plants. There are not two processes at work here; it's one continuous process that rolls on up from the prebiotic soup to hyper-advanced space aliens.

You are probably stuck on the idea of emergence. You imagine there must be some dividing line between intelligent and non-intelligent, and therefore that the point at which intelligence emerges must be some spectacular miracle of engineering. In fact there is no dividing line, just a continuum of consciousness from the small to the large. Read up on panspermia.


sorry I mean panpsychism! finger slip




