
> Large scale pretrained models are certainly likely to figure prominently in artificial intelligence for the near future, and play an important role in commercial AI for some time to come.

This sounds reminiscent of the post-ImageNet days, when people discovered that large-scale pretraining offers a broadly useful feature space for vision tasks.
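
To make that concrete, here's a minimal sketch of the pattern, assuming PyTorch/torchvision (the particular backbone and the dummy batch are purely illustrative, not anything from the article):

    import torch
    import torchvision

    # Load an ImageNet-pretrained backbone and drop its classification head,
    # keeping only the learned feature space.
    backbone = torchvision.models.resnet50(pretrained=True)
    feature_extractor = torch.nn.Sequential(*list(backbone.children())[:-1])
    feature_extractor.eval()

    # Downstream vision tasks can reuse these features, e.g. by training a
    # small linear classifier on top of them with far less labeled data.
    with torch.no_grad():
        feats = feature_extractor(torch.randn(8, 3, 224, 224))  # dummy batch
    feats = feats.flatten(1)  # shape: (8, 2048)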

The ideas underlying foundation models certainly aren’t novel, but researchers in academia seem to have underestimated how effective large-scale multitask feature spaces can be, largely because they lacked access to large-scale data and compute.

There are actually several other ideas like this that large companies understand well but that haven’t made their way into academia yet. I suspect there will be more articles like this one claiming a paradigm shift, precisely because of that disconnect between industry and academia.



Sorry, but these models have unexplainable failure modes.

You can ask a 5-year-old "why did you do this?"; in fact, they will ask you that penetrating question all the time.

The models mentioned here will spew garbage if asked that question, or are simply unable to answer it at all. Putting something like that anywhere in a decision framework is dangerous in the extreme.

I would not call something that requires thousands of training examples effective.


You can ask most adults why they did something and not get a coherent answer either but ok, I’ll address your main point, that there are failure modes, instead of nitpicking your example.

First off, I said “researchers underestimated how effective” they are, not that they are perfect or that they are close to AGI.

Effective is defined as “successful in producing a desired or intended result”. To that end, these large-scale language models have certainly outperformed earlier models, well beyond researchers’ prior expectations.

If you read the paper closely, you’ll see that the point is not to solve every general capability (e.g. introspection) you’re talking about, but rather for the models to be fine-tuned for specific downstream tasks. There was never a claim that these models can do every task, so I don’t get your criticism.
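
For what it’s worth, here’s a rough sketch of what that downstream fine-tuning looks like, assuming the Hugging Face transformers library and PyTorch (the checkpoint name and toy labels below are just illustrative):

    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    # Start from a large pretrained language model and attach a small task head.
    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2)
    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

    # Fine-tune on the downstream task's labeled data; the pretrained weights
    # supply the general-purpose representation.
    batch = tokenizer(["great product", "terrible service"],
                      padding=True, return_tensors="pt")
    outputs = model(**batch, labels=torch.tensor([1, 0]))
    outputs.loss.backward()  # one illustrative gradient step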

Either way, I don’t think edge cases are completely avoidable, so systems that use these models have to have some error tolerance. I disagree that a model needs to explicitly answer the question “why did you do this?” in order to be useful.


>You can ask most adults why they did something and not get a coherent answer either but ok

This sort of cynicism about humans is very popular among certain AI researchers, but I can't say that it matches my experience. Leaving out instances where people are being deliberately dishonest about their motivations (which are quite common when you consider 'white lies' and other minor social lies), people do usually give coherent explanations of their actions.


Humans often give coherent _rationalizations_ of their actions, which is slightly different than an explanation.

From Wikipedia[1]: "the inventing of a reason for an attitude or action the motive of which is not recognized—an explanation which (though false) could seem plausible"

This is very common in children but also in adults: people will do something and then afterwards find reasons to support why they did it. Much of our modern industry is indeed built around this very concept.

[1] https://en.wikipedia.org/wiki/Rationalization_(psychology)


It certainly happens, but how common is it really? I think we tend to notice these cases much more than the boring everyday cases where people do things for straightforward reasons that they’re fully aware of.

Vocalized rationalizations (e.g. “I didn’t want that job anyway” from the Wikipedia article) often don’t reflect actually delusional beliefs about motivation. A person who says such a thing typically is aware that they really did want the job when they applied for it.

People will naturally try to find the best rational justification for their actions post hoc, but it doesn’t follow from this that they hold incorrect beliefs about their original motivations.


> Humans often give coherent _rationalizations_ of their actions, which is slightly different than an explanation.

At least those rationalizations are logically plausible as explanations; in most cases they may well be the actual explanation.

What's now called AI cannot, AIUI, convincingly come up with any "explanation" of its reasoning, simply because it hasn't done any actual reasoning.

I don't know if most fans of current "AI" really can't see the difference, or are just pretending. (If they really can't, one can't help but wonder if they are themselves current "AIs"...)



