But this again is where a confabulation-based architecture just isn't appropriate. I need an AI summarizer that I know isn't filling in gaps from random internet scans, or confidently confabulating over some inconsistency or something. And I say that independent of any particular thing it may have been trained on from the Internet. I say it because the whole point of a summarizer is to not inject anything else into the summary. (If it wanted to point out certain gaps, that could be useful, but I do not want them "filled in".)
Even if ChatGPT summarized these documents, you'd still have to go check the original documents, not for the usual reasons of "did the summary drop an important detail I care about" (intrinsic to the act of summarization, always debatable) but for the reason of "is this actually solely sourced from the summarized text or is this one juicy detail actually a confabulation?"
I think this is harder to do generally than it sounds.
The best kinds of facts are gleaned by crunching a lot of inputs / experience. For example, after decades of life, a wise human might have a strong sense that one approach is better than another, or that a certain pattern in life exists. But it's not really possible to point to the specific source of this idea, because it has a widely-spread base.
In the case of specific historic events, however, I would like a clever AI that can cite sources and separate its facts from its opinions.
No question it's hard to do, and I don't expect a human to do it absolutely perfectly, nor will I necessarily hold AIs to that standard either.
To be more concrete and in HN's wheelhouse: I do expect that if I ask a coworker to go study some technology and give me the highlights (which I in fact kind of did today), then when they come back with a summary of the key API calls, their parameters, and how those parameters relate to our business, they will not have simply made up a plausible-sounding API call name because the dice rolled in the sentence generator didn't quite land on the actual call name but picked the second-most-likely outcome instead. And then, having made up an API call, simply started confidently confabulating its parameters, what it does, etc., potentially spinning off into a world of fiction about this API call and related calls and what prerequisites those have, and so on.
The problem with confabulation-based tech is that even if it's accurate, you can't ever really know that. The tech itself doesn't "know" when it is confabulating, because it is always confabulating. It just so happens, with reasonably high probability, that if you poke it with a real question its maximum-probability confabulation will more-or-less resemble the truth, because that is the maximum probability of what it saw in its training data. That is, no joke, pretty amazing and cool. But it doesn't leave me wanting to trust the output of any such AI model.
Example: I just prompted ChatGPT with "I'm using a Go library called semago for managing my semaphores. What is an example of how it is used?" There is no such library. But it confabulated an entire library, attributed it to a specific GitHub user who does exist (and a quick scan of the repo says it's not an implausible attribution), and wrote a description of using the API that gives absolutely no hint it does not exist. Now, credit where credit's due, it's pretty impressive that the code snippet is a confabulation of a completely plausible library. Nevertheless, it is a complete confabulation, and the only real clue that it is one is precisely that I knew I was prompting for one. When one accidentally happens there is no clue in the text whatsoever.
(Then I told it it was wrong, and it helpfully linked me to a non-existent standard library type "sync.Semaphore from the standard library: This is a simple semaphore implementation that is based on the sync.Mutex type. It provides basic semaphore functionality, including the ability to acquire and release permits." which is completely wrong, a GitHub library that does exist but which it completely mischaracterized, and another non-existent GitHub library. I'm not upset, really. This is still impressive in its own way. But it is confabulation.)
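For contrast, here is roughly what a verifiable answer could have pointed to: Go's real weighted semaphore package, golang.org/x/sync/semaphore. This is a minimal sketch I'm adding for illustration, not anything ChatGPT produced, but every identifier in it exists and can be checked against the package docs:

    package main

    import (
        "context"
        "fmt"

        "golang.org/x/sync/semaphore"
    )

    func main() {
        ctx := context.Background()

        // A weighted semaphore with 3 permits.
        sem := semaphore.NewWeighted(3)

        // Acquire one permit; blocks until a permit is free or ctx is cancelled.
        if err := sem.Acquire(ctx, 1); err != nil {
            fmt.Println("acquire failed:", err)
            return
        }
        defer sem.Release(1)

        fmt.Println("doing work while holding one permit")
    }

The difference is that every claim in that snippet is checkable against a source that exists, which is exactly the property the confabulated answers lack.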
(And, full disclosure, I tried to get ChatGPT to describe the historical founding of the 27th state of the United States, "Morgontana", and it resolutely refused. The way in which it did so leads me to believe this is a rules-based special case added in over the underlying confabulation engine. Rules-based systems are ultimately a known dead end, though, and the way in which they are a dead end will only be made worse by trying to integrate them into a confabulation engine.)
Absolutely. I'm not against the tech, I'm trying to spread word of what it really is. And that definitely includes finding ways in which it is useful and awesome. When you want confabulation, ChatGPT is definitely ground-breaking, a noticeable improvement over GPT-3 even.
"CiteGPT and will be able to substantiate the claims it makes."
It'll have to be a fundamentally different architecture. "Maximum probability extension" just isn't going to do that. I'm not saying that's impossible. Presumably it is possible. Humans do it, very far from perfectly (I'm below average on this front myself, I think), but we do it at least partially, so it's clearly possible somehow. But it'll have to be something other than just a scaled up GPT model.
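To make "maximum probability extension" concrete, here is a toy sketch of the decoding rule I mean. It is entirely made up for illustration (the scores are invented, and it is not any real model's code): at each step, take whichever continuation scores highest, with nothing distinguishing a continuation that names something real from one that merely looks real.

    package main

    import "fmt"

    func main() {
        // Hypothetical, invented scores for the next chunk of text after a
        // prompt about Go semaphore libraries. Purely illustrative numbers.
        candidates := map[string]float64{
            "golang.org/x/sync/semaphore": 0.61, // real package
            "semago":                      0.27, // plausible-looking, does not exist
            "sync.Semaphore":              0.12, // sounds standard, does not exist
        }

        // "Maximum probability extension": pick the highest-scoring option.
        best, bestScore := "", 0.0
        for tok, score := range candidates {
            if score > bestScore {
                best, bestScore = tok, score
            }
        }

        // Nothing here checks whether "best" refers to anything that exists;
        // plausibility and truth look identical to this rule.
        fmt.Println("continuation:", best)
    }

A citing architecture would need a step this rule simply doesn't have: checking the chosen continuation against something outside the score table.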
This is the same problem as politics in the public discourse in general. Did that reporter drop the details that would actually drive my decision? You ultimately have to go back to the sources, and it all takes a tremendous amount of time.