This is an absurd argument that you're making just for the sake of argument, and you know it. There's a difference between someone getting things wrong and an LLM shamelessly making something up out of whole cloth.
No; I'm much more confident in ChatGPT's accuracy than I am in a random podcast's accuracy.
Podcasts tend to be relatively terrible, accuracy-wise. If the alternative is learning from ChatGPT, your odds of getting correct information are substantially higher.
Podcasts are entertaining! But not where I would go to learn anything.
The problem with ChatGPT isn't that it gets facts wrong, but that it is exactly what the name suggests: a large language model.
At one point I came across a series of "are CJK languages related" questions on Quora with cached ChatGPT responses[1], all grammatically correct and very natural, largely turboencabulators, sometimes contradicting themselves even within a single response.
Be aware that you're talking about Quora's implementation of ChatGPT here. As far as I know, the cached answers were generated with an incredibly outdated version, which is definitely not indicative of its current quality.
Even worse, I think they actually prime it with answers already posted on the thread, or even just related threads.
For example, one of the answers to the first question mentions the same Altaic root as ChatGPT's answer, and I've found multiple people who are seeing their own rephrased answers in the response.
If you preprompt ChatGPT with questionable data, then the answer quality will be massively degraded. I've noticed many times now that Bing will rephrase incorrect information or construct a very shallow summary out of unrelated articles when internet searches are allowed, but is able to generate a cohesive and detailed summary when they're disabled.
Throwing random answers - some contradicting each other and some talking about subtly different aspects of the topic - into a session without further guidance just isn't a great idea.
That part is correct, but not consistent with others.
My problem is not that GPTs are too often wrong; it's that they always prioritize syntax over facts, since they are _language_ models.
A well-formed bit of nonsense like "Colorless green ideas sleep furiously" will always look better to an LLM than a clumsy but true statement like "water wet is, yes definitely", simply because the former is syntactically valid, and that is problematic for many use cases, including podcast replacement. Sometimes we humans want the AI to just say "water is definitively wet"; people have tried to get there by training LLMs to treat factual statements as more likely, but that isn't a real solution, and it remains an architectural problem for these pseudo-AGI apps.
There are podcasts for almost every topic where experts are present. Journalists, scientists, activists, researchers, etc. can be heard on podcasts; I don't really see why it's generally a mistake to listen to a podcast to learn important information.
During the pandemic my partner was attending university from home and listening to their professors via MS Teams; these classes were also recorded so that they could listen to them at a later point. In some ways that's just a professional podcast.
Of course the part about university classes is different, but you seem to ignore everything else I've said.
There are tons of podcasts involving experts talking about their field of expertise, how can it be a mistake to listen to such podcasts to gather information?
It's a mistake because what drives podcasts is their popularity, not their accuracy.
There is no “sort by accuracy” button in any podcasting app, nor are they peer reviewed.
Furthermore, podcasts are not a review of the body of knowledge on a subject; they’re often a complete layperson interviewing a single member of a given field, at best. Almost never do the views of any individual actually represent any field as a whole.
So once we've thrown out the concepts of accuracy and completeness, ChatGPT fares exceedingly well in comparison. You'd do much worse than ChatGPT for idle-conversation-level accuracy.
What you just wrote makes more sense applied to LLM output than to podcasts! You could just as easily argue that "radio" or "news" is all bad if you don't want to differentiate between different forms of expression and communication within a medium. (Which, obviously, would be silly.)
Sorry, what? Nothing I wrote applies to LLMs; they are not optimized for popularity, they've been meticulously designed and built to be as accurate as possible.
No, they're not. If they were, they would default to a temperature of 0 with no top-p sampling or frequency/presence penalties, and frankly they wouldn't encode knowledge as a function of language to begin with. They're designed to be convincing as a "presence" and to output reasonable-sounding language in context, with accuracy as an afterthought.
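For what it's worth, if accuracy were the actual design goal you'd expect requests to default to something like the settings below. This is only a sketch with the OpenAI Python client; the model name is a placeholder, and the point is the deterministic, no-sampling-tricks parameters, not this particular call:

    # Sketch: what "accuracy-first" request settings would look like.
    # Model name is a placeholder; the deterministic settings are the point.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    resp = client.chat.completions.create(
        model="gpt-4o",                # placeholder model name
        messages=[{"role": "user", "content": "Is water wet?"}],
        temperature=0.0,               # no sampling randomness
        top_p=1.0,                     # no nucleus-sampling cutoff
        frequency_penalty=0.0,         # no steering away from repetition
        presence_penalty=0.0,          # no pressure to introduce new topics
    )
    print(resp.choices[0].message.content)

As far as I know the actual default temperature is 1.0, i.e. sampled, conversational output rather than determinism, which is exactly my point.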
That link doesn't say anything about the fundamental design goals of the network architecture or training process. It doesn't even mention factual correctness, except in the sense that it may broadly fall under "producing a desired output".
But that's the problem; you falsely believe you're capable of differentiating between propaganda and quality. You're not, on topics of which you are not an expert.
That's what flat-eartherism is; that's what Jewish space lasers are. The argument you're giving is a tacit endorsement of that kind of "inquiry", which for reasonable people is unconscionable.
Slightly off-topic/meta but this debate reminds me of one people had 20 years ago: Should you trust stuff you read on Wikipedia or not?
In the beginning people were skeptical but over time, as Wikipedia matured, the answer has become (I think): Don't blindly trust what you read on Wikipedia but in most cases it's sufficiently accurate as a starting point for further investigation. In fact, I would argue people do trust Wikipedia to a rather high degree these days, sometimes without questioning. Or at least I know I do, whether I want to or not, because I'm so used to Wikipedia being correct.
I'm wondering what this means for the future of LLMs: Will we also start trusting them more and more?
Sure, but that doesn't make it "random". I don't get why the top commenter was listening to podcasts knowing full well that they would be given possibly inaccurate facts. I have a hard time accepting that podcasts are audible Wikipedia; it's strange that such a take is the top-voted comment.
I listen to podcasts; they’re fun! But I am comfortable operating with incomplete information, and thus know how to treat a low quality source. I also chat with LLMs to better understand topics, and am better off than the folks here who can’t do that as a result.
The main thing I’m getting from this discussion is that a lot more very smart people seem to have deluded themselves into thinking knowledge is objective or “locked in” than I had initially realized. The desire for certainty is an extremely human thing, but it’s a dead end, intellectually.
Yea exactly, "ChatGPT replaced my podcasts" is kind of silly. People don't listen to the Joe Rogan Experience to learn facts about how a particular jiu-jitsu choke works.
I don't know what kind of podcasts the OP replaced with ChatGPT, but I call BS.
Is there a similar difference between an LLM getting things wrong and someone shamelessly making something up out of whole cloth? That's closer to what actually happens.
They definitely do, but I think with podcasts I generally have a better ability to evaluate how much trust I should place in what I'm hearing. I know that if I'm listening to a professional chef's podcast, I can probably trust they'll generally be right when talking about how they like to bake a turkey. If I'm listening to the hot wings guy interview Zac Efron, I know to be less blindly trusting of their info on astrophysics.
With ChatGPT I don't know its "experience" or "education" on a topic, and it has no social accountability motivating it to get things right, so I can't estimate how much I should trust it in the same way.
I don't know how people use ChatGPT at all. It confidently hallucinated answers to 4 out of 5 of my latest "real" questions, with code examples and everything. Fortunately, with code I could easily verify that the provided solutions were worthless. Granted, I was asking questions about my niche that were hard enough that I couldn't easily Google them or find a solution myself, but I think that's the bar for being useful. The only thing it got right was finding a marketing slogan.
I've done this and, from what I can tell, it is reasonably accurate. However, I did have an instance where I was asking it a series of questions about the First Peloponnesian War, and partway through our discussion it switched topics to the first part of the Peloponnesian War, which are different conflicts. At least, I think they are. It was quite confusing.
Nero was a Roman Emperor from 54 to 68 AD, known for his controversial and extravagant reign. He was the last emperor of the Julio-Claudian dynasty. Here are some key points about his life and rule:
1. *Early Life and Ascension*: Nero was born Lucius Domitius Ahenobarbus in 37 AD. He was adopted by his great-uncle, Emperor Claudius, becoming Nero Claudius Caesar Drusus Germanicus. He ascended to the throne at the age of 17, after Claudius' death, which many historians believe Nero's mother, Agrippina the Younger, may have orchestrated.
2. *Reign*: Nero's early reign was marked by influence from his mother, tutors, and advisors, notably the philosopher Seneca and the Praetorian Prefect Burrus. During this period, he was seen as a competent ruler, initiating public works and negotiating peace with Parthia.
3. *Infamous Acts*: As Nero's reign progressed, he became known for his self-indulgence, cruelty, and erratic behavior. He is infamously associated with the Great Fire of Rome in 64 AD. While it's a myth that he "fiddled while Rome burned" (the fiddle didn't exist then), he did use the disaster to rebuild parts of the city according to his own designs and erected the opulent Domus Aurea (Golden House).
4. *Persecution of Christians*: Nero is often noted for his brutal persecution of Christians, whom he blamed for the Great Fire. This marked one of the first major Roman persecutions of Christians.
5. *Downfall and Death*: Nero's reign faced several revolts and uprisings. In 68 AD, after losing the support of the Senate and the military, he was declared a public enemy. Facing execution, he committed suicide, reportedly uttering, "What an artist dies in me!"
6. *Legacy*: Nero's reign is often characterized by tyranny, extravagance, and debauchery in historical and cultural depictions. However, some historians suggest that his negative portrayal was partly due to political propaganda by his successors.
His death led to a brief period of civil war, known as the Year of the Four Emperors, before the establishment of the Flavian dynasty.
I also did a brief fact check of a few details here and they were all correct. Zero hallucinations.
Does this make sense? Notice how little it matters if my understanding of Nero is complete or entirely accurate; I’m getting a general gist of the topic, and it seems like a good time.
This is missing the broad concern with hallucination: you are putting your trust in something that delivers all results confidently, even the ones it predicted incorrectly. Your counter-argument is lack of trust in other sources (podcasts, the education system); however, humans, when they don't know something, generally say so, whereas LLMs will confidently output incorrect information. Knowing nothing about a certain subject, and (for the sake of argument) lacking research access, I would much rather trust a podcast specializing in a certain area than ask an LLM.
Put more simply: I would rather have no information than incorrect information.
I work in a field of tech history that is under-represented on Wikipedia but represented well in other areas of the internet and the web. It is incredibly easy to get ChatGPT to hallucinate information and give incorrect answers when asked very basic questions about this field, even though the field is discussed and covered quite accurately from the early days of Usenet all the way up to modern social media. Until the quality of the training data improves, I can never use ChatGPT for anything relating to this field, as I cannot trust its output.
I am continually surprised by how many people struggle to operate in uncertainty; I am further surprised by how many people seem to delude themselves into thinking that… podcasts… can provide a level of certainty that an LLM cannot.
In life, you exceptionally rarely have “enough” information to make a decision at the critical moment. You would rather know nothing than know some things? That’s not how the world works, not how discovery works, and not even how knowledge works. The things you think are certain are a lot less so than you apparently believe.
It may matter little to you that your understanding is not complete or entirely accurate, but some of my worst experiences have been discussing topics with people who think they have something insightful to add because they read a Wikipedia page or listened to a single podcast episode and decided that gave them a worthwhile understanding of something that actually takes years to fully appreciate. A little knowledge, and all of that. For one, you don't know what you're missing by omission.
Good thing nobody here is suggesting blind trust. The mistake being made here is thinking I'm suggesting LLMs are a good way to learn. What I am instead saying is that podcasts are not a good way to learn, and should be treated with the same level of skepticism one holds for an LLM response.
I appreciate your confidence and would love to know how far you would go with that guarantee! It makes me realize there is at least one avenue for some level of trust in GPT accuracy: my general awareness of how much written content on the topic it probably had access to during training.
I think maybe your earlier comment was about the average trustworthiness of all podcasts vs. the same for all GPT responses. I would probably side with GPT-4 in that context.
However, there are plenty of situations where the comparison is between a podcast from the best human in the world at something and a GPT that might have less training data, or where the risk for the topic isn't eating an uncooked turkey but learning CPR wrong or having an airbag not deploy.
There are zero podcasts from “the best person in the world”, the very concept is absurd.
No one person is particularly worth listening to individually, and as a podcast??? Good lord no.
LLMs beat podcasts when it comes to "random exploration of an unfamiliar topic", every single time.
The real issue here is that you trust podcasts so completely, by the way, not that ChatGPT is some oracle of knowledge. More generally, a skill all people need to develop is the ability to explore an idea without accepting whatever you first find. If you’re spending an afternoon talking with ChatGPT about a topic, you should be able to A) use your existing knowledge to give a rough first-pass validation of the information you’re getting, which will catch most hallucinations out of the gate, as they’re rarely subtle, and B) take what you learn with a hefty grain of salt, as if you’re hearing it from a stranger in a bar.
This is an important skill, and absolutely applies to both podcasts and LLMs. Honestly why such profound deference to podcasts in particular?
One thing that could happen, sooner rather than later, is that agents get deployed with access to a factual database of information relevant to the topic. So there could be a GPT of that day's NYT Daily with a highly ranked database of info on, say, Israel and Hamas; the text synthesis generates an overview, but the user can "ask" for clarifying details to drill down and get responses.
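A minimal sketch of what that drill-down flow could look like, assuming a hand-rolled keyword retriever over an in-memory set of articles. Every name here (ARTICLES, retrieve, answer, the ask_llm stub) is hypothetical, not any real product's API:

    # Hypothetical sketch of the "drill down" flow: rank a curated set of
    # articles by naive keyword overlap, stuff the best matches into a
    # prompt, and answer follow-up questions the same way.
    ARTICLES = [
        {"title": "Background briefing", "text": "..."},
        {"title": "Timeline of events", "text": "..."},
    ]

    def ask_llm(prompt: str) -> str:
        # Stand-in for whatever model call the agent actually uses.
        return "(model response would go here)"

    def retrieve(question: str, k: int = 2) -> list:
        # Naive ranking by word overlap between the question and each article.
        words = set(question.lower().split())
        ranked = sorted(
            ARTICLES,
            key=lambda a: len(words & set(a["text"].lower().split())),
            reverse=True,
        )
        return ranked[:k]

    def answer(question: str) -> str:
        context = "\n\n".join(
            f"{a['title']}:\n{a['text']}" for a in retrieve(question)
        )
        prompt = (
            "Answer using only the sources below; say so if they don't cover it.\n\n"
            + context + "\n\nQuestion: " + question
        )
        return ask_llm(prompt)

    # The overview and the drill-down are the same loop with narrower questions:
    print(answer("Give me an overview of the situation"))
    print(answer("What happened on the second day?"))

The interesting design question is the same one raised upthread: the model can only be as good as what the retriever hands it.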