I think "what you want to hear" is in the same spirit as an undergraduate regurgitating half-baked knowledge into an answer that will get them the grade.
Maybe a better rephrasing would be: "ChatGPT has been trained to give answers to questions in a clear, confident manner, regardless of the content."
> We trained this model using Reinforcement Learning from Human Feedback (RLHF), using the same methods as InstructGPT, but with slight differences in the data collection setup. We trained an initial model using supervised fine-tuning: human AI trainers provided conversations in which they played both sides—the user and an AI assistant.
So I agree with OP: it's been trained to give answers that sound plausible but are not necessarily correct. This is even mentioned in the "Limitations" section at the bottom of the blog post.
> ChatGPT sometimes writes plausible-sounding but incorrect or nonsensical answers. Fixing this issue is challenging, as: (1) during RL training, there’s currently no source of truth; (2) training the model to be more cautious causes it to decline questions that it can answer correctly; and (3) supervised training misleads the model because the ideal answer depends on what the model knows, rather than what the human demonstrator knows.
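For anyone who hasn't looked at what that supervised step actually optimises, here's a toy sketch (Hugging Face `transformers` with GPT-2 as a stand-in; the example data is made up, and this obviously isn't OpenAI's real pipeline):

```python
# Toy sketch of the supervised fine-tuning step described above.
# Hypothetical example data; not OpenAI's actual pipeline or dataset.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# A human trainer wrote both sides of this dialogue; the model is
# trained to imitate the assistant's reply, token by token.
conversation = (
    "User: What is the capital of France?\n"
    "Assistant: The capital of France is Paris."
)

inputs = tokenizer(conversation, return_tensors="pt")
# Standard causal-LM objective: predict each next token of the demonstration.
outputs = model(**inputs, labels=inputs["input_ids"])
outputs.loss.backward()  # one gradient step toward sounding like the demonstrator
```

Note that the loss only measures how closely the output matches what the human demonstrator wrote; there's no term for factual correctness anywhere, which is exactly the failure mode the Limitations excerpt describes.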
It outputs what some idealised version of a person wants to hear, where what counts as "idealised" has been determined by its training. I've noticed, for example, that it appears to have been trained to give responses that seem helpful and make you trust it. When it's outputting garbage code that doesn't work, it will often say things like "I have tested this and it works correctly", even though that's impossible.
Sometimes what I want to hear is "the thing that the typical person who writes about this sort of thing would say about my particular question". If I can figure out whether or not to trust an internet rando who may or may not know what they're talking about, I can figure it out when I'm talking to a simulated internet rando.
GPT is like that "know-it-all" friend we have who just has something to say about anything, with knowledge skimmed from the internet.
GPT is a language model. It outputs what you want to hear, not what is correct.