Doesn't this just mean that the LLM ingested training data where people talk about banning controversial, propaganda-style newspapers, while nobody talks about banning the NYT or WaPo?
I think if people took the time to understand how LLMs assign probabilities to next tokens based on patterns in their training data, they would understand that these results are somewhat deterministic.
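To make the "somewhat deterministic" point concrete: with greedy decoding, the model always emits the highest-probability token, so the same prompt produces the same output. A toy sketch (the token names and logit values here are made up, not from any real model):

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw scores (logits) into a probability distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits a model might assign to candidate next tokens
vocab = ["ban", "keep", "read"]
logits = [2.0, 1.0, 0.1]

probs = softmax(logits)
# Greedy decoding picks the argmax, so repeated runs on the same
# prompt give the same answer -- the "bias" is baked into the weights.
print(vocab[probs.index(max(probs))])  # → ban
```

Sampling with a nonzero temperature adds randomness, but the skew in the underlying distribution (learned from the training data) is still what drives which answers come out most often.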
Instead, the preferred heuristic is to look for a bogeyman.
But the base model, when it's trained on the whole internet, will have some extreme biases on topics where one side is large and vocal and the other side is mostly silent. So RLHF is an attempt to correct for the biases on the internet.