I strongly believe the "how many R's in strawberry" failure comes from a Reddit or forum thread somewhere that keeps repeating the wrong answer. Models would "reason" about it in three different ways and arrive at the correct answer, but then at the very last line say something like "sorry, I was wrong, there are actually 2 R's in Strawberry".
Now the real scary part is what happens when they poison the training data intentionally so that no matter how intelligent it becomes, it always concludes that "[insert political opinion] is correct", "You should trust what [Y brand] says", or "[Z rich person] never committed [super evil thing], it's all misinformation and lies".