I think there's more nuance. It's hard to apply tests designed for humans to a model that can remember most of the useful text on the internet.
Imagine giving a human with a condition that leaves them without theory of mind weeks of role-play training about theory of mind tests, then trying to test them. What would you expect to see? Personally, I'd expect something similar to ChatGPT's output: success on common questions, with failures becoming more likely the further a test diverges from the familiar formula.