The author’s inability to imagine a model that’s superficially useful but danger... | Hacker News


throwuxiytayq 8 days ago | on: Alignment is capability

The author’s inability to imagine a model that’s superficially useful but dangerously misaligned betrays their lack of awareness of incredibly basic AI safety concepts that are literally decades old.

theptip 8 days ago

Exactly. Building a model that truly understands humans and their intentions, and that generally acts with professionalism if not compassion, is the Easy Problem of Alignment.

Starting points:

https://www.lesswrong.com/posts/zthDPAjh9w6Ytbeks/deceptive-...

https://www.lesswrong.com/w/sharp-left-turn
