Hacker News

LLMs are like children; telling them to not do something puts the idea in their 'head'.

Instead, telling them what to do works better: "Brevity is appreciated", or "Conserve tokens and be concise."
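A minimal sketch of the tip above, contrasting negatively and positively framed prompts. The `is_positively_framed` helper is hypothetical, just a crude illustration, not anything a real prompting library provides:

```python
# Negative framing -- plants the very idea ("long-winded answers")
# you want the model to avoid:
negative_prompt = "Do not write long-winded answers."

# Positive framing -- states the desired behavior directly:
positive_prompt = "Brevity is appreciated. Be concise."

def is_positively_framed(prompt: str) -> bool:
    """Crude check: a positively framed instruction contains no negations."""
    negations = ("do not", "don't", "never", "avoid")
    return not any(n in prompt.lower() for n in negations)

print(is_positively_framed(negative_prompt))  # False
print(is_positively_framed(positive_prompt))  # True
```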



It’s called the Waluigi effect, and it is also part of the reason why you can never fully “censor” an LLM: there is always some jailbreak possible.



