
Yes, assuming the input is valid UTF-8.

UTF-8 bytes less than 0x80 are only used for the first 128 Unicode codepoints (the ASCII range). Characters that need multiple bytes in their UTF-8 representation encode into bytes that all have the high bit set, i.e. every byte is >= 0x80: https://en.wikipedia.org/wiki/UTF-8#Encoding
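
A quick way to check this for yourself is a small Python sketch. The function name and the sample string are made up for illustration, and it only spot-checks the characters you give it rather than proving the property for all of Unicode:

```python
def low_bytes_only_from_ascii(text: str) -> bool:
    """Return True if, for every character in text, bytes below 0x80
    appear in its UTF-8 encoding only when the character itself is ASCII."""
    for ch in text:
        encoded = ch.encode("utf-8")
        if ord(ch) < 0x80:
            # ASCII codepoints encode to one byte with the same value.
            if encoded != bytes([ord(ch)]):
                return False
        else:
            # Lead and continuation bytes must all have the high bit set.
            if any(b < 0x80 for b in encoded):
                return False
    return True


# Hypothetical sample: two ASCII characters plus 2-, 3- and 4-byte characters.
sample = "a/é€😀"
for ch in sample:
    print(f"U+{ord(ch):04X}", [hex(b) for b in ch.encode("utf-8")])
print(low_bytes_only_from_ascii(sample))
```

So a byte such as 0x2F ('/') found in valid UTF-8 data is always a real '/', never a fragment of some longer character.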


