
> Normalization helps somewhat, though it's another expense at string creation time

And it needs to be handled with care: there are edge cases where it can bite you. For example, if Unicode is being used internally to process data in a legacy CJK encoding, normalisation may lose distinctions that are needed for accurate round-trip conversion, because compatibility ideographs that exist only to preserve those distinctions get folded into their unified equivalents.
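A minimal sketch of that loss, assuming Python's standard `unicodedata` module. U+F900 is a CJK compatibility ideograph kept in Unicode for round-tripping with a legacy Korean character set; the variable names are purely illustrative:

```python
import unicodedata

# U+F900 is a CJK compatibility ideograph; it canonically decomposes
# to the unified ideograph U+8C48.
compat = "\uf900"
unified = "\u8c48"

# Before normalisation the two are distinct code points.
assert compat != unified

# NFC (and NFD) replace the compatibility ideograph with the unified one,
# so the original code point can no longer be recovered.
normalised = unicodedata.normalize("NFC", compat)
distinction_lost = normalised == unified
print(distinction_lost)
```

Once both characters have collapsed to the same code point, a converter back to the legacy encoding has no way to tell which of the original byte sequences to emit.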

Another gotcha is that simply concatenating two already-normalised strings may give you a result that is not normalised, since a combining mark at the start of the second string can interact with the last character of the first.
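A small illustration of the concatenation case, again assuming Python's `unicodedata`; the helper `is_nfc` is a hypothetical name defined here, not a library function:

```python
import unicodedata

def is_nfc(s: str) -> bool:
    """Return True if s is unchanged by NFC normalisation."""
    return unicodedata.normalize("NFC", s) == s

left = "e"        # plain ASCII letter, already NFC
right = "\u0301"  # COMBINING ACUTE ACCENT on its own, also already NFC

joined = left + right

print(is_nfc(left), is_nfc(right))  # both inputs are normalised
print(is_nfc(joined))               # the concatenation is not
# NFC composes the pair into the single precomposed character U+00E9.
print(unicodedata.normalize("NFC", joined) == "\u00e9")
```

So a string type that promises "always normalised" has to re-check (at least around the join point) on every concatenation, not just at creation.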


