Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Protobuf string fields are expected to be in UTF-8. Although I'd expect a sane implementation of a protobuf decoder to throw an exception or otherwise indicate an error if it receives a malformed protobuf encoding.


Protobuf decoders are expected to validate UTF-8 strings for syntax="proto3" files, but not syntax="proto2". The behavior diverges mostly for historical reasons.

This is a validation pass only and it doesn't make any meaning of the code points, except to validate that none of them are surrogate code points (disallowed in UTF-8).




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: