Protobuf string fields are expected to be in UTF-8. Although I'd expect a sane i...

haberman · on Dec 21, 2021

Protobuf decoders are expected to validate UTF-8 strings for syntax="proto3" files, but not syntax="proto2". The behavior diverges mostly for historical reasons.

This is a validation pass only: it attaches no meaning to the code points, except to check that none of them are surrogate code points (which are disallowed in UTF-8).
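
To make the "validation only" point concrete, here is a minimal Python sketch. It is not protobuf's actual implementation; `is_valid_utf8` is a hypothetical helper that leans on Python's strict UTF-8 decoder, which rejects malformed sequences and encoded surrogates but otherwise does not care what the code points mean.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if `data` is well-formed UTF-8.

    Illustration only, assuming Python's strict decoder is a fair
    stand-in for the kind of check a proto3 decoder performs.
    """
    try:
        data.decode("utf-8")  # strict mode: raises on any malformed sequence
    except UnicodeDecodeError:
        return False
    return True


# Ordinary text passes.
print(is_valid_utf8("héllo".encode("utf-8")))
# An encoded surrogate (U+D800 as ED A0 80) is rejected.
print(is_valid_utf8(b"\xed\xa0\x80"))
# A noncharacter such as U+FFFF is still well-formed, so it passes.
print(is_valid_utf8(b"\xef\xbf\xbf"))
```

The last case is the point of the sketch: the check is purely about syntax, so a code point with no useful meaning still goes through as long as it is encoded correctly.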