
I actually went through all projects listed in [1] because I remember this very quirk. It turns out that there are many such libraries that have two variants of encode/decode functions, where the second variant prepends a varint length. In my brief inspection there do exist a few libraries with only the second variant (e.g. Rust quick-protobuf), which is legitimately problematic [2].

But if the project in question was indeed protobuf.js (see loeg's comments), it clearly distinguishes encode/decode vs. encodeDelimited/decodeDelimited. So I believe the project should not be blamed, and the better question would be why so many people chose to add this exact helper. Well, because Google itself also had the same helper [3] [4]! So at this point protobuf should just standardize this simple framing format [5], instead of claiming that protobuf has no obligation to define one.

[1] https://github.com/protocolbuffers/protobuf/blob/main/docs/t...

[2] https://github.com/tafia/quick-protobuf/issues/130

[3] https://protobuf.dev/reference/java/api-docs/com/google/prot...

[4] https://github.com/protocolbuffers/protobuf/blob/main/src/go...

[5] Use an explicitly different name though, so that the meaning of "encoding/decoding protobuf messages" doesn't change.
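To make the framing concrete, here is a minimal sketch of the "varint length, then message bytes" scheme that the delimited helpers above are described as using. The function names (`encode_varint`, `read_varint`, `write_delimited`, `read_delimited`) are made up for illustration and are not any library's API; payloads are treated as opaque, already-serialized bytes.

```python
import io


def encode_varint(n):
    """Encode a non-negative integer as a base-128 varint (low 7 bits first)."""
    if n < 0:
        raise ValueError("length must be non-negative")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # continuation bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def read_varint(stream):
    """Read one varint; return None on a clean EOF before its first byte."""
    result = 0
    shift = 0
    while True:
        b = stream.read(1)
        if not b:
            if shift == 0:
                return None
            raise EOFError("truncated varint")
        result |= (b[0] & 0x7F) << shift
        if not b[0] & 0x80:
            return result
        shift += 7


def write_delimited(stream, payload):
    """Write the payload length as a varint, followed by the payload itself."""
    stream.write(encode_varint(len(payload)))
    stream.write(payload)


def read_delimited(stream):
    """Yield each length-prefixed payload until the stream is exhausted."""
    while True:
        length = read_varint(stream)
        if length is None:
            return
        payload = stream.read(length)
        if len(payload) != length:
            raise EOFError("truncated message")
        yield payload


buf = io.BytesIO()
write_delimited(buf, b"\x08\x01")   # a tiny serialized message
write_delimited(buf, b"x" * 300)    # length 300 needs a two-byte varint
buf.seek(0)
messages = list(read_delimited(buf))
```

The reader never looks inside a payload; it relies only on the prefix, which is why both sides have to agree on the exact prefix encoding.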



Yep, and this variant of the encoding is documented at https://protobuf.dev/programming-guides/techniques/#streamin...

Definitely seems to be a routine addition to the standard, supported by many libraries.


> Yep, and this variant of the encoding is documented at [...]

It only suggests the length prefix and doesn't define the exact encoding at all.


This is not a "variant". A top-level message has no length on the wire. Implementations that add a length are free to do it however they like, but that is not part of the format, because the format itself has no concept of a "sequence of messages". It has lists (repeated fields), but those need to be inside messages.
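A small sketch of why the missing top-level length matters: if you simply concatenate two serialized messages, a parser has no boundary to find and reads them as one message, with later scalar values overwriting earlier ones. The parser below is a toy that only understands varint (wire type 0) fields, and `parse_varint_fields` is a hypothetical name, not a real library function.

```python
def _read_varint(data, pos):
    """Decode one varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(data):
            raise ValueError("truncated varint")
        b = data[pos]
        pos += 1
        result |= (b & 0x7F) << shift
        if not b & 0x80:
            return result, pos
        shift += 7


def parse_varint_fields(data):
    """Parse a message made only of varint fields into {field_number: value}.

    A repeated occurrence of the same field overwrites the earlier one,
    mirroring protobuf's last-one-wins rule for scalar fields.
    """
    fields = {}
    pos = 0
    while pos < len(data):
        key, pos = _read_varint(data, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type != 0:
            raise ValueError("this sketch only handles varint fields")
        value, pos = _read_varint(data, pos)
        fields[field_number] = value
    return fields


first = bytes([0x08, 0x01])    # field 1 = 1
second = bytes([0x08, 0x02])   # field 1 = 2

separate = [parse_varint_fields(first), parse_varint_fields(second)]
merged = parse_varint_fields(first + second)   # boundary is lost
```

Parsed separately these are two messages; parsed after concatenation they collapse into a single message whose field 1 is 2.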

And since some implementations do it the same way, it sometimes works. The two typical approaches I have found in the wild are: use a varint as a length prefix, or do nothing at all. The second typically means that users who want to send a sequence of messages need to get creative and invent some way of connecting them together.

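gRPC is the first kind in that it frames every message with a length prefix, although its prefix is a 1-byte compression flag plus a 4-byte big-endian length rather than a varint. So everyone using gRPC rather than straight-up Protobuf is shielded from this problem.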
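For comparison, a sketch of gRPC-style message framing as I understand it from the gRPC over HTTP/2 protocol description: each message is preceded by a 1-byte compressed flag and a 4-byte big-endian length, so the prefix is fixed-width rather than a varint. `grpc_frame` and `grpc_unframe` are illustrative names, not functions from any gRPC library, and compression itself is not implemented here.

```python
import struct

HEADER = ">BI"                           # 1-byte flag, 4-byte big-endian length
HEADER_SIZE = struct.calcsize(HEADER)    # 5 bytes, no padding


def grpc_frame(payload, compressed=False):
    """Prefix a serialized message with the compressed flag and its length."""
    return struct.pack(HEADER, 1 if compressed else 0, len(payload)) + payload


def grpc_unframe(data):
    """Yield (compressed, payload) pairs from a buffer of back-to-back frames."""
    pos = 0
    while pos < len(data):
        if len(data) - pos < HEADER_SIZE:
            raise ValueError("truncated frame header")
        flag, length = struct.unpack_from(HEADER, data, pos)
        pos += HEADER_SIZE
        payload = data[pos:pos + length]
        if len(payload) != length:
            raise ValueError("truncated message")
        pos += length
        yield bool(flag), payload


wire = grpc_frame(b"\x08\x01") + grpc_frame(b"\x08\x02")
frames = list(grpc_unframe(wire))
```

Either prefix solves the boundary problem; the point is that a varint-delimited stream and a gRPC-framed stream are not interchangeable even though both are "length-prefixed".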



