thrift-user mailing list archives

From Randy Abernethy
Subject Re: Binary data (de)serializing
Date Wed, 02 Oct 2013 17:55:43 GMT
Hello fogbit,

You mention .proto files; I take it you mean .thrift files.

The binary type implies a sequential block of memory and is probably the
best way to go for your binary buffer. A list does not imply a sequential
block of memory in most environments; rather, it implies a data structure
(with fast insert times if it is mutable). So from a semantic standpoint I
think you really do want binary.

From an implementation standpoint I think binary is still the best choice.
The C++ generator uses std::string for binary types and Python uses string
as well. C++03 strings are almost always linear (contiguous) and in C++11
they must be (like vector). An Apache Thrift list will give you a vector in
C++ but in Python you will get a list. Strings and vectors are effectively
identical in most C++ implementations but strings and lists are quite
different in Python.

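To make the Python side of that difference concrete, here is a minimal sketch. It does not use Thrift at all: it assumes Python 3, where a binary field would arrive as a bytes object (an assumption about the generated code, worth checking against your own output), and the payload contents are made up for illustration.

```python
import sys

# Hypothetical payload: arbitrary bytes, including NUL and values above 0x7F.
payload = bytes([0x00, 0xFF, 0x10, 0x80] * 256)  # 1024 bytes

# A byte string is a single contiguous buffer; memoryview exposes it
# without copying.
view = memoryview(payload)
is_contiguous = view.contiguous

# Roughly what list<byte> looks like on the Python side: a list holding
# one integer object per byte. Thrift's byte is signed, so map 0..255
# to -128..127.
as_list = [b - 256 if b > 127 else b for b in payload]

# Container sizes only; the list figure excludes the int objects it holds.
bytes_size = sys.getsizeof(payload)
list_size = sys.getsizeof(as_list)

# Going back to a buffer requires touching every element.
round_trip = bytes(b & 0xFF for b in as_list)

print(is_contiguous, bytes_size, list_size, round_trip == payload)
```

The exact sizes depend on the interpreter build, but the list container alone should come out several times larger than the byte string, and converting between the two forms costs a pass over every element.
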
Hope this helps.


On Wed, Oct 2, 2013 at 2:39 AM, fogbit wrote:

> Hi.
> I need to (de)serialize a buffer containing binary data, specifically
> between C++ (sender/receiver) and Python (receiver) applications. What data
> type should I use in a .proto file?
> There is already a question at StackOverflow
> and there is advice to use the "binary" type. But at the same time the
> manual says "This is currently a specialized form of the string type above,
> added to provide better interoperability with Java. The current
> plan-of-record is to elevate this to a base type at some point." Also, from
> what I read in the generated sources, in C++/Python the "binary" data type
> is transformed into the "string" type, while "list<byte>" becomes
> std::vector<uint*_t>/[].
> I haven't checked it yet, but I suspect that in the general case string
> deserialization could be slower than a plain vector due to a check of the
> locale or something.
> Also, in my opinion, it's "ideologically" right to use a list of bytes when
> one wants to store... a list of bytes, which is what a binary buffer really
> is.
> So, what data type should I use: "binary" or "list<byte>"?
> Thanks!
