thrift-user mailing list archives

From David Reiss <dre...@facebook.com>
Subject Re: fastbinary.c utf8 support
Date Fri, 03 Sep 2010 19:13:00 GMT
That sounds fine.  If you mean wiki documentation, feel free to update it.
If you mean something in the source code, you can send us a diff.

--David

On 09/03/2010 12:10 PM, Leo Kim wrote:
> Okay, reading THRIFT-395 gives me an understanding of how fraught yet
> rigorously discussed this issue has been. I'm wondering if it may be
> worth underscoring this in the Thrift tutorial examples or other such
> documentation.
> 
> On Fri, Sep 3, 2010 at 12:13 PM, David Reiss <dreiss@facebook.com> wrote:
>>> Are users of fastbinary.c expected to use ASCII encoding exclusively?
>> No, you can use any encoding you want.  Just pass str objects into
>> Thrift, rather than Unicode objects.
>>
>> I wrote a patch to make it possible to *write* Unicode strings to fastbinary,
>> just not read them.  It's at https://issues.apache.org/jira/secure/attachment/12404198/0003-THRIFT-395.-python-Phase-Two-of-support-for-unicode.patch
>> Feel free to comment on that issue if you want that feature.
>>
>> --David
>>
>> On 09/03/2010 09:03 AM, Leo Kim wrote:
>>> Are users of fastbinary.c expected to use ASCII encoding exclusively? The code
>>> generator has the option to force UTF8 encoding/decoding for strings, so one could argue
>>> that fastbinary.c should have an analog version that works only on UTF8 encoded strings.
>>>
>>> On Sep 2, 2010, at 11:49 PM, David Reiss <dreiss@facebook.com> wrote:
>>>
>>>> We don't actually use the "UTF8" field type.  That should probably be
>>>> removed entirely.  Unfortunately, there is currently no way for the
>>>> accelerator module to determine whether the user wants a given value
>>>> to be decoded into a unicode object or returned as a str.
>>>>
>>>> --David
>>>>
>>>> On 09/02/2010 07:26 PM, Leo Kim wrote:
>>>>> Hello,
>>>>>
>>>>> I didn't see utf8 support in fastbinary.c in thrift-0.4.0, so I hacked
>>>>> something in. I'm not a Python C API expert (nor a unicode expert),
>>>>> but the attached patch appears to work when sending utf8 encoded
>>>>> strings whereas without the patch I'd encounter the
>>>>> "UnicodeDecodeError: 'ascii' codec can't decode byte ..." error.
>>>>>
>>>>> I offer it to this mailing list for review as I'm interested in
>>>>> feedback regarding correctness and general interest in this patch.
>>>>>
>>>>> thx
>>>>> leo
>>>
>>
> 
> 
> 
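
For readers landing on this thread: the advice above ("pass str objects into Thrift, rather than Unicode objects") amounts to doing the encoding and decoding in application code, on either side of the serializer. The sketch below is only an illustration of that idea. `to_wire` and `from_wire` are hypothetical helper names, not part of Thrift, and the choice of UTF-8 is an assumption. In Python 2 terms, a byte string is `str` and text is `unicode`; in Python 3 they are `bytes` and `str`. The last few lines reproduce the same class of error quoted in the original message by decoding non-ASCII bytes with the ASCII codec.

```python
# -*- coding: utf-8 -*-
# Hypothetical helpers (not part of Thrift): encode text yourself before
# handing it to the serializer, and decode it yourself after reading.


def to_wire(text, encoding="utf-8"):
    """Return a byte string suitable for storing in a Thrift string field."""
    if isinstance(text, bytes):
        return text  # already a byte string; pass it through untouched
    return text.encode(encoding)


def from_wire(data, encoding="utf-8"):
    """Decode a byte string read from a Thrift string field back into text."""
    return data.decode(encoding)


name = u"Zo\u00eb"              # text containing one non-ASCII character
payload = to_wire(name)         # explicit UTF-8 encoding
assert payload == b"Zo\xc3\xab"
assert from_wire(payload) == name

# Decoding the same bytes as ASCII raises UnicodeDecodeError.
try:
    payload.decode("ascii")
except UnicodeDecodeError as exc:
    print(exc)
```

Keeping the conversion at the application boundary means the accelerated module only ever sees byte strings, which matches the constraint described above that it cannot tell whether a value should come back as text or as bytes.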
