hadoop-common-issues mailing list archives

From "Owen O'Malley (JIRA)" <j...@apache.org>
Subject [jira] Resolved: (HADOOP-6883) Text.toString violates its abstraction
Date Tue, 27 Jul 2010 15:17:19 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-6883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Owen O'Malley resolved HADOOP-6883.

    Resolution: Invalid

The proper call is:

b64.decode(val.getBytes(), 0, val.getLength());

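Yes, it is confusing, but getBytes() returns the backing array without copying it, and doing
anything else would not perform acceptably. If you look at the javadoc for getBytes(), you'll
see why your call fails: only the bytes up to getLength() are valid.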
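
To illustrate the point above, here is a minimal sketch, not Hadoop code. ReusableBuffer is a
hypothetical stand-in that models only the documented Text behavior (getBytes() returns the
backing array, of which only the first getLength() bytes are valid). It assumes the array is
reused when a shorter value is set, and it uses java.util.Base64 in place of the b64 object
from the report, whose library is not named in the issue.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class TextBytesSketch {

    /** Hypothetical stand-in for Text: keeps one backing array across set() calls. */
    public static final class ReusableBuffer {
        private byte[] bytes = new byte[0];
        private int length = 0;

        public void set(byte[] src) {
            if (bytes.length < src.length) {
                bytes = new byte[src.length];   // grow only when the new value does not fit
            }
            System.arraycopy(src, 0, bytes, 0, src.length);
            length = src.length;                // bytes past this index may be stale
        }

        public byte[] getBytes() { return bytes; }   // whole backing array, no copy
        public int getLength() { return length; }    // number of valid bytes
    }

    /** Decodes only the valid region [0, getLength()) without copying the input. */
    public static byte[] decodeValid(ReusableBuffer val) {
        ByteBuffer in = ByteBuffer.wrap(val.getBytes(), 0, val.getLength());
        ByteBuffer out = Base64.getDecoder().decode(in);
        byte[] result = new byte[out.remaining()];
        out.get(result);
        return result;
    }

    public static void main(String[] args) {
        ReusableBuffer val = new ReusableBuffer();
        // A long value followed by a short one leaves stale bytes in the backing array.
        val.set(Base64.getEncoder().encode(
                "a much longer first record".getBytes(StandardCharsets.UTF_8)));
        val.set(Base64.getEncoder().encode("short".getBytes(StandardCharsets.UTF_8)));

        System.out.println("backing array length: " + val.getBytes().length);
        System.out.println("valid length:         " + val.getLength());
        System.out.println("decoded:              "
                + new String(decodeValid(val), StandardCharsets.UTF_8));
    }
}
```

Passing the whole backing array to the decoder would include the stale tail, which is the
likely reason the unbounded call in the report produced mangled output. toString() avoids the
problem because it builds the String from the valid region only.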

> Text.toString violates its abstraction
> --------------------------------------
>                 Key: HADOOP-6883
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6883
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.20.1
>         Environment: Linux
>            Reporter: Gordon Sommers
> I stumbled upon this when encoding a Google protocol buffer in base64 and storing it
> in a Text object for serialization. Compare the following two lines:
> byte[] decoded = b64.decode(val.getBytes());
> // This does not return the same bytes as the line below; the result, after the base64
> // decodes successfully, is a very mangled protocol buffer.
> byte[] decoded = b64.decode(val.toString().getBytes());
> // YES, toString() FIXES IT
> Elsewhere in my code I also have:
> Text curline = new Text(values.next().toString());
> byte[] raw = base64.decode(curline.getBytes());
> // This does work.
> It looks like the Text object must be toString'd (just once, somewhere, even if it's later
> repacked in a Text) before it will have the proper byte representation. I would classify this
> as a leaky abstraction and ask that the cause be isolated and the API fixed somehow,
> so that other developers don't have to spend 3 days figuring out when Text.getBytes() isn't
> returning the right bytes even though Text.toString() prints exactly the right string
> representation and Text.toString().getBytes() does return the right bytes.
