thrift-user mailing list archives

From: "Jens Geyer" <jensge...@hotmail.com>
Subject: Re: Getting an Int64 from HBase using the Thrift npm
Date: Tue, 18 Mar 2014 00:51:42 GMT

Ok, one more (last for today) try:

>  var str = this.inBuf.toString('utf8', ...)

This of course decodes (not encodes) utf-8 encoded data from the buffer.

Using trunk, I generated the nodejs code for hbase.thrift and found no
reference to readString(), only to readBinary(), which is IMHO absolutely
correct here:

TCell.prototype.read = function(input) {
  input.readStructBegin();
  while (true)
  {
    var ret = input.readFieldBegin();
    var fname = ret.fname;
    var ftype = ret.ftype;
    var fid = ret.fid;
    if (ftype == Thrift.Type.STOP) {
      break;
    }
    switch (fid)
    {
      case 1:
      if (ftype == Thrift.Type.STRING) {
        this.value = input.readBinary();
      } else {
        input.skip(ftype);
      }
      break;
      case 2:
      if (ftype == Thrift.Type.I64) {
        this.timestamp = input.readI64();
      } else {
        input.skip(ftype);
      }
      break;
      default:
        input.skip(ftype);
    }
    input.readFieldEnd();
  }
  input.readStructEnd();
  return;
};

So what Thrift version are you using?

Furthermore, could it be related to THRIFT-1679?

Good night,
JensG


-----Original Message-----
From: Jens Geyer
Sent: Tuesday, March 18, 2014 1:14 AM
To: user@thrift.apache.org
Subject: Re: Getting an Int64 from HBase using the Thrift npm

I was probably wrong in what I said. I think the problem is this:

>  var str = this.inBuf.toString('utf8', this.readCursor, this.readCursor + len);

You convert the data to UTF-8 here.

> Basically, any time a byte (two hex digits) is higher than \x7F, the
> value I get in JavaScript is 'ef bf bd'

0xEF is one of the lead bytes that start a 3-byte UTF-8 sequence, see
http://en.wikipedia.org/wiki/UTF-8

JensG
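
The 'ef bf bd' pattern can be reproduced outside of Thrift. The sketch below is not taken from the Thrift sources; it assumes a current Node.js where Buffer.from() is available (in 2014 this would have been new Buffer(...)), and the helper name utf8RoundTrip is made up for illustration. The bytes 0xA6 and 0x94 are UTF-8 continuation bytes with no lead byte in front of them, so the decoder replaces each one with U+FFFD, the replacement character, whose UTF-8 encoding is EF BF BD.

```javascript
// Bytes as shown in the HBase shell: \x00\x00\x00\x00\x00\x01\xA6\x94
const raw = Buffer.from([0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0xa6, 0x94]);

// Hypothetical helper: decode as UTF-8 (as readString() does), then
// re-encode the resulting string so the damage can be inspected as bytes.
function utf8RoundTrip(buf) {
  const str = buf.toString('utf8'); // invalid bytes become U+FFFD
  return Buffer.from(str, 'utf8');  // U+FFFD is encoded as EF BF BD
}

const mangled = utf8RoundTrip(raw);
console.log(raw.toString('hex'));     // 000000000001a694
console.log(mangled.toString('hex')); // 000000000001efbfbdefbfbd
```

Once the string has been built, the original bytes cannot be recovered, because every invalid byte maps to the same replacement character.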




-----Original Message-----
From: Jens Geyer
Sent: Tuesday, March 18, 2014 12:50 AM
To: user@thrift.apache.org
Subject: Re: Getting an Int64 from HBase using the Thrift npm

Sounds much like the problem found in
https://issues.apache.org/jira/browse/THRIFT-2336?focusedCommentId=13889924

Could that be the case?

That reminds me that we still need some fixes there ...



-----Original Message-----
From: Jérémie Pinard Saint-Pierre
Sent: Monday, March 17, 2014 10:10 PM
To: user@thrift.apache.org
Subject: Getting an Int64 from HBase using the Thrift npm

Hello,

I am having some issues reading int64 values from an HBase table using
the Thrift npm package for nodejs.

In HBase, a TCell is defined as containing two fields: an int64 timestamp
and a byte array called value.

However, when a TCell is read by the thrift npm in the
thrift/lib/thrift/transport.js file, it is interpreted as a utf8 string
rather than a byte array, and some values seem to get lost in the process:

readString: function(len) {
  this.ensureAvailable(len)
  var str = this.inBuf.toString('utf8', this.readCursor, this.readCursor + len);
  this.readCursor += len;
  return str;
},

For example, when I look at my row in the HBase shell, I see

value=\x00\x00\x00\x00\x00\x01\xA6\x94

When I fetch it from nodejs, I get 00 00 00 00 00 01 ef bf bd ef bf bd

Basically, any time a byte (two hex digits) is higher than \x7F, the value
I get in JavaScript is 'ef bf bd'

Also, in the code snippet from transport.js, if I interpret the data stream
as binary instead of utf8, then the value is passed correctly to my code:

var str = this.inBuf.toString('binary', this.readCursor, this.readCursor + len);
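
A small sketch of why the 'binary' encoding behaves differently, assuming a current Node.js where 'binary' is an alias for 'latin1' and Buffer.from() exists. The names cellBytes and binaryRoundTrip are illustrative only and do not come from transport.js. With this encoding every byte maps to exactly one character with the same code, so no byte is rejected as invalid.

```javascript
// Same bytes as in the HBase shell output above.
const cellBytes = Buffer.from([0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0xa6, 0x94]);

// Hypothetical helper: decode with 'binary' (latin1) and encode back.
function binaryRoundTrip(buf) {
  const str = buf.toString('binary'); // one character per byte
  return Buffer.from(str, 'binary');  // one byte per character
}

const binaryStr = cellBytes.toString('binary');
console.log(binaryStr.length);                // 8, one character per byte
console.log(binaryStr.charCodeAt(6) === 0xa6); // true, the byte value is kept
console.log(binaryRoundTrip(cellBytes).equals(cellBytes)); // true
```

The resulting string is still not text in any useful sense; it is just a lossless carrier for the bytes, which the caller has to turn back into a Buffer before interpreting them.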

I guess that making this change, however, would imply that the clients of
the npm module would need to cast their own strings on a per-column basis.

So I guess that my question is the following:

Even though the function in transport.js is named readString, it seems to be
used to read byte arrays (at least in the context of reading a TCell from
HBase). Is that right?

Also, is there any other way I could read an Int64 from HBase using the
thrift npm?

Thanks a lot
Jeremie
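
Once the cell value arrives as an intact 8-byte Buffer, it can be turned into a number by hand. The sketch below is not part of the thrift npm; readInt64BE is a hypothetical helper, and it assumes the value was written in big-endian (network) byte order and fits into JavaScript's safe integer range of plus or minus 2^53 - 1.

```javascript
// Hypothetical helper: read a signed 64-bit big-endian integer from a Buffer.
// Assumption: the value is big-endian and fits in the safe integer range.
function readInt64BE(buf, offset = 0) {
  const hi = buf.readInt32BE(offset);      // upper 32 bits, signed
  const lo = buf.readUInt32BE(offset + 4); // lower 32 bits, unsigned
  const value = hi * 0x100000000 + lo;     // hi * 2^32 + lo
  if (!Number.isSafeInteger(value)) {
    throw new RangeError('value does not fit in a JavaScript number');
  }
  return value;
}

// The bytes from the HBase shell: \x00\x00\x00\x00\x00\x01\xA6\x94
const cellValue = Buffer.from([0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0xa6, 0x94]);
console.log(readInt64BE(cellValue)); // 108180 (0x1A694)
```

On Node.js 12 or later, buf.readBigInt64BE(offset) returns the full 64-bit range as a BigInt and avoids the range check.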

