thrift-user mailing list archives

From "Jens Geyer" <jensge...@hotmail.com>
Subject Re: RE : Getting an Int64 from HBase using the Thrift npm
Date Tue, 18 Mar 2014 21:22:26 GMT
Glad I could help.
In the meantime I found the related ticket:

https://issues.apache.org/jira/browse/THRIFT-1351



-----Original Message-----
From: Jérémie Pinard Saint-Pierre
Sent: Tuesday, March 18, 2014 9:16 PM
To: user@thrift.apache.org
Subject: RE : Getting an Int64 from HBase using the Thrift npm

Hello Jens,

I want to thank you for taking the time to answer me.

The issue I encountered only occurs when I generate the Node.js code with
version 0.9.1 of Thrift. There, the code block that you sent me instead
contains:

if (ftype == Thrift.Type.STRING) {
        this.value = input.readString();
}

Following your advice, I generated the nodejs code for the trunk version of 
thrift and the code now reads binary data, which I can then interpret as I 
wish.
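
As a rough illustration of interpreting that binary data (a sketch, not code from the Thrift library): assuming the cell value arrives either as a Node.js Buffer or as a 'binary' string, and that it holds the 8 big-endian bytes HBase stores for a long, it could be turned into an integer as shown here. The helper name cellValueToInt64 is made up for this example, and readBigInt64BE requires a Node.js version with BigInt support.

```javascript
// Hypothetical helper (not part of the thrift npm): decode an 8-byte
// big-endian cell value into a BigInt.
function cellValueToInt64(value) {
  // Assumption: `value` is a Buffer, or a string in which each character
  // carries exactly one byte ('binary' / latin1 encoding).
  const buf = Buffer.isBuffer(value) ? value : Buffer.from(value, 'latin1');
  if (buf.length !== 8) {
    throw new RangeError('expected 8 bytes, got ' + buf.length);
  }
  return buf.readBigInt64BE(0);
}

// Sample bytes taken from the HBase shell output quoted in this thread.
const sample = Buffer.from([0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0xa6, 0x94]);
console.log(cellValueToInt64(sample)); // 108180n
```

Returning a BigInt avoids the precision loss a plain JavaScript number would have for values above 2^53.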

I will be using the trunk version of thrift to generate my bindings from now 
on.

Thanks again for everything
Jeremie
________________________________________
From: Jens Geyer [jensgeyer@hotmail.com]
Sent: March 17, 2014 20:51
To: user@thrift.apache.org
Subject: Re: Getting an Int64 from HBase using the Thrift npm

Ok, one more (last for today) try:

>  var str = this.inBuf.toString('utf8', ...)

This of course decodes (not encodes) utf-8 encoded data from the buffer.

Using trunk, I generated the nodejs code for hbase.thrift and found no
reference to readString(), only readBinary(), which is IMHO absolutely
correct here:

TCell.prototype.read = function(input) {
  input.readStructBegin();
  while (true)
  {
    var ret = input.readFieldBegin();
    var fname = ret.fname;
    var ftype = ret.ftype;
    var fid = ret.fid;
    if (ftype == Thrift.Type.STOP) {
      break;
    }
    switch (fid)
    {
      case 1:
      if (ftype == Thrift.Type.STRING) {
        this.value = input.readBinary();
      } else {
        input.skip(ftype);
      }
      break;
      case 2:
      if (ftype == Thrift.Type.I64) {
        this.timestamp = input.readI64();
      } else {
        input.skip(ftype);
      }
      break;
      default:
        input.skip(ftype);
    }
    input.readFieldEnd();
  }
  input.readStructEnd();
  return;
};

So what Thrift version are you using?

Furthermore, could it be related to THRIFT-1679?

Good night,
JensG


-----Original Message-----
From: Jens Geyer
Sent: Tuesday, March 18, 2014 1:14 AM
To: user@thrift.apache.org
Subject: Re: Getting an Int64 from HBase using the Thrift npm

I was probably wrong in what I said. I think the problem is this:

>  var str = this.inBuf.toString('utf8', this.readCursor, this.readCursor +
> len);

You convert the data to UTF-8 here.

> Basically, anytime a 2-byte hexadecimal value is higher than \x7F, the
> value I get in javascript is 'ef bf bd'

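0xEF is one of the UTF-8 lead bytes that start a 3-byte sequence, and EF BF BD
is the UTF-8 encoding of U+FFFD, the replacement character that decoders emit
for invalid input:
http://en.wikipedia.org/wiki/UTF-8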
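
To see this in isolation, here is a small sketch that uses only the Node.js Buffer API and does not touch the Thrift library. It assumes the eight bytes from the HBase shell output quoted in this thread. Bytes above 0x7F that do not form a valid UTF-8 sequence are replaced with U+FFFD on decoding, and that character re-encodes as ef bf bd.

```javascript
// The raw cell value as stored in HBase (from the shell output in this thread).
const raw = Buffer.from([0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0xa6, 0x94]);

// Decoding as UTF-8 is lossy: 0xA6 and 0x94 are stray continuation bytes,
// so each one is replaced by U+FFFD.
const decoded = raw.toString('utf8');
console.log(decoded.charCodeAt(6).toString(16)); // fffd

// Re-encoding the string shows the bytes the client ends up seeing.
const reencoded = Buffer.from(decoded, 'utf8');
console.log(reencoded.toString('hex')); // 000000000001efbfbdefbfbd
```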

JensG




-----Original Message-----
From: Jens Geyer
Sent: Tuesday, March 18, 2014 12:50 AM
To: user@thrift.apache.org
Subject: Re: Getting an Int64 from HBase using the Thrift npm

Sounds much like the problem found in
https://issues.apache.org/jira/browse/THRIFT-2336?focusedCommentId=13889924

Could that be the case?

That reminds me that we still need some fixes there ...



-----Original Message-----
From: Jérémie Pinard Saint-Pierre
Sent: Monday, March 17, 2014 10:10 PM
To: user@thrift.apache.org
Subject: Getting an Int64 from HBase using the Thrift npm

Hello,

I am having some issues reading int64 values from an HBase table through
Thrift, using the thrift npm module for nodejs.

In HBase, a TCell is defined as containing two fields: an int64 timestamp
and a byte array called value.

However, when a TCell is read by the thrift npm in the
thrift/lib/thrift/transport.js file, it is interpreted as a utf8 string and
not a byte array and some values seem to get lost in the process:

readString: function(len) {
  this.ensureAvailable(len)
  var str = this.inBuf.toString('utf8', this.readCursor, this.readCursor + len);
  this.readCursor += len;
  return str;
},

For example, when I look at my row in the HBase shell, I see

value=\x00\x00\x00\x00\x00\x01\xA6\x94

When I fetch it from nodejs, I get 00 00 00 00 00 01 ef bf bd ef bf bd

Basically, any time a byte (two hex digits) is higher than \x7F, the value
I get in javascript is 'ef bf bd'

Also, in the code snippet from transport.js, if I interpret the data stream
as binary instead of utf8, then the value is passed correctly to my code:

var str = this.inBuf.toString('binary', this.readCursor, this.readCursor + len);
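
A minimal sketch of why the 'binary' encoding avoids the loss, using only the Node.js Buffer API and the sample bytes from the HBase shell output: 'binary' (an alias for 'latin1' in current Node.js) maps every byte to exactly one character, so converting to a string and back should return the original bytes. The variable names are illustrative only.

```javascript
// Sample cell value from the HBase shell output in this message.
const raw = Buffer.from([0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0xa6, 0x94]);

// One character per byte: nothing is replaced or merged.
const asBinary = raw.toString('binary');
console.log(asBinary.length); // 8

// Converting back with the same encoding restores the original bytes.
const back = Buffer.from(asBinary, 'binary');
console.log(back.equals(raw)); // true
```

The string produced this way is only a byte container; it still has to be converted back to a Buffer before the bytes are interpreted as a number.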

I guess that making this change, however, would imply that the clients of
the npm module would need to cast their own strings on a per-column basis.

So I guess that my question is the following:

Even though the function in transport.js is named readString, it seems to be
used to read byte arrays (at least in the context of reading a TCell from
HBase). Is that right?

Also, is there another way to read an Int64 from HBase using the thrift npm?

Thanks a lot
Jeremie
*********************************************************************** This
e-mail and attachments are confidential, legally privileged, may be subject
to copyright and sent solely for the attention of the addressee(s). Any
unauthorized use or disclosure is prohibited. Statements and opinions
expressed in this e-mail may not represent those of Radialpoint.
