From: Jérémie Pinard Saint-Pierre
To: "user@thrift.apache.org"
Subject: RE: Getting an Int64 from HBase using the Thrift npm
Date: Tue, 18 Mar 2014 20:16:55 +0000
Message-ID: <1B2D8F2EACA7C0429ADBC6BE578E20A81AD7DFDD@YUL01WMXB01.rp.corp>

Hello Jens,

I want to thank you for taking the time to answer me.

The issue that I encountered is actually only present when I generate the nodejs code with the 0.9.1 version of Thrift, in which the code block that you sent me contains:

    if (ftype == Thrift.Type.STRING) {
      this.value = input.readString();
    }

Following your advice, I generated the nodejs code with the trunk version of Thrift, and the code now reads binary data, which I can then interpret as I wish.

I will be using the trunk version of Thrift to generate my bindings from now on.

Thanks again for everything,
Jeremie

________________________________________
From: Jens Geyer [jensgeyer@hotmail.com]
Sent: March 17, 2014 20:51
To: user@thrift.apache.org
Subject: Re: Getting an Int64 from HBase using the Thrift npm

Ok, one more (last for today) try:

> var str = this.inBuf.toString('utf8', ...)

This of course decodes (not encodes) utf-8 encoded data from the buffer.
Using trunk, I generated the nodejs code for hbase.thrift and found no reference to readString(), but readBinary(), which is IMHO absolutely correct here:

    TCell.prototype.read = function(input) {
      input.readStructBegin();
      while (true) {
        var ret = input.readFieldBegin();
        var fname = ret.fname;
        var ftype = ret.ftype;
        var fid = ret.fid;
        if (ftype == Thrift.Type.STOP) {
          break;
        }
        switch (fid) {
          case 1:
            if (ftype == Thrift.Type.STRING) {
              this.value = input.readBinary();
            } else {
              input.skip(ftype);
            }
            break;
          case 2:
            if (ftype == Thrift.Type.I64) {
              this.timestamp = input.readI64();
            } else {
              input.skip(ftype);
            }
            break;
          default:
            input.skip(ftype);
        }
        input.readFieldEnd();
      }
      input.readStructEnd();
      return;
    };

So what Thrift version are you using? Furthermore, could it be related to THRIFT-1679?

Good night,
JensG

-----Original Message-----
From: Jens Geyer
Sent: Tuesday, March 18, 2014 1:14 AM
To: user@thrift.apache.org
Subject: Re: Getting an Int64 from HBase using the Thrift npm

I was probably wrong in what I said. I think the problem is this:

> var str = this.inBuf.toString('utf8', this.readCursor, this.readCursor + len);

You convert the data to UTF-8 here.

> Basically, any time a byte value is higher than \x7F, the
> value I get in javascript is 'ef bf bd'

0xEF is one of the UTF-8 start bytes of a 3-byte sequence:
http://en.wikipedia.org/wiki/UTF-8

JensG

-----Original Message-----
From: Jens Geyer
Sent: Tuesday, March 18, 2014 12:50 AM
To: user@thrift.apache.org
Subject: Re: Getting an Int64 from HBase using the Thrift npm

Sounds much like the problem found in
https://issues.apache.org/jira/browse/THRIFT-2336?focusedCommentId=13889924

Could that be the case?
That reminds me that we still need some fixes there ...
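
The 'ef bf bd' bytes discussed in this thread can be reproduced outside Thrift. EF BF BD is the UTF-8 encoding of U+FFFD, the replacement character that a UTF-8 decoder substitutes for each invalid byte. What follows is only a minimal sketch, assuming a Node.js version that provides Buffer.from (newer than the one in use when this thread was written). The helper name roundTripHex is hypothetical and is not part of the thrift npm. It decodes the cell bytes with a given encoding, encodes the resulting string back, and prints the bytes as hex.

```javascript
// Hypothetical helper: decode raw bytes to a JS string with the given
// encoding, encode the string back, and return the result as hex.
function roundTripHex(bytes, encoding) {
  const str = Buffer.from(bytes).toString(encoding);
  return Buffer.from(str, encoding).toString('hex');
}

// The cell value as shown in the HBase shell:
// \x00\x00\x00\x00\x00\x01\xA6\x94
const cellBytes = [0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0xa6, 0x94];

// 0xA6 and 0x94 are not valid UTF-8 on their own, so each one is decoded
// to U+FFFD, which encodes back as the three bytes ef bf bd.
console.log(roundTripHex(cellBytes, 'utf8'));   // 000000000001efbfbdefbfbd

// 'latin1' (the current name for 'binary') maps every byte to exactly one
// character, so the round trip is lossless.
console.log(roundTripHex(cellBytes, 'latin1')); // 000000000001a694
```

This matches the observation in the thread: the data is damaged at the point where the bytes are decoded as UTF-8, not on the wire.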
-----Original Message-----
From: Jérémie Pinard Saint-Pierre
Sent: Monday, March 17, 2014 10:10 PM
To: user@thrift.apache.org
Subject: Getting an Int64 from HBase using the Thrift npm

Hello,

I am having some issues reading int64 values from an HBase table using the nodejs thrift npm.

In HBase, a TCell is defined as containing two fields: an int64 timestamp and a byte array called value. However, when a TCell is read by the thrift npm in the thrift/lib/thrift/transport.js file, the value is interpreted as a utf8 string and not as a byte array, and some bytes seem to get lost in the process:

    readString: function(len) {
      this.ensureAvailable(len);
      var str = this.inBuf.toString('utf8', this.readCursor, this.readCursor + len);
      this.readCursor += len;
      return str;
    },

For example, when I look at my row in the HBase shell, I see

    value=\x00\x00\x00\x00\x00\x01\xA6\x94

When I fetch it from nodejs, I get

    00 00 00 00 00 01 ef bf bd ef bf bd

Basically, any time a byte value is higher than \x7F, the value I get in javascript is 'ef bf bd'.

Also, in the code snippet from transport.js, if I interpret the data stream as binary instead of utf8, then the value is passed correctly to my code:

    var str = this.inBuf.toString('binary', this.readCursor, this.readCursor + len);

I guess that making this change, however, would imply that the clients of the npm module would need to cast their own strings on a per-column basis.

So I guess that my question is the following:

Even though the name of the function in transport.js is readString, it seems to be used to read byte arrays (at least in the context of reading a TCell from HBase). Is that right?

Also, is there any other way in which I could read an Int64 from HBase using the thrift npm?
Thanks a lot,
Jeremie

***********************************************************************
This e-mail and attachments are confidential, legally privileged, may be subject to copyright and sent solely for the attention of the addressee(s). Any unauthorized use or disclosure is prohibited. Statements and opinions expressed in this e-mail may not represent those of Radialpoint.
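
On the closing question of the original message: once the generated code hands back the cell value as a Buffer rather than a string, the eight bytes can be interpreted directly. The following is a rough sketch under stated assumptions, not the thrift npm's own API: it assumes the value was written as a big-endian signed 64-bit integer (the byte order that the shell output in this thread suggests) and a Node.js version that has Buffer.from and Buffer.prototype.readBigInt64BE, both added well after this thread. The function names cellValueToBigInt and cellValueToNumber are hypothetical.

```javascript
// Hypothetical helper: interpret an 8-byte Buffer as a signed,
// big-endian 64-bit integer and return it as a BigInt.
function cellValueToBigInt(buf) {
  if (buf.length !== 8) {
    throw new RangeError('expected 8 bytes, got ' + buf.length);
  }
  return buf.readBigInt64BE(0);
}

// Hypothetical fallback without BigInt: combine two unsigned 32-bit halves.
// Anything that is not a non-negative safe integer is rejected, because a
// JavaScript number cannot represent it exactly.
function cellValueToNumber(buf) {
  const high = buf.readUInt32BE(0);
  const low = buf.readUInt32BE(4);
  const value = high * 0x100000000 + low;
  if (!Number.isSafeInteger(value)) {
    throw new RangeError('value does not fit in a JavaScript number');
  }
  return value;
}

// The cell value from the thread: \x00\x00\x00\x00\x00\x01\xA6\x94
const value = Buffer.from([0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0xa6, 0x94]);
console.log(cellValueToBigInt(value)); // 108180n
console.log(cellValueToNumber(value)); // 108180
```

The BigInt variant covers the full int64 range. The number variant trades range for compatibility with code that expects plain numbers.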