spark-user mailing list archives

From Antony Mayi <antonym...@yahoo.com.INVALID>
Subject Re: custom python converter from HBase Result to tuple
Date Tue, 23 Dec 2014 04:04:29 GMT
I am using HBase 0.98.6.
There is no stack trace, just this short error.
I just noticed that it does fall back to toString, as the message says, because this is what I get back in Python:

hbase_rdd.collect()
[(u'key1', u'List(cf1:12345:14567890, cf2:123:14567896)')]
So the question is: why does it fall back to toString?
Thanks,
Antony.
 

On Monday, 22 December 2014, 20:09, Ted Yu <yuzhihong@gmail.com> wrote:

Which HBase version are you using?
Can you show the full stack trace?
Cheers
On Mon, Dec 22, 2014 at 11:02 AM, Antony Mayi <antonymayi@yahoo.com.invalid> wrote:

Hi,

Can anyone please help me write a custom converter from HBase data to (for example)
tuples of ((family, qualifier, value), ) for pyspark?

I was trying something like this (here producing tuples of ("family:qualifier:value", )):


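import org.apache.hadoop.hbase.CellUtil
import org.apache.hadoop.hbase.client.Result
import org.apache.hadoop.hbase.util.Bytes
import org.apache.spark.api.python.Converter

// Turns each cell of an HBase Result into a "family:qualifier:value" string.
class HBaseResultToTupleConverter extends Converter[Any, List[String]] {
  override def convert(obj: Any): List[String] = {
    val result = obj.asInstanceOf[Result]
    result.rawCells().map(cell => List(Bytes.toString(CellUtil.cloneFamily(cell)),
      Bytes.toString(CellUtil.cloneQualifier(cell)),
      Bytes.toString(CellUtil.cloneValue(cell))).mkString(":")
    ).toList
  }
}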


but then I get an error:

14/12/22 16:27:40 WARN python.SerDeUtil:
Failed to pickle Java object as value: $colon$colon, falling back
to 'toString'. Error: couldn't introspect javabean: java.lang.IllegalArgumentException: wrong
number of arguments


Does anyone have a hint?

Thanks,
Antony.
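
A possible workaround, sketched under assumptions and not confirmed anywhere in this thread: the `$colon$colon` in the warning is the JVM name of Scala's `::` list cell class, which suggests the pickler does not recognize a Scala List as a value. If the converter instead returned one plain String (for example the "family:qualifier:value" entries joined with newlines), the value would arrive in Python as an ordinary string and could be split into tuples there. The helper name `parse_cells` and the newline and colon delimiters are hypothetical, and the sketch assumes that family and qualifier never contain a colon.

```python
def parse_cells(value, cell_sep="\n", field_sep=":"):
    """Split a delimited converter string into (family, qualifier, value) tuples.

    Hypothetical helper: assumes the JVM-side converter emitted one
    "family:qualifier:value" entry per cell, joined with cell_sep.
    """
    cells = []
    for entry in value.split(cell_sep):
        if not entry:
            continue  # skip empty entries, e.g. from a trailing separator
        # maxsplit=2 keeps any further colons inside the value field
        family, qualifier, cell_value = entry.split(field_sep, 2)
        cells.append((family, qualifier, cell_value))
    return tuple(cells)
```

With such a converter in place, something like hbase_rdd.mapValues(parse_cells) would apply the helper to each row's value, leaving the row key untouched.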
