spark-user mailing list archives

From Castberg, René Christian <Rene.Castb...@dnvgl.com>
Subject SV: Pyspark Hbase scan.
Date Fri, 13 Mar 2015 06:14:29 GMT
Sorry, I forgot to attach the traceback.


Regards


Rene Castberg

________________________________
From: Castberg, René Christian
Sent: 13 March 2015 07:13
To: user@spark.apache.org
Cc: gen tang
Subject: SV: Pyspark Hbase scan.


Hi,


I have now successfully managed to test this in a local Spark session.

But I am having a huge problem getting this to work with the Hortonworks technical preview.
I think that there is an incompatibility with the way YARN has been compiled.


After changing the HBase version and adding:

resolvers += "Hortonworks Releases" at "http://repo.hortonworks.com/content/repositories/releases/"


I get the attached traceback.


Any help on how to compile this jar so that it works would be greatly appreciated.


Regards


Rene Castberg


________________________________
From: gen tang <gen.tang86@gmail.com>
Sent: 5 February 2015 11:38
To: Castberg, René Christian
Cc: user@spark.apache.org
Subject: Re: Pyspark Hbase scan.

Hi,

In fact, the pull request https://github.com/apache/spark/pull/3920 is meant to add HBase scan
support. However, it has not been merged yet.
You can also take a look at the example code at http://spark-packages.org/package/20, which
uses Scala and Python to read data from HBase.
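
For readers of the archive, a minimal sketch of how a row-range scan might be
expressed from PySpark follows. It is not taken from the pull request or the
package mentioned above. It assumes that the HBase TableInputFormat on the
classpath honors the "hbase.mapreduce.scan.row.start" and
"hbase.mapreduce.scan.row.stop" configuration keys (this depends on the HBase
version), and that the converter classes from the Spark examples jar are
available to the driver and executors. The helper names hbase_scan_conf and
scan_hbase, and the quorum, table, and row key values, are hypothetical.

```python
# Sketch only: helper names and sample values are illustrative assumptions.

def hbase_scan_conf(zk_quorum, table, start_row=None, stop_row=None):
    """Build the Hadoop configuration dict passed to TableInputFormat."""
    conf = {
        "hbase.zookeeper.quorum": zk_quorum,
        "hbase.mapreduce.inputtable": table,
    }
    # Assumption: these keys are read by the TableInputFormat in use.
    # Without them the whole table is read.
    if start_row is not None:
        conf["hbase.mapreduce.scan.row.start"] = start_row
    if stop_row is not None:
        conf["hbase.mapreduce.scan.row.stop"] = stop_row
    return conf


def scan_hbase(sc, conf):
    """Read the configured row range into an RDD of (row key, value) strings.

    `sc` is an existing SparkContext. The converters below ship with the
    Spark examples jar, which must be on the classpath.
    """
    key_conv = ("org.apache.spark.examples.pythonconverters."
                "ImmutableBytesWritableToStringConverter")
    value_conv = ("org.apache.spark.examples.pythonconverters."
                  "HBaseResultToStringConverter")
    return sc.newAPIHadoopRDD(
        "org.apache.hadoop.hbase.mapreduce.TableInputFormat",
        "org.apache.hadoop.hbase.io.ImmutableBytesWritable",
        "org.apache.hadoop.hbase.client.Result",
        keyConverter=key_conv,
        valueConverter=value_conv,
        conf=conf,
    )
```

With a running cluster, usage would look something like
rdd = scan_hbase(sc, hbase_scan_conf("zk-host", "mytable", "row100", "row200")).
In HBase scans the stop row is normally exclusive.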

Hope this can be helpful.

Cheers
Gen



On Thu, Feb 5, 2015 at 11:11 AM, Castberg, René Christian <Rene.Castberg@dnvgl.com>
wrote:
Hi,

I am trying to do an HBase scan and read the result into a Spark RDD using PySpark. I have
successfully written data to HBase from PySpark, and have been able to read a full table from
HBase using the Python example code. Unfortunately, I am unable to find any example code for
doing an HBase scan and reading the result into a Spark RDD from PySpark.

I have found a Scala example:
http://stackoverflow.com/questions/25189527/how-to-process-a-range-of-hbase-rows-using-spark

But I can't find anything on how to do this from Python. Can anybody shed some light on how
(and if) this can be done?

Regards

Rene Castberg


**************************************************************************************
This e-mail and any attachments thereto may contain confidential information and/or information
protected by intellectual property rights for the exclusive attention of the intended addressees
named above. If you have received this transmission in error, please immediately notify the
sender by return e-mail and delete this message and its attachments. Unauthorized use, copying
or further full or partial distribution of this e-mail or its contents is prohibited.
**************************************************************************************


