spark-user mailing list archives

From Sean Owen <so...@cloudera.com>
Subject Re: Spark Streaming into HBase
Date Wed, 03 Sep 2014 21:42:03 GMT
This doesn't seem to have anything to do with HBase per se. Some function
is capturing the StreamingContext in its closure, and that won't work
because StreamingContext is not serializable. Is this exactly the code you
ran? It doesn't reference a StreamingContext inside any function, so is
there perhaps a different version in reality that tries to use the
StreamingContext inside a function?
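
For what it's worth, here is a rough, untested sketch of one way to keep the StreamingContext out of the closure. It assumes the HBase 0.98 client API and reuses the table name, column family, host and port from the code quoted below. The names HBaseWriter and writePartition are hypothetical, made up for this example. Marking ssc as @transient is an assumption about what helps in spark-shell, where the REPL wraps each line in an object that may otherwise get pulled into the closure. I can't promise it is sufficient for your setup.

```scala
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.{HTable, Put}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Hypothetical helper (not from the original post). A standalone object
// holds no reference to the StreamingContext, so a closure that only calls
// into it has nothing non-serializable to capture.
object HBaseWriter extends Serializable {
  // Opens one HTable per partition instead of one per record, and closes it
  // when the partition has been written.
  def writePartition(rows: Iterator[Array[String]]): Unit = {
    val hConf = HBaseConfiguration.create()
    val hTable = new HTable(hConf, "table")
    try {
      rows.foreach { row =>
        val put = new Put(Bytes.toBytes(row(0)))
        put.add(Bytes.toBytes("cf"), Bytes.toBytes(row(0)), Bytes.toBytes(row(0)))
        hTable.put(put)
      }
    } finally {
      hTable.close()
    }
  }
}

// Assumes spark-shell, where sc already exists. @transient is an assumed
// workaround: it asks serialization to skip this field if the enclosing
// REPL wrapper ends up in a closure.
@transient val ssc = new StreamingContext(sc, Seconds(1))
val lines = ssc.socketTextStream("localhost", 9999, StorageLevel.MEMORY_AND_DISK_SER)
val words = lines.map(_.split(","))
words.foreachRDD(rdd => rdd.foreachPartition(HBaseWriter.writePartition _))
ssc.start()
```

The point of foreachPartition here is only to avoid building a new HTable for every record. The serialization question is separate, and depends on what the function passed to foreach or foreachPartition ends up referencing.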

On Wed, Sep 3, 2014 at 10:36 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> Adding back user@
>
> I am not familiar with the NotSerializableException. Can you show the full
> stack trace ?
>
> See SPARK-1297 for the changes you need to make so that Spark works with
> HBase 0.98.
>
> Cheers
>
>
> On Wed, Sep 3, 2014 at 2:33 PM, Kevin Peng <kpeng1@gmail.com> wrote:
>>
>> Ted,
>>
>> The hbase-site.xml is in the classpath (had worse issues before... until I
>> figured that it wasn't in the path).
>>
>> I get the following error in the spark-shell:
>> org.apache.spark.SparkException: Job aborted due to stage failure: Task
>> not serializable: java.io.NotSerializableException:
>> org.apache.spark.streaming.StreamingContext
>>         at
>> org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.sc
>> ...
>>
>> I also double checked the hbase table, just in case, and nothing new is
>> written in there.
>>
>> I am using HBase version 0.98.1-cdh5.1.0, the default one with the
>> CDH 5.1.0 distro.
>>
>> Thank you for the help.
>>
>>
>> On Wed, Sep 3, 2014 at 2:09 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>>>
>>> Is hbase-site.xml in the classpath?
>>> Do you observe any exception from the code below or in the region server
>>> log?
>>>
>>> Which hbase release are you using ?
>>>
>>>
>>> On Wed, Sep 3, 2014 at 2:05 PM, kpeng1 <kpeng1@gmail.com> wrote:
>>>>
>>>> I have been trying to understand how spark streaming and hbase connect, but
>>>> have not been successful. What I am trying to do is given a spark stream,
>>>> process that stream and store the results in an hbase table. So far this is
>>>> what I have:
>>>>
>>>> import org.apache.spark.SparkConf
>>>> import org.apache.spark.streaming.{Seconds, StreamingContext}
>>>> import org.apache.spark.streaming.StreamingContext._
>>>> import org.apache.spark.storage.StorageLevel
>>>> import org.apache.hadoop.hbase.HBaseConfiguration
>>>> import org.apache.hadoop.hbase.client.{HBaseAdmin,HTable,Put,Get}
>>>> import org.apache.hadoop.hbase.util.Bytes
>>>>
>>>> def blah(row: Array[String]) {
>>>>   val hConf = new HBaseConfiguration()
>>>>   val hTable = new HTable(hConf, "table")
>>>>   val thePut = new Put(Bytes.toBytes(row(0)))
>>>>   thePut.add(Bytes.toBytes("cf"), Bytes.toBytes(row(0)), Bytes.toBytes(row(0)))
>>>>   hTable.put(thePut)
>>>> }
>>>>
>>>> val ssc = new StreamingContext(sc, Seconds(1))
>>>> val lines = ssc.socketTextStream("localhost", 9999, StorageLevel.MEMORY_AND_DISK_SER)
>>>> val words = lines.map(_.split(","))
>>>> val store = words.foreachRDD(rdd => rdd.foreach(blah))
>>>> ssc.start()
>>>>
>>>> I am currently running the above code in spark-shell. I am not sure what I
>>>> am doing wrong.
>>>>
>>>>
>>>>
>>>> --
>>>> View this message in context:
>>>> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-into-HBase-tp13378.html
>>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
>>>> For additional commands, e-mail: user-help@spark.apache.org
>>>>
>>>
>>
>
