spark-user mailing list archives

From patcharee <Patcharee.Thong...@uni.no>
Subject Re: hiveContext.sql NullPointerException
Date Sun, 07 Jun 2015 13:48:23 GMT
Hi,

How can I use HiveContext on the executors then? If only the driver can 
see the HiveContext, does that mean I have to collect all the datasets 
(very large) to the driver and use the HiveContext there? That would 
overload the driver's memory and fail.
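
Or could I collect only the distinct partition keys to the driver and 
loop over them there, keeping the rows themselves on the executors? 
Something like this untested sketch (key is the (zone, z, year, month) 
tuple produced by flatKeyFromWrf):

    import hiveContext.implicits._

    val flat = varWHeightRDD
      .flatMap(FlatMapUtilClass().flatKeyFromWrf)
      .cache() // reused once per partition key

    // Small driver-side collect: one 4-tuple per Hive partition, not the data.
    val keys = flat.keys.distinct().collect()

    for (key @ (zone, z, year, month) <- keys) {
      // Filtering runs on the executors; only the SQL call runs on the driver.
      val df = flat.filter(_._1 == key).values.toDF()
      df.registerTempTable("table_4Dim")
      hiveContext.sql("INSERT OVERWRITE table 4dim partition " +
        "(zone=" + zone + ",z=" + z + ",year=" + year + ",month=" + month + ") " +
        "select date, hh, x, y, height, u, v, w, ph, phb, t, p, " +
        "pb, qvapor, qgraup, qnice, qnrain, tke_pbl, el_pbl from table_4Dim")
    }

I guess this launches one Spark job per key, though.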

BR,
Patcharee

On 07 June 2015 11:51, Cheng Lian wrote:
> Hi,
>
> This is expected behavior. HiveContext.sql (and also 
> DataFrame.registerTempTable) is only expected to be invoked on the 
> driver side. However, the closure passed to RDD.foreach is executed on 
> the executor side, where no viable HiveContext instance exists.
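>
> One way to avoid this is a single dynamic-partition INSERT issued from 
> the driver, so no SQL runs inside a closure. A rough, untested sketch 
> (it assumes your row class also carries zone, z, year and month as 
> columns, and that dynamic partitioning is allowed in your Hive setup):
>
>     import hiveContext.implicits._
>
>     hiveContext.sql("SET hive.exec.dynamic.partition = true")
>     hiveContext.sql("SET hive.exec.dynamic.partition.mode = nonstrict")
>
>     // Build one DataFrame for the whole dataset on the driver side;
>     // the write itself still runs distributed on the executors.
>     val df = varWHeightRDD
>       .flatMap(FlatMapUtilClass().flatKeyFromWrf)
>       .values
>       .toDF()
>     df.registerTempTable("table_4Dim")
>
>     // With dynamic partitioning the partition columns go last in the SELECT.
>     hiveContext.sql(
>       "INSERT OVERWRITE TABLE 4dim PARTITION (zone, z, year, month) " +
>         "SELECT date, hh, x, y, height, u, v, w, ph, phb, t, p, " +
>         "pb, qvapor, qgraup, qnice, qnrain, tke_pbl, el_pbl, " +
>         "zone, z, year, month FROM table_4Dim")
>
> This way groupByKey isn't needed at all; Hive routes each row to its 
> partition.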
>
> Cheng
>
> On 6/7/15 10:06 AM, patcharee wrote:
>> Hi,
>>
>> I am trying to insert data into a partitioned Hive table. The 
>> groupByKey combines the dataset into groups, one per partition of the 
>> Hive table. After the groupByKey, I convert each Iterable[X] to a 
>> DataFrame with X.toList.toDF(). But hiveContext.sql throws a 
>> NullPointerException, see below. Any suggestions? What could be 
>> wrong? Thanks!
>>
>> val varWHeightFlatRDD = varWHeightRDD
>>   .flatMap(FlatMapUtilClass().flatKeyFromWrf)
>>   .groupByKey()
>>   .foreach { x =>
>>     val zone  = x._1._1
>>     val z     = x._1._2
>>     val year  = x._1._3
>>     val month = x._1._4
>>     val df_table_4dim = x._2.toList.toDF()
>>     df_table_4dim.registerTempTable("table_4Dim")
>>     hiveContext.sql("INSERT OVERWRITE table 4dim partition (zone=" +
>>       zone + ",z=" + z + ",year=" + year + ",month=" + month + ") " +
>>       "select date, hh, x, y, height, u, v, w, ph, phb, t, p, " +
>>       "pb, qvapor, qgraup, qnice, qnrain, tke_pbl, el_pbl from table_4Dim")
>>   }
>>
>>
>> java.lang.NullPointerException
>>     at org.apache.spark.sql.hive.HiveContext.sql(HiveContext.scala:100)
>>     at no.uni.computing.etl.LoadWrfIntoHiveOptReduce1$$anonfun$7.apply(LoadWrfIntoHiveOptReduce1.scala:113)
>>     at no.uni.computing.etl.LoadWrfIntoHiveOptReduce1$$anonfun$7.apply(LoadWrfIntoHiveOptReduce1.scala:103)
>>     at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>>     at org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28)
>>     at org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:798)
>>     at org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:798)
>>     at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1511)
>>     at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1511)
>>     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
>>     at org.apache.spark.scheduler.Task.run(Task.scala:64)
>>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>     at java.lang.Thread.run(Thread.java:744)
>>
>


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org

