spark-user mailing list archives

From Ashish Shrowty <ashish.shro...@gmail.com>
Subject Re: Spark shell and StackOverFlowError
Date Sun, 30 Aug 2015 15:54:21 GMT
@Sean - Agreed that there is no action, but I still get the
StackOverflowError; it's very weird.
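
For completeness, this is the exact sequence I run in a fresh shell (Spark
1.2.1, local mode); the StackOverflowError comes out of the map call itself,
before any action is invoked:

import scala.collection.mutable.MutableList

val lst = MutableList[(String,String,Double)]()
Range(0,10000).foreach(i => lst += (("10","10",i:Double)))

val a = 10                                      // plain Int, defined in the shell
sc.makeRDD(lst).map(i => if (a == 10) 1 else 0) // throws StackOverflowError here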

@Ted - Variable a is just an Int - val a = 10 ... The error happens when I
try to pass a variable into the closure. Your example above works fine
because no variable defined in the shell is passed into the closure.
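
One more data point, though it may or may not be relevant: the failure seems
tied to MutableList specifically (the toList version in my original message
below works). My guess is that default Java serialization walks the list's
linked nodes recursively, one round of stack frames per element, which would
match the writeObject0/writeOrdinaryObject cycle in the trace. A plain-JVM
sketch of that guess, no Spark involved (untested; whether it actually
overflows will depend on the thread stack size):

import java.io.{ByteArrayOutputStream, ObjectOutputStream}
import scala.collection.mutable.MutableList

val big = MutableList[Int]()
Range(0, 10000).foreach(i => big += i)

val oos = new ObjectOutputStream(new ByteArrayOutputStream())
oos.writeObject(big)  // my guess: recurses node by node -> StackOverflowError

If that is the problem, converting to an immutable List before makeRDD, or
raising the driver stack size, would presumably sidestep it.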

-Ashish

On Sun, Aug 30, 2015 at 9:55 AM Ted Yu <yuzhihong@gmail.com> wrote:

> Using Spark shell :
>
> scala> import scala.collection.mutable.MutableList
> import scala.collection.mutable.MutableList
>
> scala> val lst = MutableList[(String,String,Double)]()
> lst: scala.collection.mutable.MutableList[(String, String, Double)] =
> MutableList()
>
> scala> Range(0,10000).foreach(i=>lst+=(("10","10",i:Double)))
>
> scala> val rdd=sc.makeRDD(lst).map(i=> if(a==10) 1 else 0)
> <console>:27: error: not found: value a
>        val rdd=sc.makeRDD(lst).map(i=> if(a==10) 1 else 0)
>                                           ^
>
> scala> val rdd=sc.makeRDD(lst).map(i=> if(i._1==10) 1 else 0)
> rdd: org.apache.spark.rdd.RDD[Int] = MapPartitionsRDD[1] at map at
> <console>:27
>
> scala> rdd.count()
> ...
> 15/08/30 06:53:40 INFO DAGScheduler: Job 0 finished: count at
> <console>:30, took 0.478350 s
> res1: Long = 10000
>
> Ashish:
> Please refine your example to mimic more closely what your code actually
> did.
>
> Thanks
>
> On Sun, Aug 30, 2015 at 12:24 AM, Sean Owen <sowen@cloudera.com> wrote:
>
>> That can't cause any error, since there is no action in your first
>> snippet. Even calling count on the result doesn't cause an error. You
>> must be executing something different.
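>>
>> To see the laziness concretely, a throwaway example (not your code): the
>> map closure below fails on every element, yet nothing happens until an
>> action runs:
>>
>> val rdd = sc.parallelize(1 to 10).map(_ => sys.error("boom"))  // no error yet: map is lazy
>> rdd.count()  // the closure only runs now, so this is where any failure surfaces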
>>
>> On Sun, Aug 30, 2015 at 4:21 AM, ashrowty <ashish.shrowty@gmail.com>
>> wrote:
>> > I am running the Spark shell (1.2.1) in local mode and I have a simple
>> > RDD[(String,String,Double)] with about 10,000 objects in it. I get a
>> > StackOverflowError each time I try to run the following code (the code
>> > itself is just representative of other logic where I need to pass in a
>> > variable). I tried broadcasting the variable too, but no luck ... I must
>> > be missing something basic here -
>> >
>> > val rdd = sc.makeRDD(List(<Data read from file>))
>> > val a=10
>> > rdd.map(r => if (a==10) 1 else 0)
>> > This throws -
>> >
>> > java.lang.StackOverflowError
>> >     at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:318)
>> >     at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1133)
>> >     at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
>> >     at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
>> >     at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
>> >     at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
>> >     at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
>> >     at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
>> >     at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
>> >     ...
>> >
>> > More experiments ... this works -
>> >
>> > val lst = Range(0,10000).map(i=>("10","10",i:Double)).toList
>> > sc.makeRDD(lst).map(i=> if(a==10) 1 else 0)
>> >
>> > But the following doesn't, and throws the StackOverflowError -
>> >
>> > import scala.collection.mutable.MutableList
>> > val lst = MutableList[(String,String,Double)]()
>> > Range(0,10000).foreach(i=>lst+=(("10","10",i:Double)))
>> > sc.makeRDD(lst).map(i=> if(a==10) 1 else 0)
>> >
>> > Any help appreciated!
>> >
>> > Thanks,
>> > Ashish
>> >
>> >
>> >
