Do you think I should create a JIRA?


On Sun, Aug 30, 2015 at 12:56 PM Ted Yu <yuzhihong@gmail.com> wrote:
I got a StackOverflowError as well :-(

On Sun, Aug 30, 2015 at 9:47 AM, Ashish Shrowty <ashish.shrowty@gmail.com> wrote:
Yep .. I tried that too earlier. Doesn't make a difference. Are you able to replicate on your side?


On Sun, Aug 30, 2015 at 12:08 PM Ted Yu <yuzhihong@gmail.com> wrote:
I see.

What about using the following in place of variable a?

Cheers

On Sun, Aug 30, 2015 at 8:54 AM, Ashish Shrowty <ashish.shrowty@gmail.com> wrote:
@Sean - Agreed that there is no action, but I still get the StackOverflowError; it's very weird

@Ted - Variable a is just an Int - val a = 10 ... The error happens when I try to pass a variable into the closure. The example you have above works fine since no variable is being passed into the closure from the shell.

-Ashish
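
For anyone hitting this later, a plausible explanation (hedged, as I have not traced the 1.2.1 source) is as follows. The Scala REPL compiles each input line into a wrapper object; a closure that references a shell val like a captures that wrapper, and the wrapper chain can pull in values from earlier lines as well, including lst. scala.collection.mutable.MutableList is a node-per-element linked structure with no custom serialization logic, so default Java serialization recurses once per node, and at roughly 10,000 nodes that recursion can exceed the default thread stack. A standalone sketch of the suspected mechanism (no Spark needed; whether it actually overflows depends on your JVM's -Xss):

import java.io.{ByteArrayOutputStream, ObjectOutputStream}
import scala.collection.mutable.MutableList

// Plain Java serialization, the same mechanism Spark's default
// JavaSerializer uses when it ships a closure.
def javaSerialize(o: AnyRef): Unit = {
  val oos = new ObjectOutputStream(new ByteArrayOutputStream())
  oos.writeObject(o)
  oos.close()
}

// scala.collection.immutable.List has custom iterative serialization,
// so this succeeds:
javaSerialize(List.fill(10000)(("10", "10", 1.0)))

// MutableList is a chain of linked nodes serialized with the default
// recursive logic, one run of stack frames per node. With enough
// elements this throws java.lang.StackOverflowError:
val lst = MutableList[(String, String, Double)]()
Range(0, 10000).foreach(i => lst += (("10", "10", i: Double)))
javaSerialize(lst)

If that is the cause, building the data as an immutable collection (or calling lst.toList before sc.makeRDD) avoids the deep recursion, which would also explain why the Range(0,10000).map(...).toList variant works.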

On Sun, Aug 30, 2015 at 9:55 AM Ted Yu <yuzhihong@gmail.com> wrote:
Using the Spark shell:

scala> import scala.collection.mutable.MutableList
import scala.collection.mutable.MutableList

scala> val lst = MutableList[(String,String,Double)]()
lst: scala.collection.mutable.MutableList[(String, String, Double)] = MutableList()

scala> Range(0,10000).foreach(i=>lst+=(("10","10",i:Double)))

scala> val rdd=sc.makeRDD(lst).map(i=> if(a==10) 1 else 0)
<console>:27: error: not found: value a
       val rdd=sc.makeRDD(lst).map(i=> if(a==10) 1 else 0)
                                          ^

scala> val rdd=sc.makeRDD(lst).map(i=> if(i._1==10) 1 else 0)
rdd: org.apache.spark.rdd.RDD[Int] = MapPartitionsRDD[1] at map at <console>:27

scala> rdd.count()
...
15/08/30 06:53:40 INFO DAGScheduler: Job 0 finished: count at <console>:30, took 0.478350 s
res1: Long = 10000

Ashish:
Please refine your example to mimic more closely what your code actually did.

Thanks
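
An aside on the transcript above, offered as a reading rather than a confirmed diagnosis: the i._1 variant serializes cleanly because that closure has no free variables, so none of the shell session's state rides along with it. Note also that i._1 is a String, so i._1 == 10 compiles with a warning but is always false; count() returns 10000 either way because it ignores the mapped values. The presumably intended comparison would be:

// No free variables: only the parameter i is referenced, so nothing from
// the shell session is captured or serialized with this closure.
val rdd2 = sc.makeRDD(lst).map(i => if (i._1 == "10") 1 else 0)
rdd2.count()  // still 10000; count() ignores the mapped values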

On Sun, Aug 30, 2015 at 12:24 AM, Sean Owen <sowen@cloudera.com> wrote:
That can't cause any error, since there is no action in your first
snippet. Even calling count on the result doesn't cause an error. You
must be executing something different.
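
Sean's point holds for execution: map is a lazy transformation, and no job runs until an action such as count or collect. One hedged note on why an error can still surface without an action: in Spark 1.x (if memory serves, including 1.2.x), RDD.map passes the closure through SparkContext.clean, and the ClosureCleaner eagerly test-serializes the cleaned closure, so a closure that drags in non-serializable or pathologically deep state can fail at the map call itself. A quick illustration of the laziness boundary:

val nums = sc.parallelize(1 to 10)
val doubled = nums.map(_ * 2) // lazy: only builds lineage, but the closure
                              // has already been cleaned (and, in 1.x,
                              // test-serialized) by this point
doubled.count()               // first action: a job is actually scheduled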

On Sun, Aug 30, 2015 at 4:21 AM, ashrowty <ashish.shrowty@gmail.com> wrote:
> I am running the Spark shell (1.2.1) in local mode and I have a simple
> RDD[(String,String,Double)] with about 10,000 objects in it. I get a
> StackOverflowError each time I try to run the following code (the code
> itself is just representative of other logic where I need to pass in a
> variable). I tried broadcasting the variable too, but no luck .. I must be
> missing something basic here -
>
> val rdd = sc.makeRDD(List(<Data read from file>))
> val a=10
> rdd.map(r => if (a==10) 1 else 0)
> This throws -
>
> java.lang.StackOverflowError
>     at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:318)
>     at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1133)
>     at
> java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
>     at
> java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
>     at
> java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
>     at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
>     at
> java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
>     at
> java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
>     at
> java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
> ...
> ...
>
> More experiments .. this works -
>
> val lst = Range(0,10000).map(i=>("10","10",i:Double)).toList
> sc.makeRDD(lst).map(i=> if(a==10) 1 else 0)
>
> But the below doesn't, and throws the StackOverflowError -
>
> val lst = MutableList[(String,String,Double)]()
> Range(0,10000).foreach(i=>lst+=(("10","10",i:Double)))
> sc.makeRDD(lst).map(i=> if(a==10) 1 else 0)
>
> Any help appreciated!
>
> Thanks,
> Ashish
>
>
>
> --
> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-shell-and-StackOverFlowError-tp24508.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org