spark-user mailing list archives

From "Ian O'Connell" <...@ianoconnell.com>
Subject Re: Spark variable init problem
Date Wed, 07 Aug 2013 15:30:05 GMT
Do you have any functions inside the object?

following a code layout like...
https://github.com/mesos/spark/blob/master/examples/src/main/scala/spark/examples/SparkLR.scala?
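For reference, here is a minimal sketch of that layout (the names are illustrative, not from this thread). With `extends App`, Scala 2 defers the object's body to DelayedInit, so fields may still hold their default values when a worker touches the object; an explicit main method with local vals avoids that, because locals are captured into the closure and serialized with it.

```scala
// Minimal demonstration of the pitfall -- no Spark required.
// With `extends App`, the object's body (including the field
// initializations) is deferred until main() is invoked.
object WithApp extends App {
  val tab = "\t"
}

// SparkLR-style layout: a plain main method, values as locals.
// Locals are serialized along with any closure that captures them,
// so they reach the workers already initialized.
object MyJob {
  def main(args: Array[String]): Unit = {
    val tab = "\t"
    val support = 2
    // val sc = new SparkContext(...)   // Spark calls elided
    println(tab.length)
    println(support)
  }
}

object Check {
  def main(args: Array[String]): Unit = {
    // WithApp.main was never run, so its delayed body -- and the
    // assignment to `tab` -- has not executed (Scala 2 semantics):
    println(WithApp.tab == null)
  }
}
```

(This mirrors the layout of the SparkLR example linked above.)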


On Wed, Aug 7, 2013 at 8:17 AM, Han JU <ju.han.felix@gmail.com> wrote:

> Thanks first.
>
> It's a scala Object extending App.
>
>
> 2013/8/7 Ian O'Connell <ian@ianoconnell.com>
>
>> Is your code perhaps part of an object? The closure cleaner doesn't
>> attempt to pull in parts of objects.
>>
>> What does the code around your sketched section look like?
>>
>>
>> On Wed, Aug 7, 2013 at 6:47 AM, Han JU <ju.han.felix@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I'm just putting my hands on Spark and I wrote a simple job in scala.
>>> It sketches like:
>>>
>>> val TAB = "\t"
>>>
>>> val support = 2
>>>
>>> val sc = new SparkContext(...)
>>>
>>> val raw = sc.textFile(...)
>>>
>>> val filtered = raw.map(
>>>
>>>   line => {
>>>
>>>     val lineSplit = line.split(TAB) // TAB is null and an exception is thrown during the run
>>>     ...
>>>
>>>   }).filter( p => p._2 >= support) // support here is 0 during the run
>>>
>>> ...
>>>
>>> When I run the sbt-assembly jar with "java -cp ..." on a standalone
>>> cluster, I find that when referenced inside the RDD transformation, the
>>> two values, TAB and support, are set to their default values. So TAB is
>>> null and support is 0, no longer "\t" and 2 as they were initialized above.
>>>
>>> If the same jar is run locally (MASTER set to local or local[k] instead
>>> of spark://...) on the same input, it runs perfectly. The code also runs
>>> well in spark-shell on the cluster.
>>>
>>> For the jar to run correctly on the cluster, I have to hard-code the
>>> string literal and the number in the RDD transformation.
>>>
>>> This really looks like a weird bug to me; maybe it has something to do
>>> with the sbt-assembly jar compilation? Any suggestions?
>>>
>>> Thanks.
>>>
>>> I'm using Spark 0.7.3 and Scala 2.9.3.
>>>
>>> --
>>> *JU Han*
>>>
>>> Software Engineer Intern @ KXEN Inc.
>>> UTC   -  Université de Technologie de Compiègne
>>> *     **GI06 - Fouille de Données et Décisionnel*
>>>
>>> +33 0619608888
>>>
>>
>>
>
