spark-user mailing list archives

From Han JU <ju.han.fe...@gmail.com>
Subject Re: Spark variable init problem
Date Wed, 07 Aug 2013 15:17:07 GMT
Thanks for the quick reply.

The code lives in a Scala object that extends App.
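
For anyone hitting the same symptom, here is a minimal sketch of one likely cause, assuming Scala 2.x. scala.App is built on DelayedInit, so the vals in the object body are only assigned when main runs. A JVM that loads the object without running its main (presumably what happens on a Spark executor when a closure refers to those fields) sees the default values instead. The names JobAsApp, JobWithMain, Demo and keep are made up for illustration, and no Spark is involved; the second object shows the usual workaround of defining an explicit main method.

```scala
// Hypothetical stand-in for the job from the thread: fields declared in an
// object that extends App. On Scala 2.x the body runs through DelayedInit,
// that is, only when JobAsApp.main is invoked.
object JobAsApp extends App {
  val TAB = "\t"
  val support = 2
}

// Hypothetical alternative: a plain object with an explicit main method.
// Its fields are initialized when the object is first accessed.
object JobWithMain {
  val TAB = "\t"
  val support = 2

  // Same predicate as the filter in the thread: keep pairs whose count
  // reaches the support threshold.
  def keep(p: (String, Int)): Boolean = p._2 >= support

  def main(args: Array[String]): Unit = {
    val lines = Seq("a\t3", "b\t1")
    val pairs = lines.map { line =>
      val lineSplit = line.split(TAB)
      (lineSplit(0), lineSplit(1).toInt)
    }
    println(pairs.filter(keep))
  }
}

// Reads the fields of both objects without ever calling JobAsApp.main,
// which mimics a JVM that only loads the class.
object Demo {
  def main(args: Array[String]): Unit = {
    println("App TAB is null: " + (JobAsApp.TAB == null))
    println("App support: " + JobAsApp.support)
    println("main TAB is tab: " + (JobWithMain.TAB == "\t"))
    println("main support: " + JobWithMain.support)
  }
}
```

Running Demo on Scala 2.x should show the App-based fields still at their defaults (null and 0), while the fields of the object with an explicit main hold "\t" and 2. Copying the values into local vals inside main before using them in a closure is another option along the same lines.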


2013/8/7 Ian O'Connell <ian@ianoconnell.com>

> Is your code perhaps part of an object? The closure cleaner doesn't
> attempt to pull in parts of objects.
>
> What does the code around your sketched section look like?
>
>
> On Wed, Aug 7, 2013 at 6:47 AM, Han JU <ju.han.felix@gmail.com> wrote:
>
>> Hi,
>>
>> I'm just getting started with Spark and wrote a simple job in Scala.
>> Here is a sketch of it:
>>
>> val TAB = "\t"
>>
>> val support = 2
>>
>> val sc = new SparkContext(...)
>>
>> val raw = sc.textFile(...)
>>
>> val filtered = raw.map(
>>
>>   line => {
>>
>>     val lineSplit = line.split(TAB) // TAB is null and an exception is thrown during the run
>>     ...
>>
>>   }).filter( p => p._2 >= support) // support here is 0 during the run
>>
>> ...
>>
>> I run the sbt-assembly jar with "java -cp ..." on a standalone cluster. I
>> found that when they are referenced in the RDD transformations, the two
>> values, TAB and support, hold their default values: TAB is null and support
>> is 0, rather than "\t" and 2 as initialized above.
>>
>> If the same jar is run locally (MASTER is local or local[k] instead of
>> spark://...) on the same input, it runs perfectly. The code also runs well
>> in spark-shell on the cluster.
>>
>> For the jar to run correctly on the cluster, I have to hard-code the string
>> literal and the number in the RDD transformation part.
>>
>> This really looks like a weird bug to me. Could it have something to do with
>> how the sbt-assembly jar is built? Any suggestions?
>>
>> Thanks.
>>
>> I'm using Spark version 0.7.3 and Scala 2.9.3.
>>
>> --
>> *JU Han*
>>
>> Software Engineer Intern @ KXEN Inc.
>> UTC   -  Université de Technologie de Compiègne
>> *     **GI06 - Fouille de Données et Décisionnel*
>>
>> +33 0619608888
>>
>
>


-- 
*JU Han*

Software Engineer Intern @ KXEN Inc.
UTC   -  Université de Technologie de Compiègne
*     **GI06 - Fouille de Données et Décisionnel*

+33 0619608888
