spark-user mailing list archives

From Xiangrui Meng <men...@gmail.com>
Subject Re: MLlib Spam example gets stuck in Stage X
Date Mon, 30 Mar 2015 17:16:43 GMT
+Holden, Joseph

It seems that there is something wrong with the sample data file:
https://github.com/databricks/learning-spark/blob/master/files/ham.txt
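If the sample file is malformed (for example, blank lines left over from reformatting), those lines become empty documents and can produce bad LabeledPoints. A hypothetical plain-Scala sketch of a quick sanity check (the sample strings are made up; real code would apply the same filter to the lines read by sc.textFile):

```scala
// Hypothetical sanity check: drop blank or whitespace-only input lines
// before featurizing, since each line is treated as one document.
val rawLines = Seq("Dear Spark learner, thanks for joining!", "", "   ")
val cleaned = rawLines.filter(_.trim.nonEmpty)
// cleaned keeps only the one real message
```

With a real RDD the same predicate should apply unchanged, e.g. spam.filter(_.trim.nonEmpty).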

-Xiangrui

On Fri, Mar 27, 2015 at 1:03 PM, Su She <suhshekar52@gmail.com> wrote:

> Hello Xiangrui,
>
> Hmm, yes, I have run other Spark examples (word count, Spark Streaming/Kafka,
> etc.) locally, the same way I'm trying to run this MLlib example (I've
> tried local[2] and local[4]).
>
> 1) I did trainingData.count() and the job was completed. The output was
> 2...should this only be 2 or 400 (since each text file has 200 words)?
>
> 2) I noticed the code says: val trainingData = positiveExamples ++
> negativeExamples
>
> I'm not very familiar with Scala, and the ++ operator looks odd to me, but
> when I tried using a single + instead, the code did not build.
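For what it's worth, `++` here is ordinary Scala concatenation (on RDDs it is an alias for `union`), while a single `+` is simply not defined for collections, which is why the build failed. A minimal plain-Scala sketch (no Spark required; the numeric labels below are stand-ins for the example's LabeledPoints):

```scala
// Plain-Scala analogue of: val trainingData = positiveExamples ++ negativeExamples
val positiveExamples = Seq(1.0, 1.0)  // stand-ins for spam LabeledPoints
val negativeExamples = Seq(0.0, 0.0)  // stand-ins for ham LabeledPoints
val trainingData = positiveExamples ++ negativeExamples  // concatenation, like RDD.union
// trainingData holds all four elements, positives first
```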
>
> 3) I found a similar thread here...
> http://mail-archives.apache.org/mod_mbox/spark-user/201409.mbox/%3CCAFRXrqf6DRxLCSB7q1-W1PuAYAdPnB8WOmwiWE_8++Okq2cArg@mail.gmail.com%3E
>
> it looks like Emily had the same problem (count at DataValidators.scala:38),
> but it doesn't seem like a solution was found. Also, I don't get any of those
> errors printed to the console.
> the console.
>
> 4) Sorry, not sure what else to say, as this is a pretty basic example.
> Thank you for the help!
>
> best,
>
> Su
>
> On Fri, Mar 27, 2015 at 11:23 AM, Xiangrui Meng <mengxr@gmail.com> wrote:
>
>> Hi Su,
>>
>> I'm not sure what the problem is. Did you try other Spark examples on
>> your cluster? Did they work? Could you try
>>
>> trainingData.count()
>>
>> before calling lrLearn.run()? Just want to check whether this is an MLlib
>> issue.
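For context, the stage named "count at DataValidators.scala:38" is MLlib's binary-label check: it counts points whose label is neither 0 nor 1, and training proceeds only if that count is zero. A simplified plain-Scala sketch of that check (assumption: the real version calls count on an RDD[LabeledPoint], which is why a problematic input can stall right there):

```scala
// Simplified sketch of MLlib's binary-label validator.
// `Point` is a hypothetical stand-in for mllib's LabeledPoint.
case class Point(label: Double, features: Array[Double])

val data = Seq(Point(1.0, Array(0.5, 0.1)), Point(0.0, Array(0.2, 0.3)))
val numInvalid = data.count(p => p.label != 1.0 && p.label != 0.0)
val dataIsValid = numInvalid == 0  // must hold before SGD training starts
```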
>>
>> Thanks,
>> Xiangrui
>>
>> On Wed, Mar 25, 2015 at 3:27 PM, Su She <suhshekar52@gmail.com> wrote:
>>
>>> Hello Everyone,
>>>
>>> I was hoping to see if anyone has any additional thoughts on this as I
>>> was able to find barely anything related to this error online (something
>>> related to dependencies/breeze?)...thank you!
>>>
>>> Best,
>>>
>>> Su
>>>
>>> On Thu, Mar 19, 2015 at 10:54 AM, Su She <suhshekar52@gmail.com> wrote:
>>>
>>>> Hello Akhil,
>>>>
>>>> I tried running it in an application, and I got the same result. The
>>>> app gets stuck in Stage 1 at MLlib.scala at line 32 which in my app
>>>> corresponds to: val model = lrLearner.run(trainingData).
>>>>
>>>> These are the details:
>>>>
>>>> org.apache.spark.rdd.RDD.count(RDD.scala:910)
>>>> org.apache.spark.mllib.util.DataValidators$$anonfun$1.apply(DataValidators.scala:38)
>>>> org.apache.spark.mllib.util.DataValidators$$anonfun$1.apply(DataValidators.scala:37)
>>>> org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm$$anonfun$run$2.apply(GeneralizedLinearAlgorithm.scala:161)
>>>> org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm$$anonfun$run$2.apply(GeneralizedLinearAlgorithm.scala:161)
>>>> scala.collection.LinearSeqOptimized$class.forall(LinearSeqOptimized.scala:70)
>>>> scala.collection.immutable.List.forall(List.scala:84)
>>>> org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm.run(GeneralizedLinearAlgorithm.scala:161)
>>>> org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm.run(GeneralizedLinearAlgorithm.scala:146)
>>>> MLlib$.main(MLlib.scala:32)
>>>> MLlib.main(MLlib.scala)
>>>> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>> java.lang.reflect.Method.invoke(Method.java:606)
>>>> org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:358)
>>>> org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
>>>> org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>>>
>>>>
>>>> Thank you for the help Akhil!
>>>>
>>>> Best,
>>>>
>>>> Su
>>>>
>>>>
>>>> On Thu, Mar 19, 2015 at 1:27 AM, Akhil Das <akhil@sigmoidanalytics.com>
>>>> wrote:
>>>>
>>>>> It seems it's stuck doing a count? What's happening at line 38? I'm
>>>>> not seeing a count operation anywhere in this code:
>>>>> https://github.com/databricks/learning-spark/blob/master/src/main/scala/com/oreilly/learningsparkexamples/scala/MLlib.scala#L48
>>>>>
>>>>> Thanks
>>>>> Best Regards
>>>>>
>>>>> On Thu, Mar 19, 2015 at 1:32 PM, Su She <suhshekar52@gmail.com> wrote:
>>>>>
>>>>>> Hello Akhil,
>>>>>>
>>>>>> Thanks for the info! Here is my UI...I am not sure what to make of
>>>>>> the information here:
>>>>>>
>>>>>> [image: Inline image 1]
>>>>>>
>>>>>> [image: Inline image 2]
>>>>>>
>>>>>> Details of active stage:
>>>>>>
>>>>>> org.apache.spark.rdd.RDD.count(RDD.scala:910)
>>>>>> org.apache.spark.mllib.util.DataValidators$$anonfun$1.apply(DataValidators.scala:38)
>>>>>> org.apache.spark.mllib.util.DataValidators$$anonfun$1.apply(DataValidators.scala:37)
>>>>>> org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm$$anonfun$run$2.apply(GeneralizedLinearAlgorithm.scala:161)
>>>>>> org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm$$anonfun$run$2.apply(GeneralizedLinearAlgorithm.scala:161)
>>>>>> scala.collection.LinearSeqOptimized$class.forall(LinearSeqOptimized.scala:70)
>>>>>> scala.collection.immutable.List.forall(List.scala:84)
>>>>>> org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm.run(GeneralizedLinearAlgorithm.scala:161)
>>>>>> org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm.run(GeneralizedLinearAlgorithm.scala:146)
>>>>>> $line21.$read$$iwC$$iwC$$iwC$$iwC.<init>(<console>:33)
>>>>>> $line21.$read$$iwC$$iwC$$iwC.<init>(<console>:38)
>>>>>> $line21.$read$$iwC$$iwC.<init>(<console>:40)
>>>>>> $line21.$read$$iwC.<init>(<console>:42)
>>>>>> $line21.$read.<init>(<console>:44)
>>>>>> $line21.$read$.<init>(<console>:48)
>>>>>> $line21.$read$.<clinit>(<console>)
>>>>>> $line21.$eval$.<init>(<console>:7)
>>>>>> $line21.$eval$.<clinit>(<console>)
>>>>>> $line21.$eval.$print(<console>)
>>>>>> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>
>>>>>>
>>>>>> Thank you for the help Akhil!
>>>>>>
>>>>>> -Su
>>>>>>
>>>>>> On Thu, Mar 19, 2015 at 12:49 AM, Akhil Das <akhil@sigmoidanalytics.com> wrote:
>>>>>>
>>>>>>> To see these metrics, open the driver UI running on port 4040. There
>>>>>>> you will find the stage information, and for each stage you can see
>>>>>>> how much time it spends on GC, etc. In your case the parallelism
>>>>>>> appears to be 4; the higher the parallelism, the more tasks you
>>>>>>> will see.
>>>>>>>
>>>>>>> Thanks
>>>>>>> Best Regards
>>>>>>>
>>>>>>> On Thu, Mar 19, 2015 at 1:15 PM, Su She <suhshekar52@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Akhil,
>>>>>>>>
>>>>>>>> 1) How can I see how much time it is spending on Stage 1? Or what
>>>>>>>> if, like above, it doesn't get past Stage 1?
>>>>>>>>
>>>>>>>> 2) How can I check whether it's GC time? And where would I increase
>>>>>>>> the parallelism for the model? I have a Spark master and 2 workers
>>>>>>>> running on CDH 5.3...what would the default spark-shell level of
>>>>>>>> parallelism be? I thought it would be 3.
>>>>>>>>
>>>>>>>> Thank you for the help!
>>>>>>>>
>>>>>>>> -Su
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Mar 19, 2015 at 12:32 AM, Akhil Das <akhil@sigmoidanalytics.com> wrote:
>>>>>>>>
>>>>>>>>> Can you see where exactly it is spending time? Since, as you said,
>>>>>>>>> it gets to Stage 2, you should be able to see how much time it spent
>>>>>>>>> on Stage 1. If it's GC time, try increasing the level of parallelism,
>>>>>>>>> or repartition the data, e.g. to sc.defaultParallelism * 3 partitions.
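The arithmetic behind that suggestion, as a sketch; `defaultParallelism` is a hypothetical stand-in for `sc.defaultParallelism`, since no SparkContext is started here:

```scala
// Pick a repartition target as a multiple of the default parallelism.
val defaultParallelism = 4                     // stand-in for sc.defaultParallelism
val targetPartitions = defaultParallelism * 3  // more partitions -> more, smaller tasks
// With a real context: trainingData.repartition(targetPartitions)
```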
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Best Regards
>>>>>>>>>
>>>>>>>>> On Thu, Mar 19, 2015 at 12:15 PM, Su She <suhshekar52@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hello Everyone,
>>>>>>>>>>
>>>>>>>>>> I am trying to run this MLlib example from Learning Spark:
>>>>>>>>>>
>>>>>>>>>> https://github.com/databricks/learning-spark/blob/master/src/main/scala/com/oreilly/learningsparkexamples/scala/MLlib.scala#L48
>>>>>>>>>>
>>>>>>>>>> Things I'm doing differently:
>>>>>>>>>>
>>>>>>>>>> 1) Using spark shell instead of an application
>>>>>>>>>>
>>>>>>>>>> 2) Instead of their spam.txt and normal.txt, I have text files
>>>>>>>>>> with 3700 and 2700 words...nothing huge at all, and just plain text.
>>>>>>>>>>
>>>>>>>>>> 3) I've used numFeatures = 100, 1000 and 10,000
>>>>>>>>>>
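Regarding 3), numFeatures sets the length of the hashed term-frequency vector that HashingTF produces. A simplified plain-Scala sketch of the hashing trick (assumption: the real HashingTF uses its own hash function and returns sparse vectors):

```scala
// Map each word to a bucket in [0, numFeatures) and count occurrences.
def hashingTF(words: Seq[String], numFeatures: Int): Array[Double] = {
  val vec = new Array[Double](numFeatures)
  for (w <- words) {
    val idx = Math.floorMod(w.hashCode, numFeatures)  // non-negative bucket index
    vec(idx) += 1.0
  }
  vec
}

val features = hashingTF(Seq("free", "money", "free"), 100)
// features.sum == 3.0; "free" contributes 2.0 to a single bucket
```

A larger numFeatures reduces the chance that two different words collide in the same bucket, at the cost of a longer vector.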
>>>>>>>>>> *Error:* I keep getting stuck when I try to run the model:
>>>>>>>>>>
>>>>>>>>>> val model = new LogisticRegressionWithSGD().run(trainingData)
>>>>>>>>>>
>>>>>>>>>> It will freeze on something like this:
>>>>>>>>>>
>>>>>>>>>> [Stage 1:==============>                              (1 + 0) / 4]
>>>>>>>>>>
>>>>>>>>>> Sometimes it's Stage 1, 2, or 3.
>>>>>>>>>>
>>>>>>>>>> I am not sure what I am doing wrong...any help is much appreciated,
>>>>>>>>>> thank you!
>>>>>>>>>>
>>>>>>>>>> -Su
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
