spark-user mailing list archives

From Xiangrui Meng <men...@gmail.com>
Subject Re: MLlib Spam example gets stuck in Stage X
Date Fri, 27 Mar 2015 18:23:59 GMT
Hi Su,

I'm not sure what the problem is. Did you try other Spark examples on your
cluster? Did they work? Could you try

trainingData.count()

before calling lrLearner.run()? Just want to check whether this is an MLlib
issue.
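A minimal sketch of that check, assuming `trainingData` is the `RDD[LabeledPoint]` built as in the Learning Spark example and `sc` is the spark-shell SparkContext:

```scala
import org.apache.spark.mllib.classification.LogisticRegressionWithSGD

// Force evaluation of the training RDD before handing it to MLlib.
// If this count() also hangs, the problem is in the input pipeline
// (or the cluster), not in LogisticRegressionWithSGD itself.
val n = trainingData.count()
println(s"trainingData has $n examples")

// Only proceed to training once the count succeeds.
val lrLearner = new LogisticRegressionWithSGD()
val model = lrLearner.run(trainingData)
```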

Thanks,
Xiangrui

On Wed, Mar 25, 2015 at 3:27 PM, Su She <suhshekar52@gmail.com> wrote:

> Hello Everyone,
>
> I was hoping to see if anyone has any additional thoughts on this, as I was
> able to find barely anything related to this error online (possibly something
> related to dependencies/breeze?)...thank you!
>
> Best,
>
> Su
>
> On Thu, Mar 19, 2015 at 10:54 AM, Su She <suhshekar52@gmail.com> wrote:
>
>> Hello Akhil,
>>
>> I tried running it in an application and got the same result. The app
>> gets stuck in Stage 1 at MLlib.scala line 32, which in my app corresponds
>> to: val model = lrLearner.run(trainingData).
>>
>> These are the details:
>>
>> org.apache.spark.rdd.RDD.count(RDD.scala:910)
>> org.apache.spark.mllib.util.DataValidators$$anonfun$1.apply(DataValidators.scala:38)
>> org.apache.spark.mllib.util.DataValidators$$anonfun$1.apply(DataValidators.scala:37)
>> org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm$$anonfun$run$2.apply(GeneralizedLinearAlgorithm.scala:161)
>> org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm$$anonfun$run$2.apply(GeneralizedLinearAlgorithm.scala:161)
>> scala.collection.LinearSeqOptimized$class.forall(LinearSeqOptimized.scala:70)
>> scala.collection.immutable.List.forall(List.scala:84)
>> org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm.run(GeneralizedLinearAlgorithm.scala:161)
>> org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm.run(GeneralizedLinearAlgorithm.scala:146)
>> MLlib$.main(MLlib.scala:32)
>> MLlib.main(MLlib.scala)
>> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> java.lang.reflect.Method.invoke(Method.java:606)
>> org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:358)
>> org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
>> org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>
>>
>> Thank you for the help Akhil!
>>
>> Best,
>>
>> Su
>>
>>
>> On Thu, Mar 19, 2015 at 1:27 AM, Akhil Das <akhil@sigmoidanalytics.com>
>> wrote:
>>
>>> It seems it's stuck doing a count? What is happening at line 38? I'm not
>>> seeing a count operation anywhere in this code:
>>> https://github.com/databricks/learning-spark/blob/master/src/main/scala/com/oreilly/learningsparkexamples/scala/MLlib.scala#L48
>>>
>>> Thanks
>>> Best Regards
>>>
>>> On Thu, Mar 19, 2015 at 1:32 PM, Su She <suhshekar52@gmail.com> wrote:
>>>
>>>> Hello Akhil,
>>>>
>>>> Thanks for the info! Here is my UI...I am not sure what to make of the
>>>> information here:
>>>>
>>>> [image: Inline image 1]
>>>>
>>>> [image: Inline image 2]
>>>>
>>>> Details of active stage:
>>>>
>>>> org.apache.spark.rdd.RDD.count(RDD.scala:910)
>>>> org.apache.spark.mllib.util.DataValidators$$anonfun$1.apply(DataValidators.scala:38)
>>>> org.apache.spark.mllib.util.DataValidators$$anonfun$1.apply(DataValidators.scala:37)
>>>> org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm$$anonfun$run$2.apply(GeneralizedLinearAlgorithm.scala:161)
>>>> org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm$$anonfun$run$2.apply(GeneralizedLinearAlgorithm.scala:161)
>>>> scala.collection.LinearSeqOptimized$class.forall(LinearSeqOptimized.scala:70)
>>>> scala.collection.immutable.List.forall(List.scala:84)
>>>> org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm.run(GeneralizedLinearAlgorithm.scala:161)
>>>> org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm.run(GeneralizedLinearAlgorithm.scala:146)
>>>> $line21.$read$$iwC$$iwC$$iwC$$iwC.<init>(<console>:33)
>>>> $line21.$read$$iwC$$iwC$$iwC.<init>(<console>:38)
>>>> $line21.$read$$iwC$$iwC.<init>(<console>:40)
>>>> $line21.$read$$iwC.<init>(<console>:42)
>>>> $line21.$read.<init>(<console>:44)
>>>> $line21.$read$.<init>(<console>:48)
>>>> $line21.$read$.<clinit>(<console>)
>>>> $line21.$eval$.<init>(<console>:7)
>>>> $line21.$eval$.<clinit>(<console>)
>>>> $line21.$eval.$print(<console>)
>>>> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>
>>>>
>>>> Thank you for the help Akhil!
>>>>
>>>> -Su
>>>>
>>>> On Thu, Mar 19, 2015 at 12:49 AM, Akhil Das <akhil@sigmoidanalytics.com
>>>> > wrote:
>>>>
>>>>> To get these metrics out, you need to open the driver UI running on
>>>>> port 4040. There you will see the Stages information, and for each stage
>>>>> you can see how much time it is spending on GC, etc. In your case, the
>>>>> parallelism seems to be 4; the more parallelism, the more tasks you
>>>>> will see.
>>>>>
>>>>> Thanks
>>>>> Best Regards
>>>>>
>>>>> On Thu, Mar 19, 2015 at 1:15 PM, Su She <suhshekar52@gmail.com> wrote:
>>>>>
>>>>>> Hi Akhil,
>>>>>>
>>>>>> 1) How could I see how much time it is spending on Stage 1? Or what
>>>>>> if, like above, it doesn't get past Stage 1?
>>>>>>
>>>>>> 2) How could I check if it's GC time? And where would I increase the
>>>>>> parallelism for the model? I have a Spark Master and 2 Workers running
>>>>>> on CDH 5.3...what would the default spark-shell level of parallelism
>>>>>> be? I thought it would be 3?
>>>>>>
>>>>>> Thank you for the help!
>>>>>>
>>>>>> -Su
>>>>>>
>>>>>>
>>>>>> On Thu, Mar 19, 2015 at 12:32 AM, Akhil Das <akhil@sigmoidanalytics.com> wrote:
>>>>>>
>>>>>>> Can you see where exactly it is spending time? Like you said, it goes
>>>>>>> to Stage 2, so you will be able to see how much time it spent on Stage 1.
>>>>>>> See if it's GC time; if so, try increasing the level of parallelism, or
>>>>>>> repartition it to something like sc.defaultParallelism * 3.
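A minimal sketch of that repartitioning suggestion, assuming `trainingData` is the training RDD from the example and `sc` is the spark-shell SparkContext:

```scala
import org.apache.spark.mllib.classification.LogisticRegressionWithSGD

// Spread the data across more partitions so the iterative SGD passes
// run as more, smaller tasks. sc.defaultParallelism is typically the
// total number of cores across the registered executors.
val numPartitions = sc.defaultParallelism * 3
val repartitioned = trainingData.repartition(numPartitions)
repartitioned.cache()

// Train on the repartitioned RDD instead of the original one.
val model = new LogisticRegressionWithSGD().run(repartitioned)
```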
>>>>>>>
>>>>>>> Thanks
>>>>>>> Best Regards
>>>>>>>
>>>>>>> On Thu, Mar 19, 2015 at 12:15 PM, Su She <suhshekar52@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hello Everyone,
>>>>>>>>
>>>>>>>> I am trying to run this MLlib example from Learning Spark:
>>>>>>>>
>>>>>>>> https://github.com/databricks/learning-spark/blob/master/src/main/scala/com/oreilly/learningsparkexamples/scala/MLlib.scala#L48
>>>>>>>>
>>>>>>>> Things I'm doing differently:
>>>>>>>>
>>>>>>>> 1) Using spark shell instead of an application
>>>>>>>>
>>>>>>>> 2) instead of their spam.txt and normal.txt I have text files with
>>>>>>>> 3700 and 2700 words...nothing huge at all and just plain text
>>>>>>>>
>>>>>>>> 3) I've used numFeatures = 100, 1000 and 10,000
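For context, the linked example's pipeline boils down to roughly the following sketch (file paths are placeholders for my spam/ham files; numFeatures is one of the values I tried):

```scala
import org.apache.spark.mllib.classification.LogisticRegressionWithSGD
import org.apache.spark.mllib.feature.HashingTF
import org.apache.spark.mllib.regression.LabeledPoint

// Placeholder paths; substitute the actual spam and ham text files.
val spam = sc.textFile("spam.txt")
val normal = sc.textFile("normal.txt")

// Hash each message's words into a fixed-size term-frequency vector.
val tf = new HashingTF(1000)
val spamFeatures = spam.map(email => tf.transform(email.split(" ")))
val normalFeatures = normal.map(email => tf.transform(email.split(" ")))

// Label spam as 1 and ham as 0, then cache for the iterative SGD passes.
val positive = spamFeatures.map(features => LabeledPoint(1, features))
val negative = normalFeatures.map(features => LabeledPoint(0, features))
val trainingData = positive.union(negative)
trainingData.cache()

// Train the model; this is the step that hangs.
val model = new LogisticRegressionWithSGD().run(trainingData)
```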
>>>>>>>>
>>>>>>>> *Error: *I keep getting stuck when I try to run the model:
>>>>>>>>
>>>>>>>> val model = new LogisticRegressionWithSGD().run(trainingData)
>>>>>>>>
>>>>>>>> It will freeze on something like this:
>>>>>>>>
>>>>>>>> [Stage 1:==============>                                  (1 + 0) / 4]
>>>>>>>>
>>>>>>>> Sometimes it's Stage 1, 2, or 3.
>>>>>>>>
>>>>>>>> I am not sure what I am doing wrong...any help is much appreciated,
>>>>>>>> thank you!
>>>>>>>>
>>>>>>>> -Su
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
