spark-user mailing list archives

From Sean Owen <so...@cloudera.com>
Subject Re: retry in combineByKey at BinaryClassificationMetrics.scala
Date Tue, 23 Dec 2014 20:09:22 GMT
Yes, though my change is slightly downstream of this point in the
processing. The code is still creating a counter for each distinct
score value, and then binning. I don't think that would cause a
failure - it just might be slow. At the extremes, you might see 'fetch
failure' as a symptom of things running too slowly.

Yes you can sacrifice some fidelity by more aggressively binning
upstream, on your scores. That would drastically reduce the input
size, at the cost of accuracy of course.
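As a rough illustration of that upstream binning (a hedged sketch, not code from the thread - the bucket count and helper name are made up here), scores can be snapped to a fixed grid so the shuffle only ever sees a bounded number of distinct keys:

```scala
// Sketch: bin raw scores into a fixed number of buckets before
// computing metrics, trading score fidelity for far fewer distinct
// keys in the combineByKey shuffle. Plain Scala shown; in a Spark job
// you would apply `bin` inside a map over the (score, label) RDD.
object BinScores {
  // Snap a score in [0, 1] onto a grid of `numBins` buckets.
  def bin(score: Double, numBins: Int = 1000): Double =
    math.floor(score * numBins) / numBins

  def main(args: Array[String]): Unit = {
    val scores = Seq(0.5643215, 0.5648999, 0.1234567)
    // Nearby scores collapse onto the same bucket.
    println(scores.map(bin(_)).mkString(","))  // prints 0.564,0.564,0.123
  }
}
```

In the actual job this would look something like `predictionAndLabel.map { case (s, l) => (bin(s), l) }` (assuming an RDD of (score, label) pairs), applied before constructing the metrics.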

On Tue, Dec 23, 2014 at 7:35 PM, Xiangrui Meng <mengxr@gmail.com> wrote:
> Sean's PR may be relevant to this issue
> (https://github.com/apache/spark/pull/3702). As a workaround, you can
> try to truncate the raw scores to 4 digits (e.g., 0.5643215 -> 0.5643)
> before sending them to BinaryClassificationMetrics. This may not work
> well if the score distribution is very skewed. See discussion on
> https://issues.apache.org/jira/browse/SPARK-4547 -Xiangrui
>
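That truncation workaround can be sketched as follows (a hedged example; the helper name is invented, and in the real job it would run per record over the RDD):

```scala
// Sketch of the suggested workaround: truncate raw scores to 4
// decimal places before building BinaryClassificationMetrics, so the
// combineByKey sees at most ~10001 distinct score keys instead of one
// per distinct raw score.
object TruncateScores {
  // Drop everything past the 4th decimal place.
  def truncate4(score: Double): Double =
    math.floor(score * 1e4) / 1e4

  def main(args: Array[String]): Unit = {
    println(truncate4(0.5643215))  // prints 0.5643
  }
}
```

Applied to the thread's pipeline, this would be something like `predictionAndLabel.map { case (s, l) => (truncate4(s), l) }` before passing the RDD to BinaryClassificationMetrics.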
> On Tue, Dec 23, 2014 at 9:00 AM, Thomas Kwan <thomas.kwan@manage.com> wrote:
>> Hi there,
>>
>> We are using mllib 1.1.1, and doing Logistic Regression with a dataset of
>> about 150M rows.
>> The training part usually goes pretty smoothly without any retries. But
>> during the prediction stage and BinaryClassificationMetrics stage, I am
>> seeing retries with "fetch failure" errors.
>>
>> The prediction part is just as follows:
>>
>>         val predictionAndLabel = testRDD.map { point =>
>>             val prediction = model.predict(point.features)
>>             (prediction, point.label)
>>         }
>> ...
>>         val metrics = new BinaryClassificationMetrics(predictionAndLabel)
>>
>> The fetch failure happened with the following stack trace:
>>
>> org.apache.spark.rdd.PairRDDFunctions.combineByKey(PairRDDFunctions.scala:515)
>>
>> org.apache.spark.mllib.evaluation.BinaryClassificationMetrics.x$3$lzycompute(BinaryClassificationMetrics.scala:101)
>>
>> org.apache.spark.mllib.evaluation.BinaryClassificationMetrics.x$3(BinaryClassificationMetrics.scala:96)
>>
>> org.apache.spark.mllib.evaluation.BinaryClassificationMetrics.confusions$lzycompute(BinaryClassificationMetrics.scala:98)
>>
>> org.apache.spark.mllib.evaluation.BinaryClassificationMetrics.confusions(BinaryClassificationMetrics.scala:98)
>>
>> org.apache.spark.mllib.evaluation.BinaryClassificationMetrics.createCurve(BinaryClassificationMetrics.scala:142)
>>
>> org.apache.spark.mllib.evaluation.BinaryClassificationMetrics.roc(BinaryClassificationMetrics.scala:50)
>>
>> org.apache.spark.mllib.evaluation.BinaryClassificationMetrics.areaUnderROC(BinaryClassificationMetrics.scala:60)
>>
>> com.manage.ml.evaluation.BinaryClassificationMetrics.areaUnderROC(BinaryClassificationMetrics.scala:14)
>>
>> ...
>>
>>
>> We are doing this in the yarn-client mode. 32 executors, 16G executor
>> memory, and 12 cores as the spark-submit settings.
>>
>> I wonder if anyone has suggestions on how to debug this.
>>
>> thanks in advance
>> thomas
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
> For additional commands, e-mail: user-help@spark.apache.org
>

