spark-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ashutosh <>
Subject Re: [MLlib] Contributing Algorithm for Outlier Detection
Date Tue, 28 Oct 2014 09:45:01 GMT
Hi Anant,

Thank you for reviewing and helping us out. Please find the following link
where you can see the initial code.

The input file for the code should be in csv format. We have provided a
dataset there at the link.

We are currently facing the following style issues in the code(code is
working fine though) :

At line no 62 and 79 we have redundant functions and variables
(count_dataPoint, count_trimmedData) for giving  line numbers within the
function trimScores().
At line no 144 and 149 if we do not use two separate functions to increment
line numbers we get erroneous results . Is there any alternative way of
handling that?

We think that it because of scala clousers where any local variable which is
not in RDD doesn't get updated in subsequent pairRDDFunctions.

Ashutosh & Kaushik 

View this message in context:
Sent from the Apache Spark Developers List mailing list archive at

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message