spark-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ashutosh <ashutosh.triv...@iiitb.org>
Subject Re: [MLlib] Contributing Algorithm for Outlier Detection
Date Tue, 28 Oct 2014 09:45:01 GMT
Hi Anant,

Thank you for reviewing and helping us out. Please find the following link
where you can see the initial code.
https://github.com/codeAshu/Outlier-Detection-with-AVF-Spark/blob/master/OutlierWithAVFModel.scala


The input file for the code should be in csv format. We have provided a
dataset there at the link.

We are currently facing the following style issues in the code(code is
working fine though) :

At line no 62 and 79 we have redundant functions and variables
(count_dataPoint, count_trimmedData) for giving  line numbers within the
function trimScores().
 
At line no 144 and 149 if we do not use two separate functions to increment
line numbers we get erroneous results . Is there any alternative way of
handling that?

We think that it because of scala clousers where any local variable which is
not in RDD doesn't get updated in subsequent pairRDDFunctions.


Regards,
Ashutosh & Kaushik 



--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/MLlib-Contributing-Algorithm-for-Outlier-Detection-tp8880p8990.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org


Mime
View raw message