Great! I was going to implement one of my own - but I may not need to do that any more :)
I haven't had a chance to look deeply into your code, but I would recommend accepting an RDD[(Double, Double)] as input as well, rather than only a file path.
val data = IOHelper.readDataset(sc, "/path/to/my/data.csv")
And other distance measures, of course.
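To make the distance-measure suggestion a bit more concrete, here is a rough sketch in plain Scala. All of the names (DistanceSketch, DistanceMeasure, Euclidean, Manhattan) are hypothetical and are not taken from your code; I am also assuming two-dimensional points represented as (Double, Double) pairs.

```scala
// Hypothetical sketch: none of these names come from the project itself.
object DistanceSketch {
  // Assumption: a point is a pair of coordinates.
  type Point = (Double, Double)

  // A pluggable distance measure. It extends java.io.Serializable so that
  // an instance could be captured in a closure that Spark ships to executors.
  trait DistanceMeasure extends java.io.Serializable {
    def apply(a: Point, b: Point): Double
  }

  // Straight-line distance.
  object Euclidean extends DistanceMeasure {
    def apply(a: Point, b: Point): Double = {
      val dx = a._1 - b._1
      val dy = a._2 - b._2
      math.sqrt(dx * dx + dy * dy)
    }
  }

  // Sum of absolute coordinate differences.
  object Manhattan extends DistanceMeasure {
    def apply(a: Point, b: Point): Double =
      math.abs(a._1 - b._1) + math.abs(a._2 - b._2)
  }
}
```

The algorithm could then take a DistanceMeasure as a parameter (perhaps defaulting to Euclidean), so callers can swap measures without touching the core logic.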
I'm not sure whether messages like this are appropriate for this list; I just want to share an application I am working on. It is a personal project that I started in order to learn more about Spark and Scala and, if it succeeds, to contribute to the Spark community.
Maybe someone will find it useful. Or maybe someone will want to join development.
Any questions, comments, suggestions, as well as criticism are welcome :)