spark-user mailing list archives

From nitinkak001 <>
Subject Connected Components running for a long time and failing eventually
Date Mon, 24 Nov 2014 16:19:07 GMT
I am trying to run connected components on a graph generated by reading an
edge file. It runs for a long time (3-4 hours) and then eventually fails.
I can't find any error in the log file. The file I am testing it on has 27M
rows (edges). Is there something obviously wrong with the code?

I tested the same code on an input of around 1,000 rows and it works just fine.

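import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.GraphLoader

object ConnectedComponentsTest {
  def main(args: Array[String]) {
    val inputFile =
// Should be some file on your system
    val conf = new SparkConf().setAppName("ConnectedComponentsTest")
    val sc = new SparkContext(conf)
    // Load the edge list; the third argument is canonicalOrientation
    val graph = GraphLoader.edgeListFile(sc, inputFile, true)
    // Each vertex is labeled with the lowest vertex ID in its component
    val cc = graph.connectedComponents().vertices
  }
}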
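
For reference, here is a minimal sketch of the result I expect from connectedComponents(): every vertex labeled with the smallest vertex ID in its component. It is plain Scala without Spark, so it can be checked on a tiny edge list. The object name, the function name, and the toy edges are hypothetical, and it uses a simple union-find, not the iterative approach GraphX uses internally.

```scala
object ConnectedComponentsSketch {
  // Labels every vertex with the smallest vertex ID in its component.
  def components(edges: Seq[(Long, Long)]): Map[Long, Long] = {
    val parent = scala.collection.mutable.Map.empty[Long, Long]

    // Find the root of v, compressing the path along the way.
    def find(v: Long): Long = {
      val p = parent.getOrElseUpdate(v, v)
      if (p == v) v
      else {
        val root = find(p)
        parent(v) = root
        root
      }
    }

    // Merge two components, keeping the smaller root ID as the root.
    def union(a: Long, b: Long): Unit = {
      val ra = find(a)
      val rb = find(b)
      if (ra != rb) {
        if (ra < rb) parent(rb) = ra else parent(ra) = rb
      }
    }

    edges.foreach { case (src, dst) => union(src, dst) }
    // Materialize the keys first, since find() updates the map.
    parent.keys.toList.map(v => v -> find(v)).toMap
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical toy edge list with two components: {1, 2, 3} and {10, 11}.
    val edges = Seq((1L, 2L), (2L, 3L), (10L, 11L))
    components(edges).toSeq.sorted.foreach { case (v, c) => println(s"$v -> $c") }
  }
}
```

On the toy input, vertices 1, 2 and 3 get label 1, and vertices 10 and 11 get label 10. This only works for graphs that fit in memory on one machine, so it is a sanity check for small samples, not a replacement for the GraphX job.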
