spark-user mailing list archives

From Yifan LI <iamyifa...@gmail.com>
Subject Re: [GraphX] how to set memory configurations to avoid OutOfMemoryError "GC overhead limit exceeded"
Date Fri, 05 Sep 2014 10:13:18 GMT
Thank you, Ankur! :)

But how can I assign a storage level to a new vertex RDD that is mapped from
an existing vertex RDD?
e.g.
val newVertexRDD =
  graph.collectNeighborIds(EdgeDirection.Out).map {
    case (id: VertexId, a: Array[VertexId]) => (id, initialHashMap(a))
  }

The new RDD will be combined with the existing edge RDD (MEMORY_AND_DISK)
to construct a new graph.
e.g.
val newGraph = Graph(newVertexRDD, graph.edges)
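
One possible way to do this, sketched under the assumption that the Spark
version in use has the edgeStorageLevel and vertexStorageLevel parameters on
Graph.apply (as GraphLoader.edgeListFile does in the quoted reply below), is
to pass the levels when the new graph is constructed. rebuildGraph is a
hypothetical helper name, not part of GraphX:

```scala
import scala.reflect.ClassTag

import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD
import org.apache.spark.storage.StorageLevel

// Hypothetical helper: builds a graph from a mapped vertex RDD and an
// existing edge RDD, asking GraphX to cache both at MEMORY_AND_DISK.
// Assumes Graph.apply accepts edgeStorageLevel and vertexStorageLevel.
def rebuildGraph[VD: ClassTag, ED: ClassTag](
    newVertices: RDD[(VertexId, VD)],
    edges: RDD[Edge[ED]]): Graph[VD, ED] =
  Graph(
    newVertices,
    edges,
    defaultVertexAttr = null.asInstanceOf[VD],
    edgeStorageLevel = StorageLevel.MEMORY_AND_DISK,
    vertexStorageLevel = StorageLevel.MEMORY_AND_DISK)
```

With the names from this message, the call would be
rebuildGraph(newVertexRDD, graph.edges).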


BTW, newVertexRDD.getStorageLevel returns StorageLevel(true, true,
false, true, 1). What does that mean?
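
For reference, a small sketch of how the five values can be read, assuming
the Spark 1.x field order (useDisk, useMemory, useOffHeap, deserialized,
replication); other versions may print the level differently:

```scala
import org.apache.spark.storage.StorageLevel

// Assumed field order (Spark 1.x):
// (useDisk, useMemory, useOffHeap, deserialized, replication)
val level = StorageLevel.MEMORY_AND_DISK

println(s"useDisk      = ${level.useDisk}")
println(s"useMemory    = ${level.useMemory}")
println(s"useOffHeap   = ${level.useOffHeap}")
println(s"deserialized = ${level.deserialized}")
println(s"replication  = ${level.replication}")

// Rebuild a level from the five values in the message and compare it
// with the predefined MEMORY_AND_DISK constant.
val rebuilt = StorageLevel(true, true, false, true, 1)
println(s"same as MEMORY_AND_DISK: ${rebuilt == level}")
```

Under that assumption, the printed tuple describes a level that may use both
disk and memory, stays on-heap, keeps data deserialized, and holds one
replica.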

Thanks in advance!

Best,
Yifan



2014-09-03 22:42 GMT+02:00 Ankur Dave <ankurdave@gmail.com>:

> At 2014-09-03 17:58:09 +0200, Yifan LI <iamyifanli@gmail.com> wrote:
> > val graph = GraphLoader.edgeListFile(sc, edgesFile, minEdgePartitions = numPartitions)
> >   .partitionBy(PartitionStrategy.EdgePartition2D)
> >   .persist(StorageLevel.MEMORY_AND_DISK)
> >
> > Error: java.lang.UnsupportedOperationException: Cannot change storage level
> > of an RDD after it was already assigned a level
>
> You have to pass the StorageLevel to GraphLoader.edgeListFile:
>
>     val graph = GraphLoader.edgeListFile(
>       sc, edgesFile, minEdgePartitions = numPartitions,
>       edgeStorageLevel = StorageLevel.MEMORY_AND_DISK,
>       vertexStorageLevel = StorageLevel.MEMORY_AND_DISK)
>       .partitionBy(PartitionStrategy.EdgePartition2D)
>
> Ankur
>
