spark-user mailing list archives

From Daniel Imberman <daniel.imber...@gmail.com>
Subject Reading an RDD from a checkpoint.
Date Mon, 14 Mar 2016 19:00:45 GMT
I'm trying to pre-compute my data so that I can load an RDD from a
checkpoint. However, when I run the same job a second time, the system
simply recreates the RDD from scratch instead of reading the checkpoint.

Here is the code I'm implementing to create the checkpoint:

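  def checkpointTeam(checkpointDir: String) = {
    sparkContext.setCheckpointDir(checkpointDir)
    // read from HBase; persist so checkpointing does not recompute the RDD
    val a = ObjectsTable.readAsRDD(sparkContext, Some("5507424...")).persist()
    a.checkpoint() // mark for checkpointing before the first action
    a.count()      // the action triggers the checkpoint write
  }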

I've checked multiple sources but none of them seem to explicitly state how
to read the values. I'd like to be able to treat the checkpoint as a
snapshot so I can read it from other jobs.
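
To make the "snapshot" idea concrete, here is a minimal sketch of what I mean, written with saveAsObjectFile and SparkContext.objectFile rather than checkpoint(). This swaps in a different mechanism and is not the checkpoint API itself. The object name SnapshotSketch, the helpers saveSnapshot and loadSnapshot, the (String, Int) element type, and the local master are all assumptions for illustration; my real data comes from HBase.

```scala
import java.nio.file.Files

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Hypothetical helper object; the names are illustrative only.
object SnapshotSketch {

  // Write the RDD as a SequenceFile of serialized objects.
  // The target path must not exist yet, or the save fails.
  def saveSnapshot(rdd: RDD[(String, Int)], path: String): Unit =
    rdd.saveAsObjectFile(path)

  // Read the snapshot back, possibly from a different job.
  // The type parameter must match the element type that was saved.
  def loadSnapshot(sc: SparkContext, path: String): RDD[(String, Int)] =
    sc.objectFile[(String, Int)](path)

  def main(args: Array[String]): Unit = {
    // Local master is assumed here only so the sketch runs standalone.
    val conf = new SparkConf().setAppName("snapshot-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Fresh, not-yet-existing directory under a temp folder.
      val path = Files.createTempDirectory("snapshot").resolve("rdd").toString

      // Stand-in for the RDD that would be read from HBase.
      val source = sc.parallelize(Seq(("a", 1), ("b", 2), ("c", 3)))
      saveSnapshot(source, path)

      val restored = loadSnapshot(sc, path)
      println(restored.count())
    } finally {
      sc.stop()
    }
  }
}
```

In this sketch a second job would only call loadSnapshot with the same path, so nothing is recomputed from HBase. What I cannot find is the equivalent way to do this with the files written by checkpoint().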



