spark-user mailing list archives

From Kevin Burton <bur...@spinn3r.com>
Subject saveAsObjectFile is actually saveAsSequenceFile
Date Tue, 13 Jan 2015 08:39:21 GMT
This is interesting.

I’m using ObjectInputStream to try to read a file written by
saveAsObjectFile… but it’s not working.

The documentation says:

"Write the elements of the dataset in a simple format using Java
serialization, which can then be loaded using SparkContext.objectFile().”

… but that’s not right.

  def saveAsObjectFile(path: String) {
    this.mapPartitions(iter => iter.grouped(10).map(_.toArray))
      .map(x => (NullWritable.get(), new BytesWritable(Utils.serialize(x))))
      .saveAsSequenceFile(path)
  }

… am I correct to assume that each record value is a serialized array of
(up to 10) objects, and that the whole thing is wrapped in a sequence file?
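(For anyone hitting the same thing: assuming Utils.serialize is plain Java serialization, the sketch below shows what each SequenceFile record value would contain — a Java-serialized array of up to 10 elements, not a bare stream of objects. That would explain why a raw ObjectInputStream on the file fails: the SequenceFile header and record framing come first. The class and method names here are hypothetical, just to illustrate the per-record payload.)

```java
import java.io.*;

public class ObjectFilePayload {
    // Hypothetical sketch: per the source quoted above, each SequenceFile
    // value is a Java-serialized array of up to 10 elements.
    static byte[] serializeBatch(Object[] batch) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(batch); // one writeObject call per batch, not per element
        }
        return bos.toByteArray();
    }

    // Reading one record value back: a single readObject yields the whole batch.
    static Object[] deserializeBatch(byte[] bytes)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (Object[]) ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Object[] batch = {"a", "b", "c"};
        byte[] payload = serializeBatch(batch);
        Object[] restored = deserializeBatch(payload);
        System.out.println(restored.length);
        System.out.println(restored[0]);
    }
}
```

So to read the file without Spark you would first have to strip the SequenceFile framing (e.g. via Hadoop's SequenceFile.Reader over the BytesWritable values) and only then apply ObjectInputStream to each value's bytes.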

-- 

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile
<https://plus.google.com/102718274791889610666/posts>
<http://spinn3r.com>
