spark-user mailing list archives

From qinwei <>
Subject about write mongodb in mapPartitions
Date Fri, 07 Nov 2014 09:23:09 GMT

Hi everyone,

I have run into a problem writing data to MongoDB inside mapPartitions. My code is as follows:

    val sourceRDD = sc.textFile("hdfs://host:port/sourcePath")
    // some transformations
    val rdd = sourceRDD.map(mapFunc).filter(filterFunc)
    val newRDD = rdd.mapPartitions(args => {
        val mongoClient = new MongoClient("host",
        val db = mongoClient.getDB("db")
        val coll = db.getCollection("collectionA")

        => {
            coll.insert(new BasicDBObject("pkg", arg))

    })
    newRDD.saveAsTextFile("hdfs://host:port/path")

The application saved the data to HDFS correctly, but the data was not saved to MongoDB. Is there something wrong?

I know that collecting newRDD to the driver and then saving it to MongoDB will succeed, but will the subsequent saveAsTextFile read the filesystem once again?

