Hi, I am using Spark 1.3.0 (<spark.version>1.3.0</spark.version>).

I am not finding a matching constructor for the above key/value classes.

[inline image 1]

So, I tried shuffling the order of the values in the constructor.

[inline image 2]

But it is giving the error below. Please suggest.

[inline image 3]

Best Regards

On Mon, Aug 31, 2015 at 12:43 PM, Akhil Das <akhil@sigmoidanalytics.com> wrote:
Can you try these key/value classes and see how the performance compares?

inputFormatClassName = "com.mongodb.hadoop.MongoInputFormat"

keyClassName = "org.apache.hadoop.io.Text"
valueClassName = "org.apache.hadoop.io.MapWritable"

Taken from the Databricks blog.

Thanks
Best Regards

On Mon, Aug 31, 2015 at 12:26 PM, Deepesh Maheshwari <deepesh.maheshwari17@gmail.com> wrote:
Hi, I am trying to read data from MongoDB in Spark using newAPIHadoopRDD.

/**** Code *****/

// Hadoop configuration for the mongo-hadoop input format
Configuration config = new Configuration();
config.set("mongo.job.input.format", "com.mongodb.hadoop.MongoInputFormat");
config.set("mongo.input.uri", SparkProperties.MONGO_OUTPUT_URI);
config.set("mongo.input.query", "{host: 'abc.com'}");

// "local" runs Spark with a single worker thread
JavaSparkContext sc = new JavaSparkContext("local", "MongoOps");

// Each MongoDB document becomes an (Object, BSONObject) pair
JavaPairRDD<Object, BSONObject> mongoRDD = sc.newAPIHadoopRDD(config,
        com.mongodb.hadoop.MongoInputFormat.class,
        Object.class, BSONObject.class);

long count = mongoRDD.count();
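For reference, once the RDD is in hand, pulling a single field out of each BSONObject can be sketched as below (the field name "host" is borrowed from the query above; written in the anonymous-Function style that was usual for the Java API in the Spark 1.3 era — it assumes imports of org.apache.spark.api.java.JavaRDD, org.apache.spark.api.java.function.Function, and scala.Tuple2):

```java
// Extract the "host" field from each document; t._2() is the BSONObject value
JavaRDD<Object> hosts = mongoRDD.map(
        new Function<Tuple2<Object, BSONObject>, Object>() {
            @Override
            public Object call(Tuple2<Object, BSONObject> t) {
                return t._2().get("host");
            }
        });
```

This is only a sketch of the access pattern, not a performance fix; the read time is dominated by how the input format splits and fetches the collection, not by the map step.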

There are about 1.5 million records.
I am getting the data, but the read took around 15 minutes to complete.

Is this API really that slow, or am I missing something?
Please suggest an alternate approach if there is a faster way to read data from MongoDB.

Thanks,
Deepesh