spark-user mailing list archives

From Vinti Maheshwari <vinti.u...@gmail.com>
Subject Need help in spark-Scala program
Date Mon, 01 Feb 2016 22:25:42 GMT
Hi All,

I recently started learning Spark, and I need to use Spark Streaming.

1) Input: I need to read from MongoDB.

db.event_gcovs.find({executions:"56791a746e928d7b176d03c0", valid:1,
infofile:{$exists:1}, geo:"sunnyvale"}, {infofile:1}).count()

> Number of Info files: 24441

/* 0 */
{
    "_id" : ObjectId("568eaeda71404e5c563ccb86"),
    "infofile" : "/volume/testtech/datastore/code-coverage/p//infos/svl/6/56791a746e928d7b176d03c0/69958.pcp_napt44_20368.pl.30090.exhibit.R0-re0.15.1I20151218_1934_jammyc.pfe.i386.TC011.fail.FAIL.gcov.info"
}

One info file can have 1000 of these blocks. Each block starts with the "SF"
delimiter and ends with end_of_record.
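
To make the block structure concrete, here is a minimal sketch in plain Scala (no Spark dependency) of how the lines of one info file could be grouped into blocks. The object name InfoBlocks and the function splitBlocks are hypothetical, not taken from any library. It assumes only what is described above: a block begins at a line starting with "SF" and ends at a line equal to end_of_record.

```scala
// Hypothetical helper, not part of Spark or any library.
// Assumption: a block starts at a line beginning with "SF" and
// ends at a line that is exactly "end_of_record".
object InfoBlocks {
  def splitBlocks(lines: Iterator[String]): List[List[String]] = {
    val blocks  = List.newBuilder[List[String]]
    var current = List.newBuilder[String]
    var inBlock = false
    for (raw <- lines) {
      val line = raw.trim
      if (line.startsWith("SF")) {
        // A new block begins. An unfinished previous block is discarded.
        current = List.newBuilder[String]
        inBlock = true
      }
      // Lines outside any block (before the first "SF") are ignored.
      if (inBlock) current += line
      if (inBlock && line == "end_of_record") {
        blocks += current.result()
        inBlock = false
      }
    }
    blocks.result()
  }
}
```

For a single file on local disk, this could be called as InfoBlocks.splitBlocks(scala.io.Source.fromFile(path).getLines()), where path is one infofile value returned by the MongoDB query. Whether this runs on the driver or inside a Spark transformation depends on how the files are distributed, which the message does not specify.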
