spark-user mailing list archives

From Silvio Fiorito <silvio.fior...@granturing.com>
Subject Re: custom RDD in java
Date Wed, 01 Jul 2015 14:59:52 GMT
If all you’re doing is just dumping tables from SQLServer to HDFS, have you looked at Sqoop?

Otherwise, if you need to run this in Spark, could you just use the existing JdbcRDD?
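
A minimal sketch of what that could look like from Java, assuming Spark 1.3+ where JdbcRDD.create is exposed to the Java API. The connection string, query, bounds and output path below are placeholders; the query must contain exactly two '?' parameters, which JdbcRDD fills in with each partition's bounds:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.rdd.JdbcRDD;

JavaSparkContext sc = new JavaSparkContext(new SparkConf().setAppName("jdbc-dump"));

JavaRDD<String> rows = JdbcRDD.create(
    sc,
    new JdbcRDD.ConnectionFactory() {
      public Connection getConnection() throws Exception {
        // placeholder SQL Server connection string
        return DriverManager.getConnection(
            "jdbc:sqlserver://host:1433;databaseName=dbname;user=user;password=pass");
      }
    },
    // the two '?' become the lower/upper bound of each partition's slice
    "SELECT * FROM tablename WHERE id >= ? AND id <= ?",
    1L, 1000000L,   // assumed range of the numeric partitioning column
    10,             // number of partitions
    new Function<ResultSet, String>() {
      public String call(ResultSet rs) throws Exception {
        return rs.getString(1);   // serialize each row however you need
      }
    });

rows.saveAsTextFile("hdfs:///output/path/tablename");

Each partition opens its own connection and streams only its slice of the result set, so the full table never has to fit in memory at once.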


From: Shushant Arora
Date: Wednesday, July 1, 2015 at 10:19 AM
To: user
Subject: custom RDD in java

Hi

Is it possible to write custom RDD in java?

The requirement is: I have a list of SQL Server tables that need to be dumped into HDFS.

So I have a
List<String> tables = Arrays.asList("dbname.tablename", "dbname.tablename2", ...);

then
JavaRDD<String> rdd = javaSparkContext.parallelize(tables);

JavaRDD<String> tableContent = rdd.flatMap(new FlatMapFunction<String, String>() {
    public Iterable<String> call(String table) {
        // fetch the table and return its rows as an Iterable
    }
});

tableContent.saveAsTextFile("hdfs path");


Inside that flatMap function I cannot keep the complete table content in memory,
so I want to create my own RDD to handle it.

Thanks
Shushant
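
If you do go the route of mapping over table names yourself, one way to address the memory concern, under the same JDBC assumptions as the sketch above and with a hypothetical TableRows helper and placeholder connection string, is to have flatMap return an Iterable backed by a live JDBC cursor, so rows are streamed out of the ResultSet rather than buffered. A real version would also need to close the statement and connection once the iterator is exhausted:

import java.io.Serializable;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.Iterator;

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.function.FlatMapFunction;

// hypothetical helper: iterates one table's rows straight off a JDBC cursor
class TableRows implements Iterable<String>, Serializable {
  private final String table;
  TableRows(String table) { this.table = table; }

  public Iterator<String> iterator() {
    try {
      Connection conn = DriverManager.getConnection(
          "jdbc:sqlserver://host:1433;databaseName=dbname;user=user;password=pass");
      Statement stmt = conn.createStatement();
      final ResultSet rs = stmt.executeQuery("SELECT * FROM " + table);
      return new Iterator<String>() {
        private boolean more = rs.next();
        public boolean hasNext() { return more; }
        public String next() {
          try {
            String row = rs.getString(1);   // serialize the full row as needed
            more = rs.next();
            return row;
          } catch (Exception e) { throw new RuntimeException(e); }
        }
        public void remove() { throw new UnsupportedOperationException(); }
      };
    } catch (Exception e) {
      throw new RuntimeException(e);
    }
  }
}

// rdd is the parallelized list of table names from the message above;
// with the Spark 1.x Java API, FlatMapFunction.call returns an Iterable
JavaRDD<String> tableContent = rdd.flatMap(new FlatMapFunction<String, String>() {
  public Iterable<String> call(String table) {
    return new TableRows(table);
  }
});
tableContent.saveAsTextFile("hdfs:///output/path");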





