spark-user mailing list archives

From Silvio Fiorito
Subject Re: custom RDD in java
Date Wed, 01 Jul 2015 14:59:52 GMT
If all you’re doing is dumping tables from SQL Server to HDFS, have you looked at Sqoop?

Otherwise, if you need to run this in Spark, could you just use the existing JdbcRDD?
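
For context, JdbcRDD takes a query with two `?` placeholders plus a lower bound, an upper bound, and a partition count, so each partition reads only its own slice of a numeric key range rather than the whole table. The plain-Java sketch below illustrates that kind of range splitting without any Spark dependency. The class name `KeyRangeSplitter`, the `id` column, and the exact arithmetic are assumptions for illustration, not Spark's own code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Spark: splits an inclusive key range
// [lowerBound, upperBound] into numPartitions contiguous slices.
final class KeyRangeSplitter {

    // Returns one {start, end} pair (both inclusive) per partition.
    // A slice can be empty (start > end) when numPartitions exceeds the range length.
    static List<long[]> split(long lowerBound, long upperBound, int numPartitions) {
        if (numPartitions <= 0 || lowerBound > upperBound) {
            throw new IllegalArgumentException(
                "need numPartitions > 0 and lowerBound <= upperBound");
        }
        // Assumes the range length and i * length both fit in a long.
        long length = upperBound - lowerBound + 1;
        List<long[]> ranges = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            long start = lowerBound + (i * length) / numPartitions;
            long end = lowerBound + ((i + 1) * length) / numPartitions - 1;
            ranges.add(new long[] {start, end});
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Each slice would be bound to the two '?' placeholders of a query such as
        // "SELECT * FROM dbname.tablename WHERE id >= ? AND id <= ?"
        // (the id column is an assumption; any numeric key would do).
        for (long[] r : split(1, 100, 3)) {
            System.out.println(r[0] + " .. " + r[1]);
        }
    }
}
```

With one such slice per partition, no single task has to hold a full table, provided the table has a reasonably evenly distributed numeric key.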

From: Shushant Arora
Date: Wednesday, July 1, 2015 at 10:19 AM
To: user
Subject: custom RDD in java


Is it possible to write custom RDD in java?

The requirement: I have a list of SQL Server tables that need to be dumped to HDFS.

So I have:
List<String> tables = {dbname.tablename, dbname.tablename2......};

JavaRDD<String> rdd = javasparkcontext.parallelize(tables);

JavaRDD<String> tablecontent = rdd.flatMap(Function<String, Iterable<String>>) {fetch
table and return populated iterable}

tablecontent.saveAsTextFile("hdfs path");

Inside the Function<String, Iterable<String>> I cannot keep the complete table content in memory,
so I want to create my own RDD to handle it.
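
One way to avoid holding a whole table in memory, short of a custom RDD, is to return an Iterable that fetches rows on demand instead of filling a list first. The plain-Java sketch below shows only that idea. `LazyRows` and `RowCursor` are hypothetical names, with `RowCursor` standing in for a JDBC ResultSet. Whether the rows are actually streamed end to end depends on Spark consuming the returned Iterable lazily, which is assumed here rather than shown.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Supplier;

// Hypothetical sketch: an Iterable that produces rows on demand instead of
// collecting them into a list first.
final class LazyRows implements Iterable<String> {

    // Minimal stand-in for a database cursor; returns null when exhausted.
    interface RowCursor {
        String nextRow();
    }

    private final Supplier<RowCursor> openCursor;

    LazyRows(Supplier<RowCursor> openCursor) {
        this.openCursor = openCursor;
    }

    @Override
    public Iterator<String> iterator() {
        // The cursor is opened only when iteration starts.
        final RowCursor cursor = openCursor.get();
        return new Iterator<String>() {
            // Holds exactly one row of lookahead.
            private String next = cursor.nextRow();

            @Override
            public boolean hasNext() {
                return next != null;
            }

            @Override
            public String next() {
                if (next == null) {
                    throw new NoSuchElementException();
                }
                String row = next;
                next = cursor.nextRow(); // fetch one row ahead, never the whole table
                return row;
            }
        };
    }
}
```

In the setup described above, an object like this would be what the function passed to flatMap returns for each table name, with the cursor backed by a real query and closed once it is exhausted.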

