spark-user mailing list archives

From Shushant Arora <shushantaror...@gmail.com>
Subject Re: custom RDD in java
Date Wed, 01 Jul 2015 17:44:52 GMT
Ok, I will evaluate these options, but is it possible to create a custom RDD in Java?
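
A custom RDD can be written in Java by extending org.apache.spark.rdd.RDD, but the
Scala-facing pieces (ClassTag evidence, scala.collection.Iterator, getPartitions) make
it clumsy. For the use case in the quoted message below, a simpler pattern is usually
enough: parallelize the table names and stream each table's rows lazily out of
flatMap, so no table ever has to fit in memory. The following is only a rough sketch
against the Spark 1.x Java API; the JDBC URL, credentials, output path and the
single-column row formatting are made-up placeholders.

// Rough sketch only, not code from this thread: rows are streamed lazily out of
// flatMap, so a table never has to fit in executor memory.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.FlatMapFunction;

public class DumpTables {
  public static void main(String[] args) {
    JavaSparkContext jsc = new JavaSparkContext(new SparkConf().setAppName("dump-tables"));

    List<String> tables = Arrays.asList("dbname.tablename", "dbname.tablename2");
    // one partition per table so each table is dumped by one task
    JavaRDD<String> tableNames = jsc.parallelize(tables, tables.size());

    JavaRDD<String> rows = tableNames.flatMap(new FlatMapFunction<String, String>() {
      @Override
      public Iterable<String> call(final String table) throws Exception {
        // placeholder connection details
        final Connection conn = DriverManager.getConnection(
            "jdbc:sqlserver://host;databaseName=dbname", "user", "pass");
        final Statement stmt = conn.createStatement();
        final ResultSet rs = stmt.executeQuery("SELECT * FROM " + table);
        // Wrap the ResultSet in a lazy Iterable: each row is produced on demand,
        // so the full table is never materialised in memory.
        return new Iterable<String>() {
          public Iterator<String> iterator() {
            return new Iterator<String>() {
              private Boolean advanced = null;
              public boolean hasNext() {
                try {
                  if (advanced == null) {
                    advanced = rs.next();
                    if (!advanced) { rs.close(); stmt.close(); conn.close(); }
                  }
                  return advanced;
                } catch (SQLException e) { throw new RuntimeException(e); }
              }
              public String next() {
                if (!hasNext()) throw new NoSuchElementException();
                advanced = null;                  // force hasNext() to advance again
                try { return rs.getString(1); }   // format the whole row as needed
                catch (SQLException e) { throw new RuntimeException(e); }
              }
              public void remove() { throw new UnsupportedOperationException(); }
            };
          }
        };
      }
    });

    rows.saveAsTextFile("hdfs:///output/path");
    jsc.stop();
  }
}

Because rows are pulled from the ResultSet on demand, memory use stays bounded
regardless of table size; depending on the JDBC driver, Statement.setFetchSize may
also be needed to stop the driver itself from buffering the whole result set.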


On Wed, Jul 1, 2015 at 8:29 PM, Silvio Fiorito <silvio.fiorito@granturing.com> wrote:

>  If all you’re doing is just dumping tables from SQLServer to HDFS, have
> you looked at Sqoop?
>
>  Otherwise, if you need to run this in Spark could you just use the
> existing JdbcRDD?
>
>
>   From: Shushant Arora
> Date: Wednesday, July 1, 2015 at 10:19 AM
> To: user
> Subject: custom RDD in java
>
>   Hi
>
>  Is it possible to write a custom RDD in Java?
>
>  The requirement is: I have a list of SQL Server tables that need to be
> dumped into HDFS.
>
>  So I have a
> List<String> tables = Arrays.asList("dbname.tablename", "dbname.tablename2", ...);
>
>  then
> JavaRDD<String> rdd = javaSparkContext.parallelize(tables);
>
>  JavaRDD<String> tableContent = rdd.flatMap(new
> FlatMapFunction<String, String>() { /* fetch the table and return an Iterable over its rows */ });
>
>  tableContent.saveAsTextFile("hdfs path");
>
>
>  Inside the rdd.flatMap(...) function I cannot keep the complete table content
> in memory, so I want to create my own RDD to handle it.
>
>  Thanks
> Shushant
>
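
On the JdbcRDD suggestion above: the Java-friendly entry point is the static
JdbcRDD.create helper (Spark 1.3+), which avoids dealing with Scala ClassTags. Another
rough sketch; the connection string, the numeric id column used to range-partition the
query, the bounds and the output path are all placeholders. The SQL must contain the
two '?' markers, which JdbcRDD fills in with the bounds of each partition.

// Rough sketch of the JdbcRDD route; placeholder connection string, key column and bounds.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.rdd.JdbcRDD;

public class DumpWithJdbcRdd {
  public static void main(String[] args) {
    JavaSparkContext jsc = new JavaSparkContext(new SparkConf().setAppName("jdbcrdd-dump"));

    JavaRDD<String> rows = JdbcRDD.create(
        jsc,
        new JdbcRDD.ConnectionFactory() {
          @Override
          public Connection getConnection() throws Exception {
            return DriverManager.getConnection(
                "jdbc:sqlserver://host;databaseName=dbname", "user", "pass");
          }
        },
        "SELECT * FROM tablename WHERE id >= ? AND id <= ?",  // '?' filled with partition bounds
        1L,        // lowerBound of the id range
        1000000L,  // upperBound of the id range
        10,        // numPartitions: the id range is split into 10 parallel queries
        new Function<ResultSet, String>() {
          @Override
          public String call(ResultSet rs) throws Exception {
            return rs.getString(1);  // format each row however you need
          }
        });

    rows.saveAsTextFile("hdfs:///output/path");
    jsc.stop();
  }
}

One JdbcRDD instance covers a single parameterised query, so dumping a list of tables
would still mean building one per table and either saving each to its own directory or
unioning them before the save.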
