spark-user mailing list archives

From Sonal Goyal <sonalgoy...@gmail.com>
Subject Re: Using a Database to persist and load data from
Date Fri, 31 Oct 2014 08:03:55 GMT
I think you can try using the Hadoop DBOutputFormat.

Best Regards,
Sonal
Nube Technologies <http://www.nubetech.co>

<http://in.linkedin.com/in/sonalgoyal>



On Fri, Oct 31, 2014 at 1:00 PM, Kamal Banga <kamal@sigmoidanalytics.com>
wrote:

> You can also use PairRDDFunctions' saveAsNewAPIHadoopFile, which takes an
> OutputFormat class.
> You will have to write a custom class that extends OutputFormat and
> implements getRecordWriter, which returns a custom RecordWriter.
> You will also have to write that custom RecordWriter, which extends
> RecordWriter; its write method is what actually writes to the DB.
>
> On Fri, Oct 31, 2014 at 11:25 AM, Yanbo Liang <yanbohappy@gmail.com>
> wrote:
>
>> AFAIK, you can read data from a DB with JdbcRDD, but there is no interface
>> for writing to a DB.
>> JdbcRDD has some restrictions, such as requiring the SQL to have a "where"
>> clause.
>> For writing to a DB, you can implement it with mapPartitions or
>> foreachPartition.
>> You can refer to this example:
>>
>> http://stackoverflow.com/questions/24916852/how-can-i-connect-to-a-postgresql-database-into-apache-spark-using-scala
>>
>> 2014-10-30 23:01 GMT+08:00 Asaf Lahav <asaf.lahav@gmail.com>:
>>
>>> Hi Ladies and Gents,
>>> I would like to know what options I have for making Spark code I have
>>> already written use a DB (Vertica) as its store/datasource.
>>> The data is tabular, so essentially any relational DB can be used.
>>>
>>> Do I need to develop a context? If yes, how? Where can I get a good
>>> example?
>>>
>>>
>>> Thank you,
>>> Asaf
>>>
>>
>>
>
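
To make the two ideas in the thread concrete, here is a minimal sketch of the per-partition write pattern and of a range-bounded read, which is roughly why JdbcRDD wants a "where" clause. It is not Spark code: sqlite3 stands in for the real database (Vertica would need its own driver), and the table name `results` and the helpers `write_partition`, `split_range` and `read_partition` are hypothetical names made up for illustration.

```python
import sqlite3


def write_partition(db_path, rows):
    """Write all rows of one partition over a single connection.

    `rows` is any iterable of (id, value) tuples, e.g. the iterator a
    per-partition callback receives.
    """
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.executemany(
                "INSERT INTO results (id, value) VALUES (?, ?)", rows
            )
    finally:
        conn.close()


def split_range(lower, upper, num_partitions):
    """Split the inclusive key range [lower, upper] into contiguous,
    non-overlapping inclusive sub-ranges, one per partition."""
    length = upper - lower + 1
    bounds = []
    for i in range(num_partitions):
        start = lower + (i * length) // num_partitions
        end = lower + ((i + 1) * length) // num_partitions - 1
        bounds.append((start, end))
    return bounds


def read_partition(db_path, start, end):
    """Load only the rows whose key falls in [start, end].

    The two placeholders in the WHERE clause are what let each
    partition fetch its own slice of the table.
    """
    conn = sqlite3.connect(db_path)
    try:
        cursor = conn.execute(
            "SELECT id, value FROM results "
            "WHERE id >= ? AND id <= ? ORDER BY id",
            (start, end),
        )
        return cursor.fetchall()
    finally:
        conn.close()
```

Assuming PySpark, a function like `write_partition` could be handed to `rdd.foreachPartition(lambda rows: write_partition(path, rows))`, so that one connection is opened per partition rather than per record. The table must already exist, and the database file must be reachable from wherever the partitions run.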
