spark-user mailing list archives

From Thomas Nieborowski <orgon...@gmail.com>
Subject Re: Save an RDD to a SQL Database
Date Thu, 07 Aug 2014 15:14:46 GMT
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/rdd/JdbcRDD.scala



On Thu, Aug 7, 2014 at 8:08 AM, 诺铁 <notyycn@gmail.com> wrote:

> I haven't seen people write directly to a SQL database, mainly because
> it's difficult to deal with failure. What if the network breaks halfway
> through the process? Should we drop all the data in the database and
> restart from the beginning? If the process is appending data to the
> database, things become even more complex.
>
> But if this can be made to work, it would be a very good thing.
>
>
> On Wed, Aug 6, 2014 at 11:24 PM, Yana <yana.kadiyska@gmail.com> wrote:
>
>> Hi Vida,
>>
>> I am writing to a DB -- or trying to :).
>>
>> I believe the best practice for this (you can search the mailing list
>> archives) is to combine mapPartitions with a grouped iterator.
>> Look at this thread, esp. the comment from A. Boisvert and Matei's comment
>> above it:
>> https://groups.google.com/forum/#!topic/spark-users/LUb7ZysYp2k
>>
>> Basically, the short story is that you want to open as few connections as
>> possible but send more than one insert at a time.
>>
>>
>
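The per-partition, batched-insert advice quoted above could be sketched roughly as
follows. This is only an illustration, not code from the linked thread: it uses
Python's sqlite3 module so it runs without a cluster, and the `events` table, its
columns, and the helper names `grouped` and `write_partition` are all made up for
the example. Each batch is committed as its own transaction, so a failure rolls
back only the batch in flight; it does not by itself solve the restart problem
raised in the quoted message.

```python
import sqlite3
from itertools import islice


def grouped(rows, size):
    """Yield lists of up to `size` items from any iterable (hypothetical helper)."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def write_partition(rows, connect, batch_size=1000):
    """Write one partition using a single connection and one executemany per batch.

    `connect` is a zero-argument function returning a DB-API connection; the
    `events (id, payload)` table is an assumption made for this sketch.
    """
    conn = connect()
    written = 0
    try:
        for batch in grouped(rows, batch_size):
            # `with conn` commits the batch on success and rolls it back on error.
            with conn:
                conn.executemany(
                    "INSERT INTO events (id, payload) VALUES (?, ?)", batch
                )
            written += len(batch)
    finally:
        conn.close()
    return written
```

With PySpark, something like `rdd.foreachPartition(lambda rows: write_partition(rows, connect))`
would run this once per partition on the executors, provided `connect` opens the
connection there rather than on the driver. If a task is retried, batches that
were already committed would be inserted again unless the table's keys make the
insert idempotent.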


-- 
Thomas Nieborowski
510-207-7049 mobile
510-339-1716 home
