spark-user mailing list archives

From Yana Kadiyska <>
Subject Re: DataFrame insertIntoJDBC parallelism while writing data into a DB table
Date Wed, 17 Jun 2015 01:20:45 GMT
When all else fails look at the source ;)

Looks like createJDBCTable is deprecated, but otherwise goes to the same
implementation as insertIntoJDBC...

You can also look at DataFrameWriter in the same package. It looks like all
of that code eventually writes via JDBCWriteDetails. If I'm reading this
correctly, you'll have simultaneous writes from each partition, but they
don't appear to be otherwise batched (if you were thinking of bulk inserts).
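
To make that write pattern concrete, here is a minimal sketch that simulates it with Python's standard library only. It is not Spark's code: the `people` table, the `write_partition` and `write_all` helpers, and the use of SQLite with threads in place of a JDBC database and Spark executors are all assumptions made for illustration. Committing once per partition is also an assumption of the sketch. The point it shows is one connection per partition, partitions written concurrently, and one INSERT statement per row rather than a bulk insert.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor


def write_partition(db_path, rows):
    """Write one partition over its own connection, one INSERT per row."""
    # timeout: wait for the lock if another partition is writing right now.
    conn = sqlite3.connect(db_path, timeout=30)
    try:
        for row in rows:
            # One statement per row; no executemany / bulk insert here.
            conn.execute("INSERT INTO people (id, name) VALUES (?, ?)", row)
        conn.commit()  # assumption of this sketch: one commit per partition
    finally:
        conn.close()
    return len(rows)


def write_all(db_path, partitions):
    """Run one writer task per partition, concurrently."""
    workers = max(1, len(partitions))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        counts = list(pool.map(lambda part: write_partition(db_path, part), partitions))
    return sum(counts)


def demo():
    # Three hypothetical partitions of (id, name) rows.
    partitions = [
        [(1, "a"), (2, "b")],
        [(3, "c")],
        [(4, "d"), (5, "e"), (6, "f")],
    ]
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "demo.db")
        setup = sqlite3.connect(db_path)
        setup.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
        setup.commit()
        setup.close()

        written = write_all(db_path, partitions)

        check = sqlite3.connect(db_path)
        total = check.execute("SELECT COUNT(*) FROM people").fetchone()[0]
        check.close()
    return written, total
```

In this sketch the number of concurrent writers equals the number of partitions, so changing how the data is partitioned is what changes the write parallelism.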

On Mon, Jun 15, 2015 at 1:20 PM, Mohammad Tariq <> wrote:

> Hello list,
> The method *insertIntoJDBC(url: String, table: String, overwrite:
> Boolean)* provided by Spark DataFrame allows us to copy a DataFrame into
> a JDBC DB table. Similar functionality is provided by the
> *createJDBCTable(url: String, table: String, allowExisting: Boolean)*
> method. But the docs say that *createJDBCTable* runs a *bunch of Insert
> statements* in order to copy the data, while the docs for *insertIntoJDBC*
> make no such statement.
> Could someone please shed some light on this? How exactly does data get
> inserted using the *insertIntoJDBC* method?
> And if it works the same way as the *createJDBCTable* method, then what
> exactly does *bunch of Insert statements* mean? What are the criteria for
> deciding the number of *inserts/bunch*? How are these bunches generated?
> *An example* could be creating a DataFrame by reading all the files
> stored in a given directory. If I just do **, it'll
> create the same number of output files as the input files. What'll happen
> in case of *DataFrame.df.insertIntoJDBC()*?
> I'm really sorry to pester you with questions, but I could not get much
> help by Googling this.
> Thank you so much for your valuable time. I really appreciate it.
> Tariq, Mohammad
