spark-user mailing list archives

From Mich Talebzadeh
Subject Re: Sqoop vs spark jdbc
Date Wed, 24 Aug 2016 22:07:20 GMT
Personally I prefer Spark JDBC.

Both Sqoop and Spark rely on the same JDBC drivers.

I think Spark is faster, and if you have many nodes you can partition your
incoming data and take advantage of Spark's DAG execution and in-memory
processing.
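
As a rough sketch of what partitioning the incoming data can look like in
PySpark: the JDBC source accepts partitionColumn, lowerBound, upperBound and
numPartitions, and splits the read into that many parallel range queries.
The URL, table, column and bounds below are made-up placeholders, and the
helper names are hypothetical, so treat this as an illustration rather than
a tuned setup.

```python
def jdbc_read_options(url, table, column, lower, upper, num_partitions,
                      fetch_size=10000):
    """Build the option map for a partitioned Spark JDBC read.

    Hypothetical helper: only the option keys are Spark's own.
    """
    return {
        "url": url,
        "dbtable": table,
        # Numeric column Spark uses to split the table into ranges.
        "partitionColumn": column,
        # The bounds only decide the partition stride; rows outside
        # them are still read, they are not filtered out.
        "lowerBound": str(lower),
        "upperBound": str(upper),
        # Upper limit on parallel JDBC connections to the database.
        "numPartitions": str(num_partitions),
        # Rows fetched per round trip by the JDBC driver.
        "fetchsize": str(fetch_size),
    }


def read_partitioned(spark, options):
    """Load the table as a DataFrame using an existing SparkSession."""
    return spark.read.format("jdbc").options(**options).load()


# Placeholder connection details, assumed for illustration only.
opts = jdbc_read_options(
    url="jdbc:oracle:thin:@//dbhost:1521/mydb",
    table="sales",
    column="id",
    lower=1,
    upper=1000000,
    num_partitions=8,
)

# With a running SparkSession called `spark`, plus "user", "password"
# and the driver jar on the classpath:
#   df = read_partitioned(spark, opts)
```

Keep numPartitions modest, since every partition opens its own connection to
the database.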

By default Sqoop uses MapReduce, which is pretty slow.

Remember that with Spark you will need sufficient memory.


Dr Mich Talebzadeh

*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.

On 24 August 2016 at 22:39, Venkata Penikalapati wrote:

> Team,
> Please help me choose between Sqoop and Spark JDBC to fetch data from an
> RDBMS. Sqoop has a lot of optimizations for fetching data; does Spark JDBC
> have those too?
> I'm running some analytics in Spark on data that resides in an RDBMS.
> Please guide me on this.
> Thanks
> Venkata Karthik P
