spark-user mailing list archives

From 郭鹏飞 <guopengfei19...@126.com>
Subject Re: Hive From Spark: Jdbc VS sparkContext
Date Tue, 10 Oct 2017 10:23:19 GMT

> On Oct 4, 2017, at 2:08 AM, Nicolas Paris <niparisco@gmail.com> wrote:
> 
> Hi
> 
> I wonder what the differences are between accessing Hive tables in two different ways:
> - with jdbc access
> - with sparkContext
> 
> I would say that JDBC is better, since it goes through Hive, which is based on
> MapReduce / Tez and therefore works on disk.
> Using Spark RDDs can lead to memory errors on very large datasets.
> 
> 
> Does anybody know, or can anyone point me to relevant documentation?
> 
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscribe@spark.apache.org


JDBC access loads the data onto the driver node, which can slow things down and may cause an out-of-memory (OOM) error.
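
To make the contrast concrete, here is a rough PySpark sketch of the two access paths, not a tested recipe. The host name, table name, and partitioning column are made up for illustration, and whether the Hive JDBC driver behaves well with Spark's JDBC source depends on the versions in use. Without partitioning options, Spark's JDBC source reads everything through a single connection into a single partition, whereas a Hive-enabled session reads the table files in parallel on the executors.

```python
def jdbc_read_options(url, table, partition_column=None,
                      lower=None, upper=None, num_partitions=None):
    """Build the option map for Spark's JDBC data source.

    With only url/dbtable, the whole table is pulled through one
    connection. The four partitioning options (which must be given
    together) split the read into several range queries.
    """
    opts = {"url": url, "dbtable": table}
    if partition_column is not None:
        opts.update({
            "partitionColumn": partition_column,  # numeric, date or timestamp column
            "lowerBound": str(lower),
            "upperBound": str(upper),
            "numPartitions": str(num_partitions),
        })
    return opts


def read_via_jdbc(spark, opts):
    """Path 1: go through a JDBC endpoint (e.g. HiveServer2)."""
    return spark.read.format("jdbc").options(**opts).load()


def read_via_metastore(spark, table):
    """Path 2: let Spark read the Hive table's files directly."""
    return spark.table(table)


def demo():
    # Requires pyspark and a reachable Hive metastore / HiveServer2.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("hive-access-sketch")
             .enableHiveSupport()
             .getOrCreate())

    # Hypothetical endpoint and table, for illustration only.
    url = "jdbc:hive2://example-host:10000/default"
    single = read_via_jdbc(spark, jdbc_read_options(url, "events"))
    split = read_via_jdbc(
        spark, jdbc_read_options(url, "events", "id", 0, 1000000, 8))
    direct = read_via_metastore(spark, "default.events")

    for name, df in [("jdbc", single), ("jdbc split", split), ("metastore", direct)]:
        print(name, df.rdd.getNumPartitions())
```

Calling demo() on a cluster should show how many partitions each read produces. In either case, calling collect() on the result pulls every row back to the driver, so it is worth avoiding on large tables.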



