spark-user mailing list archives

From Takeshi Yamamuro <linguin....@gmail.com>
Subject Re: JDBC Very Slow
Date Sat, 17 Sep 2016 02:39:24 GMT
Hi,

It'd be better to pass `predicates` in the jdbc arguments so that the table is
loaded in parallel.
See:
https://github.com/apache/spark/blob/branch-1.6/sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala#L200
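
A minimal sketch of what that could look like in spark-shell on Spark 1.6, using the `jdbc(url, table, predicates, connectionProperties)` overload of `DataFrameReader`. The helper `modPredicates` and the column name `id` are hypothetical: the sketch assumes the table has a non-null, non-negative integer column by that name. The URL, user, and password are the placeholders from the original message.

```scala
import java.util.Properties

// Hypothetical helper: builds one WHERE-clause condition per partition.
// Assumes `column` is a non-null, non-negative integer column, so that the
// conditions together cover every row exactly once.
def modPredicates(column: String, numPartitions: Int): Array[String] =
  (0 until numPartitions).map(i => s"$column % $numPartitions = $i").toArray

// Placeholders as in the original message; replace with real values.
val url = "jdbc:postgresql://dbserver:port/database"
val props = new Properties()
props.setProperty("user", "user")
props.setProperty("password", "password")

// "id" is an assumed column name, not one taken from the thread.
val predicates = modPredicates("id", 4)

// Each predicate becomes one partition, read over its own JDBC connection.
// `sqlContext` is the instance that spark-shell provides.
val jdbcDF = sqlContext.read.jdbc(url, "schema.table", predicates, props)
jdbcDF.show()
```

Each entry in `predicates` is used as the condition for one partition, so the number of entries sets the number of concurrent queries against the database. Rows that match no predicate are silently left out, and rows that match more than one are read twice, so the conditions need to partition the table cleanly.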

// maropu

On Sat, Sep 17, 2016 at 7:46 AM, Benjamin Kim <bbuild11@gmail.com> wrote:

> I am testing this in spark-shell. I am following the Spark documentation
> by simply adding the PostgreSQL driver to the Spark Classpath.
>
> SPARK_CLASSPATH=/path/to/postgresql/driver spark-shell
>
>
> Then, I run the code below to connect to the PostgreSQL database to query.
> This is when I have problems.
>
> Thanks,
> Ben
>
>
> On Sep 16, 2016, at 3:29 PM, Nikolay Zhebet <phpapple@gmail.com> wrote:
>
> Hi! Can you split the init code from the current command? I think it is the
> main problem in your code.
> On Sep 16, 2016, at 8:26 PM, "Benjamin Kim" <bbuild11@gmail.com> wrote:
>
>> Has anyone using Spark 1.6.2 encountered very slow responses from pulling
>> data from PostgreSQL using JDBC? I can get to the table and see the schema,
>> but when I do a show, it takes very long or keeps timing out.
>>
>> The code is simple.
>>
>> val jdbcDF = sqlContext.read.format("jdbc").options(
>>     Map("url" -> "jdbc:postgresql://dbserver:port/database?user=user&password=password",
>>         "dbtable" -> "schema.table")).load()
>>
>> jdbcDF.show
>>
>>
>> If anyone can help, please let me know.
>>
>> Thanks,
>> Ben
>>
>>
>


-- 
---
Takeshi Yamamuro
