spark-user mailing list archives

From Mohammed Guller <moham...@glassbeam.com>
Subject RE: using a database connection pool to write data into an RDBMS from a Spark application
Date Fri, 20 Feb 2015 17:20:13 GMT
Sean,
I know that Class.forName has not been required since Java 1.4 :-) It was just a desperate attempt
to make sure that the Postgres driver was getting loaded. Since Class.forName("org.postgresql.Driver")
does not throw an exception, I assume that the driver is available on the classpath. Is that
not true?

I did some more troubleshooting and here is what I found:
1) The Hive libraries used by Spark use BoneCP 0.7.1.
2) When the Spark master is started, it initializes BoneCP, which does not load any database driver
at that point (that makes sense).
3) When my application initializes BoneCP, BoneCP thinks it is already initialized and does not
load the Postgres driver (this is a known bug in 0.7.1). This bug is fixed in the BoneCP 0.8.0
release.

So I linked my app with the BoneCP 0.8.0 release, but when I run my app using spark-submit, Spark
continues to use BoneCP 0.7.1. How do I override that behavior? How do I make the spark-submit
script unload BoneCP 0.7.1 and load BoneCP 0.8.0? I tried the --jars and --driver-class-path
flags, but they didn't help.
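
One way to confirm which copy of BoneCP wins at runtime is to print the jar each class was loaded from. The following is only a minimal sketch: WhichJar and locationOf are names made up for illustration, and com.jolbox.bonecp.BoneCP is assumed to be the pool's main class. The check is only meaningful when called from inside the application launched by spark-submit, since that is where Spark's classpath applies.

```java
import java.security.CodeSource;

final class WhichJar {
    /** Returns where the named class was loaded from, or a note if it cannot be found. */
    static String locationOf(String className) {
        try {
            Class<?> cls = Class.forName(className);
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            // Classes loaded by the bootstrap loader (JDK classes) report no code source.
            return src == null
                    ? "(no code source, probably a JDK class)"
                    : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        // Default class names are assumptions; pass others as arguments.
        String[] names = args.length > 0
                ? args
                : new String[] {"com.jolbox.bonecp.BoneCP", "org.postgresql.Driver"};
        for (String name : names) {
            System.out.println(name + " -> " + locationOf(name));
        }
    }
}
```

If the location printed for BoneCP is a jar shipped with Spark rather than the application jar, the copy bundled with Spark is the one in use.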

Thanks,
Mohammed


-----Original Message-----
From: Sean Owen [mailto:sowen@cloudera.com] 
Sent: Friday, February 20, 2015 2:06 AM
To: Mohammed Guller
Cc: Kelvin Chu; user@spark.apache.org
Subject: Re: using a database connection pool to write data into an RDBMS from a Spark application

Although I don't know if it's related, the Class.forName() method of loading drivers is very
old. You should be using DataSource and javax.sql; this has been the usual practice since
about Java 1.4.

Why do you say a different driver is being loaded? That's not the error here.

Try instantiating the driver directly to test whether it's available in the classpath. Otherwise
you would have to check whether the jar exists, the class exists in it, and it's really on
your classpath.
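
A rough sketch of that check, under stated assumptions: DriverCheck and its methods are hypothetical names, the driver class is instantiated reflectively so the sketch compiles without the Postgres jar, and no connection is opened. It compares what DriverManager can see with what the driver class itself reports for the URL.

```java
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

final class DriverCheck {
    /** Class names of the JDBC drivers that DriverManager offers to this caller. */
    static List<String> visibleDrivers() {
        List<String> names = new ArrayList<>();
        Enumeration<Driver> drivers = DriverManager.getDrivers();
        while (drivers.hasMoreElements()) {
            names.add(drivers.nextElement().getClass().getName());
        }
        return names;
    }

    /** True if DriverManager can locate a driver for the URL. */
    static boolean managerHasDriverFor(String url) {
        try {
            DriverManager.getDriver(url);
            return true;
        } catch (SQLException e) {
            // "No suitable driver" is reported as an SQLException.
            return false;
        }
    }

    /** Instantiates the driver class directly and asks it about the URL, bypassing DriverManager. */
    static String directCheck(String driverClass, String url) {
        try {
            Driver d = (Driver) Class.forName(driverClass).getDeclaredConstructor().newInstance();
            return d.acceptsURL(url) ? "driver accepts URL" : "driver rejects URL";
        } catch (ClassNotFoundException e) {
            return "class not found";
        } catch (ReflectiveOperationException | SQLException | ClassCastException e) {
            return "failed: " + e;
        }
    }

    public static void main(String[] args) {
        String url = "jdbc:postgresql://hostname:5432/dbname"; // placeholder URL from the error message
        System.out.println("Visible drivers: " + visibleDrivers());
        System.out.println("DriverManager finds one: " + managerHasDriverFor(url));
        System.out.println("Direct check: " + directCheck("org.postgresql.Driver", url));
    }
}
```

If the direct check succeeds while DriverManager finds nothing, the class is on the classpath but is not registered with, or not visible to, DriverManager for the calling code.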

On Fri, Feb 20, 2015 at 5:27 AM, Mohammed Guller <mohammed@glassbeam.com> wrote:
> Hi Kelvin,
>
>
>
> Yes. I am creating an uber jar with the Postgres driver included, but 
> nevertheless tried both the --jars and --driver-class-path flags. It didn’t help.
>
>
>
> Interestingly, I can’t use BoneCP even in the driver program when I 
> run my application with spark-submit. I am getting the same exception 
> when the application initializes BoneCP before creating SparkContext. 
> It looks like Spark is loading a different version of the Postgres 
> JDBC driver than the one that I am linking.
>
>
>
> Mohammed
>
>
>
> From: Kelvin Chu [mailto:2dot7kelvin@gmail.com]
> Sent: Thursday, February 19, 2015 7:56 PM
> To: Mohammed Guller
> Cc: user@spark.apache.org
> Subject: Re: using a database connection pool to write data into an 
> RDBMS from a Spark application
>
>
>
> Hi Mohammed,
>
>
>
> Did you use --jars to specify your jdbc driver when you submitted your job?
> Take a look at this link:
> http://spark.apache.org/docs/1.2.0/submitting-applications.html
>
>
>
> Hope this helps!
>
>
>
> Kelvin
>
>
>
> On Thu, Feb 19, 2015 at 7:24 PM, Mohammed Guller 
> <mohammed@glassbeam.com>
> wrote:
>
> Hi –
>
> I am trying to use BoneCP (a database connection pooling library) to 
> write data from my Spark application to an RDBMS. The database inserts 
> are inside a foreachPartition code block. I am getting this exception 
> when the code tries to insert data using BoneCP:
>
>
>
> java.sql.SQLException: No suitable driver found for 
> jdbc:postgresql://hostname:5432/dbname
>
>
>
> I tried explicitly loading the Postgres driver on the worker nodes by 
> adding the following line inside the foreachPartition code block:
>
>
>
> Class.forName("org.postgresql.Driver")
>
>
>
> It didn’t help.
>
>
>
> Has anybody been able to get a database connection pool library to work 
> with Spark? If you got it working, can you please share the steps?
>
>
>
> Thanks,
>
> Mohammed
>
>
>
>