spark-dev mailing list archives

From Bryan Herger <bryan.her...@microfocus.com>
Subject RE: I would like to add JDBCDialect to support Vertica database
Date Wed, 11 Dec 2019 15:28:23 GMT
Hi, to answer both questions raised:

Although Vertica is derived from Postgres, it does not recognize the type names TEXT, NVARCHAR,
BYTEA, or ARRAY, and it handles DATETIME differently enough to cause issues.  The major changes
in the dialect are to use the type names and date format that Vertica supports.
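
For illustration only, the kind of type-name override described above might be sketched as
follows. This is not the code from the changeset (a real dialect is a Scala class inside
Spark); the function name, the dictionary, and the VARCHAR/VARBINARY lengths are assumptions
made for the example. Only the list of unrecognized type names comes from this message.

```python
# Hypothetical sketch: choose a Vertica column type for a Spark SQL type.
# The override targets and their lengths are illustrative assumptions,
# not the mappings used in the actual changeset.
VERTICA_TYPE_OVERRIDES = {
    "StringType": "VARCHAR(65000)",
    "BinaryType": "VARBINARY(65000)",
}

# Type names this thread reports Vertica as not recognizing.
UNSUPPORTED_IN_VERTICA = {"TEXT", "NVARCHAR", "BYTEA", "ARRAY"}


def vertica_column_type(spark_type, generic_type):
    """Return the override for spark_type if one exists, else generic_type.

    Raises ValueError if the chosen type name is one Vertica rejects.
    """
    chosen = VERTICA_TYPE_OVERRIDES.get(spark_type, generic_type)
    base_name = chosen.split("(")[0].strip().upper()
    if base_name in UNSUPPORTED_IN_VERTICA:
        raise ValueError(f"{chosen!r} is not a Vertica type name")
    return chosen


print(vertica_column_type("StringType", "TEXT"))      # VARCHAR(65000)
print(vertica_column_type("IntegerType", "INTEGER"))  # INTEGER
```

A type with no override whose generic name is on the unsupported list (ARRAY, for example)
raises an error here instead of producing DDL that the database would reject.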

For testing, I have a SQL script plus Scala and PySpark scripts, but these need a live Vertica
database to connect to, so automated testing on a build server wouldn’t work.  I could
include my test scripts with directions for running them manually, but I’m not sure where in
the repo they would go.  If automated testing is required, I can ask our engineers whether
something like a Mockito-style mock of the database exists that could be included.
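
One possible way to cover part of this without a database is to check generated DDL text
for the type names listed above. The sketch below is an assumption about how such a check
could look, not an existing Spark or Vertica test; both helper functions are hypothetical.

```python
import re

# Type names this thread reports Vertica as not recognizing.
UNSUPPORTED = ("TEXT", "NVARCHAR", "BYTEA", "ARRAY")


def create_table_ddl(table, columns):
    """Build a CREATE TABLE statement from (column_name, sql_type) pairs."""
    cols = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return f"CREATE TABLE {table} ({cols})"


def unsupported_types_in(ddl):
    """Return the unsupported type names that appear as whole words in ddl."""
    return [t for t in UNSUPPORTED if re.search(rf"\b{t}\b", ddl, re.IGNORECASE)]


good = create_table_ddl("t", [("id", "INTEGER"), ("name", "VARCHAR(65000)")])
bad = create_table_ddl("t", [("id", "INTEGER"), ("name", "TEXT")])

assert unsupported_types_in(good) == []
assert unsupported_types_in(bad) == ["TEXT"]
```

A check like this only covers the type names in the DDL (and would also flag a column that
happens to be named "text"); NULL handling and date formats would still need a run against
a real database.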

Thanks, Bryan H

From: Xiao Li [mailto:lixiao@databricks.com]
Sent: Wednesday, December 11, 2019 10:13 AM
To: Sean Owen <srowen@gmail.com>
Cc: Bryan Herger <bryan.herger@microfocus.com>; dev@spark.apache.org
Subject: Re: I would like to add JDBCDialect to support Vertica database

How can the dev community test it?

Xiao

On Wed, Dec 11, 2019 at 6:52 AM Sean Owen <srowen@gmail.com> wrote:
It's probably OK, IMHO. The overhead of another dialect is small. Are
there differences that require a new dialect? I assume so and might
just be useful to summarize them if you open a PR.

On Tue, Dec 10, 2019 at 7:14 AM Bryan Herger <bryan.herger@microfocus.com> wrote:
>
> Hi, I am a Vertica support engineer, and we have open support requests around NULL values
> and SQL type conversion with DataFrame read/write over JDBC when connecting to a Vertica database.
> The stack traces point to issues with the generic JDBCDialect in Spark-SQL.
>
> I saw that other vendors (Teradata, DB2...) have contributed a JDBCDialect class to address
> JDBC compatibility, so I wrote up a dialect for Vertica.
>
> The changeset is on my fork of apache/spark at https://github.com/bryanherger/spark/commit/84d3014e4ead18146147cf299e8996c5c56b377d
>
> I have tested this against Vertica 9.3 and found that this changeset addresses both issues
> reported to us (issue with NULL values - setNull() - for valid java.sql.Types, and String
> to VARCHAR conversion).
>
> Is this an acceptable change?  If so, how should I go about submitting a pull request?
>
> Thanks, Bryan Herger
> Vertica Solution Engineer
>
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: dev-unsubscribe@spark.apache.org
>
