spark-dev mailing list archives

From Xiao Li <lix...@databricks.com>
Subject Re: I would like to add JDBCDialect to support Vertica database
Date Wed, 11 Dec 2019 15:41:01 GMT
You can follow how we test the other JDBC dialects. All JDBC dialects
require the docker integration tests.
https://github.com/apache/spark/tree/master/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc
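
For orientation, the suites in that directory follow a common shape: each extends DockerJDBCIntegrationSuite and describes the database image it runs against. A hypothetical Vertica suite might look roughly like the sketch below (modeled on the existing Postgres/MySQL suites; the Docker image name, env vars, and seed SQL are illustrative assumptions, not verified against a real Vertica image):

```scala
import java.sql.Connection
import org.apache.spark.sql.jdbc.{DatabaseOnDocker, DockerJDBCIntegrationSuite}

// Hypothetical sketch; image name, env vars, and seed data are assumptions.
class VerticaIntegrationSuite extends DockerJDBCIntegrationSuite {
  override val db = new DatabaseOnDocker {
    override val imageName = "vertica/vertica-ce:latest"  // assumed image
    override val env = Map("VERTICA_DB_NAME" -> "testdb") // assumed env var
    override val usesIpc = false
    override val jdbcPort = 5433                          // Vertica's default port
    override def getJdbcUrl(ip: String, port: Int): String =
      s"jdbc:vertica://$ip:$port/testdb"
  }

  // Seed whatever tables the suite's read/write round-trip tests expect.
  override def dataPreparation(conn: Connection): Unit = {
    conn.prepareStatement("CREATE TABLE tbl (x INTEGER, y VARCHAR(8))").executeUpdate()
    conn.prepareStatement("INSERT INTO tbl VALUES (42, 'fred')").executeUpdate()
  }
}
```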


On Wed, Dec 11, 2019 at 7:33 AM Bryan Herger <bryan.herger@microfocus.com>
wrote:

> Hi, to answer both questions raised:
>
>
>
> Though Vertica is derived from Postgres, Vertica does not recognize the type
> names TEXT, NVARCHAR, BYTEA, and ARRAY, and it also handles DATETIME differently
> enough to cause issues.  The major changes are to use the type names and date
> formats that Vertica supports.
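
As a rough illustration of the kind of name substitutions involved (a real dialect would extend org.apache.spark.sql.jdbc.JdbcDialect and override getJDBCType/getCatalystType rather than use a standalone object; the target type names below are assumptions for illustration, not taken from the actual changeset):

```scala
// Illustrative only: shows the shape of the type-name overrides a Vertica
// dialect needs. The right-hand sides are assumed targets, not verified
// against the proposed patch.
object VerticaTypeNames {
  private val overrides: Map[String, String] = Map(
    "TEXT"     -> "VARCHAR(65000)", // Vertica has no TEXT type
    "NVARCHAR" -> "VARCHAR",        // Vertica VARCHARs already hold UTF-8
    "BYTEA"    -> "VARBINARY"       // Postgres name that Vertica rejects
  )

  def toVerticaName(generic: String): String =
    overrides.getOrElse(generic.toUpperCase, generic)
}
```

In a real dialect these mappings would be returned from getJDBCType, and the dialect would be hooked in at runtime via JdbcDialects.registerDialect.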
>
>
>
> For testing, I have a SQL script plus Scala and PySpark scripts, but these
> require a live Vertica database to connect to, so automated testing on a build
> server wouldn’t work.  It’s possible to include my test scripts and
> directions for running them manually, but I’m not sure where in the repo they
> would go.  If automated testing is required, I can ask our engineers whether
> something like a Mockito-style mock exists that could be included.
>
>
>
> Thanks, Bryan H
>
>
>
> *From:* Xiao Li [mailto:lixiao@databricks.com]
> *Sent:* Wednesday, December 11, 2019 10:13 AM
> *To:* Sean Owen <srowen@gmail.com>
> *Cc:* Bryan Herger <bryan.herger@microfocus.com>; dev@spark.apache.org
> *Subject:* Re: I would like to add JDBCDialect to support Vertica database
>
>
>
> How can the dev community test it?
>
>
>
> Xiao
>
>
>
> On Wed, Dec 11, 2019 at 6:52 AM Sean Owen <srowen@gmail.com> wrote:
>
> It's probably OK, IMHO. The overhead of another dialect is small. Are
> there differences that require a new dialect? I assume so, and it might
> just be useful to summarize them if you open a PR.
>
> On Tue, Dec 10, 2019 at 7:14 AM Bryan Herger
> <bryan.herger@microfocus.com> wrote:
> >
> > Hi, I am a Vertica support engineer, and we have open support requests
> around NULL values and SQL type conversion with DataFrame read/write over
> JDBC when connecting to a Vertica database.  The stack traces point to
> issues with the generic JDBCDialect in Spark-SQL.
> >
> > I saw that other vendors (Teradata, DB2...) have contributed a
> JDBCDialect class to address JDBC compatibility, so I wrote up a dialect
> for Vertica.
> >
> > The changeset is on my fork of apache/spark at
> https://github.com/bryanherger/spark/commit/84d3014e4ead18146147cf299e8996c5c56b377d
> >
> > I have tested this against Vertica 9.3 and found that this changeset
> addresses both issues reported to us (the issue with NULL values - setNull() -
> for valid java.sql.Types, and String-to-VARCHAR conversion).
> >
> > Is this an acceptable change?  If so, how should I go about submitting a
> pull request?
> >
> > Thanks, Bryan Herger
> > Vertica Solution Engineer
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe e-mail: dev-unsubscribe@spark.apache.org
> >
>
>


