spark-dev mailing list archives

From Priyanka Gomatam <Priyanka.Goma...@microsoft.com.INVALID>
Subject RE: JDBC connector for DataSourceV2
Date Tue, 16 Jul 2019 00:16:09 GMT
I would have thought one of the most important goals would be pushing down limits since V2
supports it.
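
To make "pushing down limits" concrete, here is a minimal sketch. It does not use Spark or the DataSourceV2 API; it uses Python's built-in sqlite3 module as a stand-in database, and `read_rows` is a hypothetical helper invented for illustration. It only shows the difference between sending the limit to the database and trimming the rows on the client.

```python
import sqlite3


def read_rows(conn, table, limit=None, push_down=True):
    """Return (rows, sql_sent) for up to `limit` rows of `table`.

    Hypothetical helper for illustration only; not part of Spark.
    """
    sql = f'SELECT id, name FROM "{table}" ORDER BY id'
    if limit is not None and push_down:
        # Pushed down: the database applies the limit and returns few rows.
        pushed_sql = sql + " LIMIT ?"
        return conn.execute(pushed_sql, (limit,)).fetchall(), pushed_sql
    # Not pushed down: every row is transferred, then trimmed by the client.
    rows = conn.execute(sql).fetchall()
    return (rows if limit is None else rows[:limit]), sql


# Stand-in database with 1000 rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(i, f"user{i}") for i in range(1000)])

pushed, pushed_sql = read_rows(conn, "users", limit=5, push_down=True)
local, local_sql = read_rows(conn, "users", limit=5, push_down=False)
```

Both calls return the same five rows, but only the pushed-down query lets the database stop early instead of shipping all 1000 rows to the reader.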

I am also interested in collaborating. Thanks!

Priyanka Gomatam

From: Shiv Prashant Sood <shivprashant@gmail.com>
Sent: Monday, July 15, 2019 10:22 AM
To: Gabor Somogyi <gabor.g.somogyi@gmail.com>
Cc: Xianyin Xin <xianyin.xxy@alibaba-inc.com>; Ryan Blue <rblue@netflix.com>;
gengliang.wang@databricks.com; Spark Dev List <dev@spark.apache.org>
Subject: Re: JDBC connector for DataSourceV2

Agree. Let's use SPARK-24907 <https://issues.apache.org/jira/browse/SPARK-24907>
as the JIRA for this work. Thanks for resolving SPARK-28380 <https://issues.apache.org/jira/browse/SPARK-28380>
as dupe of this.

Regards,
Shiv

On Mon, Jul 15, 2019 at 1:50 AM Gabor Somogyi <gabor.g.somogyi@gmail.com> wrote:
I've had a look at the JIRAs and it seems like the intention is the same (correct me if I'm wrong).
I think one is enough and the rest can be closed as duplicates.
We should keep multiple jiras only when the intention is different.

BR,
G


On Mon, Jul 15, 2019 at 6:01 AM Xianyin Xin <xianyin.xxy@alibaba-inc.com> wrote:
There’s another PR, https://github.com/apache/spark/pull/21861,
but it is based on the old V2 APIs.

We’d better link the JIRAs, SPARK-24907 <https://issues.apache.org/jira/browse/SPARK-24907>,
SPARK-25547 <https://issues.apache.org/jira/browse/SPARK-25547>,
and SPARK-28380 <https://issues.apache.org/jira/browse/SPARK-28380>,
and finalize a plan.

Xianyin

From: Shiv Prashant Sood <shivprashant@gmail.com>
Date: Sunday, July 14, 2019 at 2:59 AM
To: Gabor Somogyi <gabor.g.somogyi@gmail.com>
Cc: Xianyin Xin <xianyin.xxy@alibaba-inc.com>,
Ryan Blue <rblue@netflix.com>, <gengliang.wang@databricks.com>,
Spark Dev List <dev@spark.apache.org>
Subject: Re: JDBC connector for DataSourceV2

To me this looks like a refactoring of the DS1 JDBC source to enable user-provided connection factories.
In itself a good change, but IMO not DSV2 related.

I created a JIRA and added some goals. Please comment or add to them as relevant.

https://issues.apache.org/jira/browse/SPARK-28380

JIRA for DataSourceV2 API based JDBC connector.

Goals:

  *   Generic connector based on JDBC that supports all databases (the minimum bar is support for
all V1 databases).
  *   Reference implementation and interface for any specialized JDBC connectors.
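
One way to picture these two goals together is a generic reader with a small set of hooks that a specialized connector overrides. The sketch below is only an illustration under stated assumptions: it is plain Python over the DB-API (sqlite3 as the stand-in database), not the DataSourceV2 API, and the names `Dialect`, `BracketTopDialect`, `GenericConnector`, and `demo_factory` are all hypothetical.

```python
import sqlite3


class Dialect:
    """Hypothetical hook points that a specialized connector may override."""

    def quote_identifier(self, name):
        # ANSI-style double quotes; embedded quotes are doubled.
        return '"' + name.replace('"', '""') + '"'

    def select_sql(self, table, columns, limit=None):
        cols = ", ".join(self.quote_identifier(c) for c in columns)
        sql = f"SELECT {cols} FROM {self.quote_identifier(table)}"
        return sql if limit is None else f"{sql} LIMIT {int(limit)}"


class BracketTopDialect(Dialect):
    """Example specialization for a database that uses [name] and TOP n."""

    def quote_identifier(self, name):
        return "[" + name.replace("]", "]]") + "]"

    def select_sql(self, table, columns, limit=None):
        cols = ", ".join(self.quote_identifier(c) for c in columns)
        top = "" if limit is None else f"TOP {int(limit)} "
        return f"SELECT {top}{cols} FROM {self.quote_identifier(table)}"


class GenericConnector:
    """Generic read path over any DB-API connection factory."""

    def __init__(self, connection_factory, dialect=None):
        self.connection_factory = connection_factory
        self.dialect = dialect or Dialect()

    def read(self, table, columns, limit=None):
        conn = self.connection_factory()
        try:
            cursor = conn.cursor()
            cursor.execute(self.dialect.select_sql(table, columns, limit))
            return cursor.fetchall()
        finally:
            conn.close()


def demo_factory():
    # Hypothetical factory: builds a small in-memory database per connection.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?)",
                     [(1, "a"), (2, "b"), (3, "c")])
    return conn


generic_sql = Dialect().select_sql("t", ["id", "name"], limit=2)
specialized_sql = BracketTopDialect().select_sql("t", ["id", "name"], limit=2)
rows = GenericConnector(demo_factory).read("t", ["id", "name"], limit=2)
```

The generic class carries the shared read logic, while a specialized connector only has to supply the parts of the SQL that differ for its database.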

Regards,
Shiv

On Sat, Jul 13, 2019 at 2:17 AM Gabor Somogyi <gabor.g.somogyi@gmail.com> wrote:
Hi Guys,

I don't know exactly what the intention is here, but there is such a PR: https://github.com/apache/spark/pull/22560
If that's what we need, maybe we can resurrect it. BTW, I'm also interested in...

BR,
G


On Sat, Jul 13, 2019 at 4:09 AM Shiv Prashant Sood <shivprashant@gmail.com> wrote:
Thanks all. I can also contribute toward this effort.

Regards,
Shiv
Sent from my iPhone

On Jul 12, 2019, at 6:51 PM, Xianyin Xin <xianyin.xxy@alibaba-inc.com> wrote:
If there’s nobody working on that, I’d like to contribute.

Looping in @Gengliang Wang.

Xianyin

From: Ryan Blue <rblue@netflix.com.INVALID>
Reply-To: <rblue@netflix.com>
Date: Saturday, July 13, 2019 at 6:54 AM
To: Shiv Prashant Sood <shivprashant@gmail.com>
Cc: Spark Dev List <dev@spark.apache.org>
Subject: Re: JDBC connector for DataSourceV2

I'm not aware of a JDBC connector effort. It would be great to have someone build one!

On Fri, Jul 12, 2019 at 3:33 PM Shiv Prashant Sood <shivprashant@gmail.com> wrote:
Can someone please help me understand the current status of the DataSource V2 based JDBC connector?
I see connectors for various file formats in master, but can't find a JDBC implementation
or a related JIRA.

The DataSourceV2 APIs look to me to be in good shape to attempt a JDBC connector for the READ/WRITE path.
Thanks & Regards,
Shiv


--
Ryan Blue
Software Engineer
Netflix