spark-dev mailing list archives

From Maciej Szymkiewicz <mszymkiew...@gmail.com>
Subject Re: [PySpark] Revisiting PySpark type annotations
Date Tue, 04 Aug 2020 21:31:41 GMT
Indeed, though the possible advantage is that, in theory, you can have
a different release cycle than the main repo (I am not sure if that's
feasible in practice or if that was the intention).

I guess it all depends on how we envision the future of annotations
(including, but not limited to, how conservative we want to be in the
future), which is probably something that should be discussed here.

On 8/4/20 11:06 PM, Felix Cheung wrote:
> So IMO maintaining outside in a separate repo is going to be harder.
> That was why I asked.
>
>
>  
> ------------------------------------------------------------------------
> *From:* Maciej Szymkiewicz <mszymkiewicz@gmail.com>
> *Sent:* Tuesday, August 4, 2020 12:59 PM
> *To:* Sean Owen
> *Cc:* Felix Cheung; Hyukjin Kwon; Driesprong, Fokko; Holden Karau;
> Spark Dev List
> *Subject:* Re: [PySpark] Revisiting PySpark type annotations
>  
>
> On 8/4/20 9:35 PM, Sean Owen wrote:
> > Yes, but the general argument you make here is: if you tie this
> > project to the main project, it will _have_ to be maintained by
> > everyone. That's good, but I think it is also exactly the downside we
> > want to avoid at this stage (I thought?). I understand for some
> > undertakings, it's just not feasible to start outside the main
> > project, but is there no proof of concept even possible before taking
> > this step -- which more or less implies it's going to be owned and
> > merged and have to be maintained in the main project.
>
>
> I think we have a slightly different understanding here ‒ I believe we have
> reached a conclusion that maintaining annotations within the project is
> OK; we only differ when it comes to the specific form it should take.
>
> As for a POC ‒ we have stubs, which have been maintained for over three years
> now and cover versions from 2.3 (though these are fairly limited) to,
> with some lag, current master.  There is some evidence they are used in
> the wild
> (https://github.com/zero323/pyspark-stubs/network/dependents?package_id=UGFja2FnZS02MzU1MTc4Mg%3D%3D),
> there are a few contributors
> (https://github.com/zero323/pyspark-stubs/graphs/contributors) and at
> least some use cases (https://stackoverflow.com/q/40163106/). So,
> subjectively speaking, it seems we're already beyond POC.
>
> -- 
> Best regards,
> Maciej Szymkiewicz
>
> Web: https://zero323.net
> Keybase: https://keybase.io/zero323
> Gigs: https://www.codementor.io/@zero323
> PGP: A30CEF0C31A501EC
>
>
-- 
Best regards,
Maciej Szymkiewicz

Web: https://zero323.net
Keybase: https://keybase.io/zero323
Gigs: https://www.codementor.io/@zero323
PGP: A30CEF0C31A501EC