spark-issues mailing list archives

From "Michael Armbrust (JIRA)" <>
Subject [jira] [Commented] (SPARK-27911) PySpark Packages should automatically choose correct scala version
Date Tue, 16 Jul 2019 21:43:00 GMT


Michael Armbrust commented on SPARK-27911:

You are right, there is nothing pyspark-specific about this. I just used pyspark as an example
since PySpark users are more likely to install via {{pip}} and thus may never even see the Scala
version they are using.

It would be great if we could make this easier for all Spark users, preferably by automatically
correcting mismatches.
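
For reference, SBT already handles this for Scala dependencies: the {{%%}} operator appends the
project's Scala binary version to the artifact name automatically. A minimal build.sbt fragment
(the coordinates are illustrative, not a real package):

```scala
// build.sbt -- illustrative coordinates only
scalaVersion := "2.12.10"

// "%%" appends the Scala binary version, resolving spark-foo_2.12
libraryDependencies += "org.example" %% "spark-foo" % "1.0.0"

// with plain "%" the suffix must be written (and kept in sync) by hand
libraryDependencies += "org.example" % "spark-foo_2.12" % "1.0.0"
```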

> PySpark Packages should automatically choose correct scala version
> ------------------------------------------------------------------
>                 Key: SPARK-27911
>                 URL:
>             Project: Spark
>          Issue Type: New Feature
>          Components: PySpark
>    Affects Versions: 3.0.0
>            Reporter: Michael Armbrust
>            Priority: Major
> Today, users of pyspark (and Scala) need to manually specify the version of Scala that
their Spark installation is using when adding a Spark package to their application. This extra
configuration is confusing to users who may not even know which version of Scala they are
using (for example, if they installed Spark using {{pip}}). The confusion here is exacerbated
by releases in Spark that have changed the default from {{2.11}} -> {{2.12}} -> {{2.11}}.
> Since Spark can know which version of Scala it was compiled for, we should give users
> the option to automatically choose the correct version. This could be as simple as a substitution
> for {{$scalaVersion}} when resolving a package (similar to SBT's support for automatically
> handling Scala dependencies).
> Here are some concrete examples of users getting it wrong and getting confused:
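
The {{$scalaVersion}} substitution proposed above could look something like the following sketch.
The function name and the package coordinate are hypothetical, not Spark API; the idea is only that
Spark already knows the full Scala version it was compiled against and can derive the binary version
used in artifact names:

```python
# Hypothetical sketch of $scalaVersion substitution in a package coordinate.
# resolve_package and the example coordinate are illustrative, not Spark API.

def resolve_package(coordinate: str, scala_version: str) -> str:
    """Replace the $scalaVersion placeholder with the Scala binary version.

    Spark knows the full Scala version it was built for (e.g. "2.12.10"),
    but artifact names carry only the binary version (e.g. "2.12").
    """
    binary_version = ".".join(scala_version.split(".")[:2])
    return coordinate.replace("$scalaVersion", binary_version)

print(resolve_package("org.example:spark-connector_$scalaVersion:1.0.0", "2.12.10"))
# org.example:spark-connector_2.12:1.0.0
```

With this in place, users could pass the placeholder form to {{--packages}} and never need to know
which Scala version their Spark installation uses.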

This message was sent by Atlassian JIRA

