spark-dev mailing list archives

From Steve Loughran <ste...@cloudera.com.INVALID>
Subject Re: -Phadoop-provided still includes hadoop jars
Date Mon, 09 Nov 2020 20:10:55 GMT
On Mon, 12 Oct 2020 at 19:06, Sean Owen <srowen@gmail.com> wrote:

> I don't have a good answer, Steve may know more, but from looking at
> dependency:tree, it looks mostly like it's hadoop-common that's at issue.
> Without -Phive it remains 'provided' in the assembly/ module, but -Phive
> causes it to come back in. Either there's some good reason for that, or,
> maybe we need to explicitly manage the scope of hadoop-common along with
> everything else Hadoop, even though Spark doesn't reference it directly.
>

Sorry, missed this.

Yes, they should be scoped so that -Phadoop-provided leaves them out. Open a
JIRA and point me at it, and I'll do my best.

The artifacts should just go into the hadoop-provided scope, shouldn't they?
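Roughly, that would mean managing hadoop-common's scope with the same property the other Hadoop artifacts use, so that the hadoop-provided profile flips it to `provided`. This is only a sketch, not the actual Spark pom — the property and version names below are assumptions:

```xml
<!-- Hedged sketch: pin hadoop-common to the shared scope property so that
     -Phadoop-provided (which would set the property to "provided")
     keeps it out of the assembly. Property names are assumptions. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
      <version>${hadoop.version}</version>
      <scope>${hadoop.deps.scope}</scope>
    </dependency>
  </dependencies>
</dependencyManagement>
```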


> On Mon, Oct 12, 2020 at 12:38 PM Kimahriman <adamq43@gmail.com> wrote:
>
>> When I try to build a distribution with either -Phive or -Phadoop-cloud
>> along
>> with -Phadoop-provided, I still end up with hadoop jars in the
>> distribution.
>>
>> Specifically, with -Phive and -Phadoop-provided, you end up with
>> hadoop-annotations, hadoop-auth, and hadoop-common included in the Spark
>> jars, and with -Phadoop-cloud and -Phadoop-provided, you end up with
>> hadoop-annotations, as well as the hadoop-{aws,azure,openstack} jars. Is
>> this supposed to be the case or is there something I'm doing wrong? I just
>> want the spark-hive and spark-hadoop-cloud jars without the hadoop
>> dependencies, and right now I just have to delete the hadoop jars after
>> the
>> fact.
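The manual workaround described above (deleting the leaked Hadoop jars after the build) can be sketched in a couple of shell lines. The directory layout and jar file names here are invented for illustration, not taken from a real build:

```shell
# Hedged sketch of the post-build cleanup the poster describes.
# Set up a fake distribution directory with illustrative jar names.
DIST=demo-dist
mkdir -p "$DIST/jars"
touch "$DIST/jars/hadoop-common-3.2.0.jar" \
      "$DIST/jars/hadoop-auth-3.2.0.jar" \
      "$DIST/jars/spark-hive_2.12-3.0.1.jar"

# The cleanup itself: remove every top-level hadoop-*.jar, leaving the
# Spark jars in place.
find "$DIST/jars" -maxdepth 1 -name 'hadoop-*.jar' -delete
```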
