gora-dev mailing list archives

From Henry Saputra <henry.sapu...@gmail.com>
Subject Re: Workings of Hadoop Shims
Date Wed, 25 Feb 2015 06:25:11 GMT
The gora-shims-distribution module has optional dependencies on Hadoop 2,
which should be OK.

Lewis, could you try updating gora-core/pom.xml to mark the hadoop-client
dependency as optional:

<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <optional>true</optional>
</dependency>
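
Until that change is in a release, the exclusion route you describe might
look roughly like the following on the Ant + Ivy side. This is only a sketch:
the ivy.xml layout, the conf mappings, the 0.6 revisions and the
hadoop-core 1.2.1 revision are all assumptions, and I have not run it against
the Nutch build.

```xml
<!-- ivy.xml fragment (hypothetical; revisions and conf mappings are assumptions) -->
<dependencies>
  <!-- Take gora-core, but drop every org.apache.hadoop module it pulls in transitively -->
  <dependency org="org.apache.gora" name="gora-core" rev="0.6" conf="*->default">
    <exclude org="org.apache.hadoop"/>
  </dependency>

  <!-- Add the Hadoop 1 shim explicitly -->
  <dependency org="org.apache.gora" name="gora-shims-hadoop1" rev="0.6" conf="*->default"/>

  <!-- Declare the Hadoop 1.x artifact the build should actually use (assumed revision) -->
  <dependency org="org.apache.hadoop" name="hadoop-core" rev="1.2.1" conf="*->default"/>
</dependencies>
```

If hadoop-client becomes optional in gora-core, the exclude element should no
longer be needed, since optional dependencies are not passed on transitively.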

- Henry


On Sun, Feb 22, 2015 at 3:52 PM, Lewis John Mcgibbney
<lewis.mcgibbney@gmail.com> wrote:
> Hi Folks,
> I'm kicking off this overdue thread to obtain a good understanding of exactly
> what's going on with the Hadoop Shims. The documentation is lacking at the
> moment and I am therefore putting time into rectifying this.
> My humble beginnings are in progress below
> http://gora.apache.org/current/gora-shims.html
>
> Scenario - Upgrade Nutch 2.3.1-SNAPSHOT to Gora 0.6
> Jira Issue - https://issues.apache.org/jira/browse/NUTCH-1946
> Observations - Here are some initial observations from my analysis of the
> current state of the Shims
>
>    - gora-shims-distribution relies upon gora-shims-hadoop,
>    gora-shims-hadoop1 and gora-shims-hadoop2
>    - gora-shims-hadoop provides a parent for gora-shims-hadoop1 and
>    gora-shims-hadoop2, however it also has direct dependencies upon the
>    following
>    - org.apache.hadoop:hadoop-client:jar:2.5.2:compile
>       - org.apache.hadoop:hadoop-hdfs:jar:2.5.2:compile
>       - org.apache.hadoop:hadoop-mapreduce-client-app:jar:2.5.2:compile
>       - org.apache.hadoop:hadoop-yarn-api:jar:2.5.2:compile
>       - org.apache.hadoop:hadoop-mapreduce-client-core:jar:2.5.2:compile
>       - org.apache.hadoop:hadoop-mapreduce-client-jobclient:jar:2.5.2:compile
>       - org.apache.hadoop:hadoop-annotations:jar:2.5.2:compile
>
>
>    - As stated above, both gora-shims-hadoop1 and gora-shims-hadoop2 depend
>    upon gora-shims-hadoop with the difference being that gora-shims-hadoop1
>    then defines hadoop 1.X dependencies.
>
> Problems - I understand that we have upgraded to Hadoop 2.5.2 by default.
> This is great. What I am failing to get a grasp on, however, is exactly how
> we provide guidance on upgrading to Gora 0.6 without also upgrading Hadoop
> from 1.2.X --> 2.5.X.
>
> Bear in mind that gora-core depends upon gora-shims-hadoop, so Hadoop 2.5.2
> dependencies are automatically fetched transitively whenever we wish to
> upgrade the gora-core dependency from 0.5 --> 0.6.
>
> I am going to experiment with using a bunch of exclusions in my pom.xml
> under the gora-shims-hadoop dependency, e.g. exclude all of the above Hadoop
> dependencies, then explicitly add the gora-shims-hadoop1 dependency.
>
> What makes this worse is that I cannot create profiles for this upgrade,
> as I would be able to do in a Maven project, because I am working with
> Ant + Ivy.
>
> Any thoughts would be very much appreciated. Essentially, whatever we
> discuss here is creating the foundation for the Gora Shims documentation.
>
> Thanks
>
> Lewis
>
> --
> *Lewis*
