hadoop-common-issues mailing list archives

From "Giridharan Kesavan (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-5107) split the core, hdfs, and mapred jars from each other and publish them independently to the Maven repository
Date Fri, 25 Sep 2009 09:32:16 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-5107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12759460#action_12759460 ]

Giridharan Kesavan commented on HADOOP-5107:

bq. The patch doesn't work when we go off-line for subsequent runs. The off-line feature is
missing in all the projects. Without this feature, it tries to download maven-ant-tasks.jar
itself again and gets stuck. 

Ivy doesn't work offline. Every time we do a build, whether or not the dependencies are present
in the cache, it goes and verifies the repo. If the dependencies are present locally, it doesn't
download them. The same is the case with maven-ant-tasks.jar: it isn't downloaded every time, as
usetimestamp is set to true.
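
As a minimal sketch of the behaviour described above (not copied from the patch; the target name and the ant_task_repo_url and ant_task.jar properties are assumptions), an Ant <get> with usetimestamp="true" still contacts the server on each run but only re-downloads when the remote file is newer than the local copy:

```xml
<!-- Hypothetical target and property names; the real build.xml may differ. -->
<target name="ant-task-download"
        description="Fetch maven-ant-tasks.jar only if the remote copy is newer">
  <!-- usetimestamp makes the download conditional on the local file's timestamp -->
  <get src="${ant_task_repo_url}" dest="${ant_task.jar}" usetimestamp="true"/>
</target>
```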

bq. In many files, in particular the ivy.xml files of contrib projects, most of the changes
are not required and are redundant as the patch removes them and simply adds them again changing
the format into a single line. Undoing these changes will greatly reduce the patch size 

When each dependency is put on a single line, the ivy.xml file looks cleaner, and the re-formatting
greatly helps in understanding it.
bq. In mapreduce and hdfs ivy.xml files, some cleanup is done. The earlier client and server
specific dependencies looked good and natural too. Did you remove that because the classification
was premature or it didn't gel well with your changes? 

This patch uses Maven for publishing and Ivy for resolving. Ivy works on configurations
while Maven works on scopes. I've tried my best to use the best of both worlds.
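
To illustrate the difference (a sketch only, not taken from the patch; the dependency, the "common" configuration name, the version property, and the version number are all assumptions chosen for the example), the same dependency might be tied to a configuration in ivy.xml and to a scope in a pom:

```xml
<!-- ivy.xml fragment: the dependency is attached to a named configuration.
     "common" is a hypothetical configuration that this file's <configurations>
     element would declare; "default" is the configuration requested from the
     dependency. -->
<dependency org="commons-logging" name="commons-logging"
            rev="${commons-logging.version}" conf="common->default"/>

<!-- pom.xml fragment: the same dependency is classified by scope instead. -->
<dependency>
  <groupId>commons-logging</groupId>
  <artifactId>commons-logging</artifactId>
  <version>1.1.1</version>
  <scope>compile</scope>
</dependency>
```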

bq. mapreduce build.xml: Do we need separate mvn-install and mvn-install-mapred? Even if it
is needed, mvn-install should depend on mvn-install-mapred. A case of reuse.  
Until the last couple of days, hdfs depended on both mapred and common, and mapred depended on
hdfs and common. Hence we were in a situation where we had to publish only the mapred and hdfs jars
and not the corresponding test jars. I didn't want to reuse the mvn-install-mapred target, as I am
expected to clean up this target once the circular dependency issue is resolved.

bq. common project: Should we take this as an opportunity and rename the core jar to common
jar before publishing? It looks odd the project name is common while the jar's name refers
to core. 
That would be quite a bit of work, and I would definitely want that to be in a different JIRA.

bq. I think that in both mapred and hdfs, clean-cache should not delete the whole ${user.home}/.ivy2/cache/org.apache.hadoop/hadoop-core
directory for example. It works for now, but different projects may work with different versions
of the jar, so mapred's clean-cache should only delete the corresponding version of the jar.
Same with the other directories in the cache. Thoughts? 
It's not just the jar files that the cache stores; Ivy also converts the poms and stores them
as ivy.xml files for the different Ivy configurations. The best way to clean them up is to
clean the corresponding artifact folder in the cache.
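
A clean-cache target along these lines might look like the following sketch. The target name and cache path come from the review comment quoted above; the rest is an assumption about how the target could be written:

```xml
<!-- Deletes the whole artifact folder, which removes the cached jars together
     with the ivy.xml files that Ivy generated from the poms. -->
<target name="clean-cache"
        description="Remove cached hadoop-core artifacts and converted metadata">
  <delete dir="${user.home}/.ivy2/cache/org.apache.hadoop/hadoop-core"/>
</target>
```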

bq. Should `ant clean` delete maven-ant-tasks.jar every time? I guess not. 
When I call ant clean, I definitely expect a clean workspace. 
There is also another reason. I've seen people press Ctrl-C halfway through while the ivy or
maven-ant-tasks jar is downloading, so the jar is only partially downloaded. The next time a user
runs the build, it fails because the jar file is corrupt, and they have to delete it manually.
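
As a sketch of that idea (the build.dir and ant_task.jar property names are hypothetical), a clean target could remove the downloaded jar along with the build output, so that a partially downloaded file never survives into the next run:

```xml
<!-- Sketch only: build.dir and ant_task.jar are assumed property names. -->
<target name="clean"
        description="Remove build output and the downloaded maven-ant-tasks jar">
  <delete dir="${build.dir}"/>
  <!-- Deleting the jar forces a fresh download on the next build -->
  <delete file="${ant_task.jar}"/>
</target>
```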

Thanks for the comments.

> split the core, hdfs, and mapred jars from each other and publish them independently
to the Maven repository
> ------------------------------------------------------------------------------------------------------------
>                 Key: HADOOP-5107
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5107
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: build
>    Affects Versions: 0.20.0
>            Reporter: Owen O'Malley
>            Assignee: Giridharan Kesavan
>         Attachments: common-trunk-v1.patch, common-trunk-v4.patch, common-trunk.patch,
hadoop-hdfsd-v4.patch, hdfs-trunk-v1.patch, hdfs-trunk-v2.patch, hdfs-trunk.patch, mapred-trunk-v1.patch,
mapred-trunk-v2.patch, mapred-trunk-v3.patch, mapred-trunk-v4.patch, mapred-trunk-v5.patch,
> I think to support splitting the projects, we should publish the jars for 0.20.0 as independent
jars to the Maven repository 

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
