uima-dev mailing list archives

From Marshall Schor <...@schor.com>
Subject Re: progress in Nexus / Pom updates
Date Thu, 06 May 2010 19:03:23 GMT
Another small gotcha - I found that running maven 2.2.1 on one of the
new docbooks, like uima-docbook-overview-and-setup, runs a couple of
maven-site-plugin things as part of the build, which are not called for.

This strange behavior is not present in Maven 3 beta-1.

So I recommend switching to that :-)

-Marshall

On 5/5/2010 5:06 PM, Marshall Schor wrote:
> I broke the new build - so I hope no one is trying anything with it. 
> One cause was a mind-confusion caused by:
>
>   old- had uima-docbook-tool project holding our docbook building
> pipeline tooling
>   new- had uima-docbook-tools project - being one of the 4 books (tools)
> put into a separate project. 
>
> I thought I was deleting the no-longer-used docbook building pipeline
> tooling, but accidentally deleted the other one... (sigh...) but of course
> SVN lets me get it back :-)
>
> -Marshall
>
> On 5/5/2010 3:34 PM, Marshall Schor wrote:
>> I think that the uimaj part is now done (at least I don't know of more
>> bugs).
>>
>> The latest round was to get the distr and eclipse update site stuff working.
>>
>> The felix-bundle builds were altered to run the goal "manifest", which
>> only produces the manifest.  The packaging type was changed to "jar",
>> and the normal Apache jar release stuff which copies in the required
>> license and notice headers, now works.
>>
>> I also adopted the osgiVersion approach to naming the Eclipse plugins,
>> while keeping the maven versions using the standard Maven conventions. 
>> We'll know if that's the right approach eventually, when we try out the
>> release plugin :-).
>>
>> Most of the inter-project dependencies are now eliminated - this means
>> you should be able to check out one project, and do mvn install and it
>> should build :-).
>>
>> Exceptions are the aggregating projects - because their <module>
>> elements refer to the modules by relative path, and the -distr projects,
>> because they find the things being distributed using relative paths.
>>
>> Next is uima-as - I'll work on a branch, again, for this.
>>
>> -Marshall
>>
>> On 4/30/2010 2:13 PM, Marshall Schor wrote:
>>> The next chapter in this saga... (apologies for the long post). If you
>>> won't be writing docbooks or your docbooks won't be cross-referenced to
>>> any other docbooks in the uima bookshelf, then you can skip reading
>>> this, unless you want to be entertained :-) .
>>>
>>> This is all about olinks. Olinks allow cross-referencing and
>>> hyperlinking among documents, using extra saved information about the
>>> target document being linked to (as contrasted with plain href style
>>> links, which only have the link url). For instance, in PDFs, there's
>>> extra info enabling the referring doc to say "page 123 in document abc".
>>> For PDF and HTML, it allows the referring text to include a hyperlink
>>> with the text being the target document's title, and maybe number (if it
>>> has numbered items - such as our chapter / section numbers in the main
>>> UIMA documentation). So you can get a link that looks like this:
>>>
>>> see Section 1.5.1, “Annotator Methods”
>>> <http://uima.apache.org/downloads/releaseDocs/2.3.0-incubating/docs/html/tutorials_and_users_guides/tutorials_and_users_guides.html#ugr.tug.aae.contract_for_annotator_methods>
>>> for ...
>>>
>>> where the 1.5.1 was generated by docbook processing, and the "Annotator
>>> Methods" was the title of that section.
>>>
>>> To make olinks work, each time a docbook is processed, an extra database
>>> of info for that docbook is created, containing just the info needed for
>>> this. This database, together with some other data about how the
>>> multiple interlinking docbooks are arranged, is needed when processing a
>>> docbook, to resolve these things.
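
To make the role of the per-book database concrete, here is a minimal sketch in Java of what olink resolution amounts to. The class and field names (OlinkSketch, Target, linkText) are hypothetical, and the real DocBook stylesheets keep this data in XML files rather than an in-memory map; the sketch only illustrates that the referring document needs the target's saved number and title to render the link text.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for one docbook's olink database (not DocBook's real format). */
final class OlinkSketch {

    /** The saved facts about one link target. */
    static final class Target {
        final String href;    // where the target lives, relative to the bookshelf
        final String number;  // generated section number, e.g. "1.5.1"
        final String title;   // section title

        Target(String href, String number, String title) {
            this.href = href;
            this.number = number;
            this.title = title;
        }
    }

    // Maps a target id to the data saved when the target book was last processed.
    private final Map<String, Target> targets = new HashMap<>();

    void put(String id, Target target) {
        targets.put(id, target);
    }

    /** Builds the visible link text, or a marker if the target book has not been processed yet. */
    String linkText(String id) {
        Target t = targets.get(id);
        if (t == null) {
            return "[unresolved olink: " + id + "]";
        }
        return "Section " + t.number + ", \"" + t.title + "\"";
    }

    public static void main(String[] args) {
        OlinkSketch db = new OlinkSketch();
        db.put("ugr.tug.aae.contract_for_annotator_methods",
               new Target("tutorials_and_users_guides.html", "1.5.1", "Annotator Methods"));
        System.out.println(db.linkText("ugr.tug.aae.contract_for_annotator_methods"));
    }
}
```

The unresolved case is why the database has to exist before the referring book is processed: without it there is no number or title to print.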
>>>
>>> So - where to store this information? We previously had stored this in
>>> SVN. This was unsatisfactory because it caused interdependencies among
>>> checked-out projects, where one project (having these databases) had to
>>> be checked out into a specific, fixed directory layout with respect to
>>> other (using) projects. The Maven way to get around this is to put these
>>> things into the maven repository.
>>>
>>> Since there's one database per docbook, I thought it could best be stored
>>> as an additional maven attached file for the project. Then you could
>>> "depend" on the project, and download that artifact. This would place a
>>> burden on docbook users - they would need to specify additional POM info
>>> to get these things downloaded.
>>>
>>> So I tried that, and it worked fine for individual book processing. Then
>>> I tried using an aggregator POM specifying the 4 main UIMA docbooks (now
>>> moved to separate projects), and since these all refer (that is, olink )
>>> to each other, this violated a maven principle of no circular dependency
>>> relationships. These really are circular relationships, but they resolve
>>> when you run docbook multiple times :-) .
>>>
>>> To fix this, I went to a scheme where there is just one additional
>>> project (I'm calling it uima-docbook-olink-dbs) that will have just one
>>> attached artifact, a zip file of all the needed docbook olink data for
>>> all the docbooks in UIMA. (This could be a large set - besides the 4
>>> main books, we have one for uima-as, and there's other books for many of
>>> the sandbox projects, and one for some special tooling - like the
>>> PearPackagingMavenPlugin).
>>>
>>> This project is at the level 1-SNAPSHOT, and I think it will stay there.
>>> This is because it's always being updated in part by each docbook
>>> processing run, and we currently don't have a concept of needing any but
>>> the latest versions of things. Note that releases will capture the
>>> result of using the then current (at the time of the build) version of
>>> these databases. I could imagine some fancy use cases that might not be
>>> well supported - such as working on several versions at once, but I'll
>>> let those use cases materialize first before trying to address them :-) .
>>>
>>> Here's how this set of olink data will be used.
>>>
>>> 1) new users start by checking out and running a build which invokes our
>>> docbook processing. This uses the dependency:unpack goal to find this
>>> artifact in the maven snapshot repo (in the Apache infrastructure's
>>> version of Nexus), where it lives - it will have the latest "deployed"
>>> (that is, uploaded) set of olink data, for all docbooks that are using
>>> olink.
>>>
>>> 2) The dependency:unpack will first download this zip to the local repo
>>> if it isn't already there. If it is already there, it will check to see
>>> if the snapshot in the repository is newer, and if so, will download
>>> that. It then unpacks that to a spot where all projects being built on
>>> this workstation for this user can find it.
>>>
>>> 3) The rest of the docbook build uses this olink data, and also, as a
>>> side-effect of running on a particular document, adds or updates the
>>> existing olink data for the current document being processed.
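
The unpack step in 2) can be pictured with a small Java sketch of an "overwrite only if newer" unzip. This is an illustration under assumptions, not how the maven-dependency-plugin is implemented: the class name UnpackIfNewer and the rule of comparing the zip entry timestamp with the file's last-modified time are made up for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.nio.file.attribute.FileTime;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

/** Hypothetical sketch: unpack a zip, skipping files that are already up to date. */
final class UnpackIfNewer {

    /** existingMillis is -1 when the file does not exist yet. */
    static boolean shouldWrite(long existingMillis, long incomingMillis) {
        return existingMillis < 0 || incomingMillis > existingMillis;
    }

    /** Returns the number of files written. */
    static int unpack(Path zip, Path targetDir) throws IOException {
        int written = 0;
        Path root = targetDir.toAbsolutePath().normalize();
        try (ZipInputStream in = new ZipInputStream(Files.newInputStream(zip))) {
            for (ZipEntry e = in.getNextEntry(); e != null; e = in.getNextEntry()) {
                Path out = root.resolve(e.getName()).normalize();
                // Refuse entries such as "../x" that would land outside the target.
                if (!out.startsWith(root)) {
                    throw new IOException("entry escapes target dir: " + e.getName());
                }
                if (e.isDirectory()) {
                    Files.createDirectories(out);
                    continue;
                }
                long existing = Files.exists(out)
                        ? Files.getLastModifiedTime(out).toMillis() : -1L;
                long incoming = e.getTime(); // -1 if the entry carries no timestamp
                if (!shouldWrite(existing, incoming)) {
                    continue; // local copy is at least as new, e.g. updated by a local build
                }
                Files.createDirectories(out.getParent());
                Files.copy(in, out, StandardCopyOption.REPLACE_EXISTING);
                if (incoming >= 0) {
                    // Keep the entry's timestamp so a second unpack is a no-op.
                    Files.setLastModifiedTime(out, FileTime.fromMillis(incoming));
                }
                written++;
            }
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: UnpackIfNewer <zip> <targetDir>");
            return;
        }
        System.out.println(unpack(Paths.get(args[0]), Paths.get(args[1])) + " file(s) written");
    }
}
```

The point of the timestamp check is that olink data updated by a local docbook run (step 3) should not be clobbered by an older copy from the snapshot repo.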
>>>
>>> In thinking about where to store the unzipped form of the olink
>>> databases, I hit upon the idea of storing it in the local .m2 repo, in
>>> the uima-docbook-olink-dbs project, but as an additional directory
>>> (called docbook-olink) which is *not* attached - so it won't be uploaded.
>>>
>>> This has a couple of nice side effects - once installed and unzipped,
>>> unless someone else "deploys" an update of this data to the snapshot
>>> repo, the download and unpack steps can be somewhat skipped. And,
>>> whenever someone doing some docbook builds is happy with their results,
>>> they've as a side effect been creating additional or updated olink info
>>> for one or more books, and to make these available to others, they just
>>> need to "deploy" these back to the snapshot repository. (Note that that
>>> deploy step runs a POM which first gets any updates made by someone else
>>> for other docbooks, that might have happened in the meanwhile, so what's
>>> uploaded is the latest version of all docbooks, except for collisions
>>> where two people have checked out and are processing the very same
>>> docbook - in which case the last one wins...)
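
The merge-before-deploy behavior, including the "last one wins" collision rule, can be sketched as a map merge. This is only a model: the class name OlinkMerge is hypothetical, the keys stand for docbook names, and the string values stand in for each book's olink data.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical model of merging remote olink data with locally rebuilt data before deploy. */
final class OlinkMerge {

    /**
     * Starts from what is currently in the snapshot repo, then lays the locally
     * rebuilt books on top. Books only someone else touched are preserved; a book
     * rebuilt both remotely and locally ends up with the local data (last one wins).
     */
    static Map<String, String> merge(Map<String, String> remote, Map<String, String> local) {
        Map<String, String> merged = new LinkedHashMap<>(remote);
        merged.putAll(local);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> remote = new LinkedHashMap<>();
        remote.put("references", "built by someone else");
        remote.put("tools", "built by someone else");

        Map<String, String> local = new LinkedHashMap<>();
        local.put("tools", "built locally");

        System.out.println(merge(remote, local));
    }
}
```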
>>>
>>> Testing revealed that this seems to work, with one exception - when I
>>> ran the deploy from within m2eclipse, it nicely uploaded the POM , but
>>> gave a message about Failed to Upload [400]. After much googling that
>>> didn't identify the issue, I tried this from the command line, and it
>>> worked. A few more tries isolated this to an apparent issue in the
>>> "built-in" version of maven that m2eclipse 0.0.10 uses, which is a
>>> version of maven 3.0-alpha-6. I found that maven 2.2.1 and 3.0-beta-1
>>> both work, even when run from m2eclipse. So if you are using m2eclipse,
>>> I recommend you
>>> 1) use the maven preferences to install a link to 3.0-beta-1, and set it
>>> as the default
>>> 2) if previous use of m2eclipse created any run configurations, you have
>>> to manually update each one of those - there's a menu pull-down at the
>>> bottom of the main run configuration page for each one, labeled "Maven
>>> Runtime", where you can switch this.
>>>
>>> Next steps will be verifying that the overwrite-if-newer is working for
>>> using dependency:unpack for individual unpacked files, then I'll
>>> probably go and check a bunch of this in :-)
>>>
>>> I've started writing a new web page for our site describing how to do
>>> docbooks, the uima bookshelf concept, etc., which I'll need to update...
>>>
>>> -Marshall
>>>
>>>
>>>
>>>
>>> On 4/26/2010 11:17 PM, Marshall Schor wrote:
>>>> Docbook story:  Most of the afternoon was spent tracking down a bug,
>>>> which turned out to be formerly hidden by Maven 2.2.1, but which Maven
>>>> 3.0 exposed (I'm trying Maven 3.0 beta 1 - it seems to run faster/better
>>>> :-) ).  The symptom was a report that the "catalog file" could not be found.
>>>>
>>>> The bug is that if you ask in a plugin to load a resource at the top
>>>> level, using the string "/xxx.xml" for instance, it fails.  This is
>>>> because that leading "/" makes the Java classloader.getResource(aString)
>>>> fail.  To fix, just drop the leading "/". 
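
A small Java sketch of the fix, for anyone hitting the same symptom. ClassLoader.getResource takes a '/'-separated name with no leading slash, whereas Class.getResource treats a leading slash as "absolute"; mixing the two conventions is the usual cause. The helper names here (ResourceLookup, stripLeadingSlashes, find) are made up for the example and are not from the docbkx code.

```java
import java.net.URL;

/** Hypothetical helper showing the "drop the leading slash" fix for ClassLoader lookups. */
final class ResourceLookup {

    /** Removes any leading '/' so the name is in the form ClassLoader.getResource expects. */
    static String stripLeadingSlashes(String name) {
        int i = 0;
        while (i < name.length() && name.charAt(i) == '/') {
            i++;
        }
        return name.substring(i);
    }

    /** Looks up a resource through a ClassLoader, tolerating a leading slash in the name. */
    static URL find(ClassLoader loader, String name) {
        return loader.getResource(stripLeadingSlashes(name));
    }

    public static void main(String[] args) {
        ClassLoader loader = ClassLoader.getSystemClassLoader();
        // Name passed as-is, with the leading slash that caused the trouble.
        System.out.println("with slash:    " + loader.getResource("/java/lang/Object.class"));
        // Same name after normalization.
        System.out.println("without slash: " + find(loader, "/java/lang/Object.class"));
    }
}
```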
>>>>
>>>> I've reported this along with the fix to the docbkx project - they use
>>>> this to load the "catalog.xml" file that comes with docbook 4.x and 5.0
>>>> distributions. 
>>>>
>>>> So, now, after all that, I'm starting to get docbook building again,
>>>> this time with fully factored parent plugins.  The olink stuff I'm going
>>>> to try to do by using maven "attachments", and going for a strategy of
>>>> only 1 docbook per project (I've split the uima-docbooks project, which
>>>> held 4 docbooks, into 4 projects, each holding one docbook). 
>>>>
>>>> This aligns the approach with the way Sandbox projects are doing
>>>> documentation - they have the project produce the 1 main artifact (a
>>>> jar), and now it will also produce (when I'm finished :-) ) an
>>>> additional "attached" artifact - the olink data for the pdf and html
>>>> versions. 
>>>>
>>>> This will allow other docbooks which want to hyperlink to a reference in
>>>> the first docbook to be able to do so. (OLinking is like normal
>>>> hyperlinking, except that information about the target is known, so for
>>>> PDFs, the link includes the "book" + page number in the book, and it
>>>> includes locating the other book via a relative directory path.)
>>>>
>>>> It looks like I'll be able to put all the gorp (that's a technical term
>>>> :-) ) for docbook formatting, like boilerplate, title pages, things to
>>>> enable xInclude, fonts, css stuff, customization xsl layers, etc. into a
>>>> shared "resource bundle" that projects will be able to fetch (from their
>>>> local .m2 repository, or from the big repo in the sky).
>>>>
>>>> -Marshall
>>>>
>>>> On 4/22/2010 4:03 PM, Marshall Schor wrote:
>>>>> progress -
>>>>>
>>>>> the uimaj/branches/mavenAlign branch should now build all of the Java
>>>>> components.  There are 2 new aggregate (only) POMs for this, to build in
>>>>> batch, called aggregate-pom-uimaj and aggregate-pom-uimaj-eclipse-plugins.
>>>>>
>>>>> More checking to do to verify the build is ok.
>>>>>
>>>>> Next to tackle: docbooks, then the assemblies.
>>>>>
>>>>> -Marshall
>>>>>
>>>>> On 4/19/2010 5:16 PM, Marshall Schor wrote:
>>>>>> Progress - created a common eclipse-plugin parent pom, and got the
>>>>>> ep-configurator eclipse project to build.
>>>>>>
>>>>>> I noticed as a side effect of checking things that our 2.3.0 builds for
>>>>>> these artifacts are missing the License, Notice, etc. in the Jar
>>>>>> manifest.  The new structure of parent poms corrects this in a uniform
>>>>>> way :-)
>>>>>>
>>>>>> -Marshall
>>>>>>
>>>>>> On 4/19/2010 10:42 AM, Marshall Schor wrote:
>>>>>>> Progress -
>>>>>>>
>>>>>>> To handle the many Jars that need the extra bit in their Notice file(s),
>>>>>>> I made a version of the remote-resource "bundle" that includes a
>>>>>>> placeholder for additional text following the standard NOTICE
>>>>>>> boilerplate.
>>>>>>>
>>>>>>> Then I made a version of the parent pom for uimaj (uimaj-ibm-notice)
>>>>>>> which uses this extra remote resource, and sets the additional text to
>>>>>>> the required boilerplate for those jars which were originally coming
>>>>>>> from IBM.
>>>>>>>
>>>>>>> Now, JVinci has the right notice file...
>>>>>>>
>>>>>>> Next problems I'm working on for JVinci: the implementation url is
>>>>>>> incorrect (it's for the parent-pom), and the project title META-INF
>>>>>>> which we used to have, is missing.
>>>>>>>
>>>>>>> -Marshall
>>>>>>>
>>>>>>> On 4/15/2010 5:17 PM, Marshall Schor wrote:
>>>>>>>> Progress -
>>>>>>>>
>>>>>>>> I made a new top-level node in the uima tree called "build" - for
>>>>>>>> artifacts that we won't normally be including in assemblies, but which
>>>>>>>> are instead build things.
>>>>>>>>
>>>>>>>> In there, I put a folder called "parent-poms" - the intent is to keep
>>>>>>>> these organized in one place.
>>>>>>>>
>>>>>>>> I made a top level pom for the whole project, which inherits from the
>>>>>>>> common Apache pom version 7.  The common Apache pom connects the deploy
>>>>>>>> / release process with the Nexus repository.
>>>>>>>>
>>>>>>>> I also made a top level pom for just the main UIMA Java SDK -
>>>>>>>> corresponding sort of to the former uimaj pom, except it doesn't have
>>>>>>>> any aggregation stuff.
>>>>>>>>
>>>>>>>> BTW, in fiddling with the poms, I'm following the recommended ordering
>>>>>>>> for elements in the POM, listed here:
>>>>>>>> http://maven.apache.org/developers/conventions/code.html (scroll 3/4 of
>>>>>>>> the way toward the bottom)
>>>>>>>>
>>>>>>>> After fiddling with my .m2/settings.xml files per the instructions on
>>>>>>>> migrating to Nexus, both install and deploy worked (deploy was for a
>>>>>>>> SNAPSHOT - no real releases :-) ).
>>>>>>>>
>>>>>>>> You can see the deployed artifacts on repository.apache.org in the
>>>>>>>> Snapshots area.
>>>>>>>>
>>>>>>>> I'm now trying to see how to set up projects whose poms inherit from
>>>>>>>> uimaj.  First trying jVinci.  I'm comparing what gets built to what was
>>>>>>>> built for 2.3.0-incubating.
>>>>>>>> One difference - a bunch of our components have slightly different
>>>>>>>> Notices needed, so I'll fix that.
>>>>>>>>
>>>>>>>> Another thing to fix: thinking about when to run RAT.  Some projects put
>>>>>>>> it into a profile - so you can run it when you want to.  It could also
>>>>>>>> be in the apache-release profile - so it's always run when doing a
>>>>>>>> release candidate.  Unless there's a better idea, I'll add this.
>>>>>>>>
>>>>>>>> -Marshall