lucene-dev mailing list archives

From Chris Hostetter
Subject Re: [VOTE] Lucene and Solr 3.1 release candidate
Date Thu, 10 Mar 2011 23:49:47 GMT

: Artifacts are located here:

Finally got a chance to look at 3.1 rc0. 

my comments are below -- they are in the order i encountered them ... 
stream of consciousness.  Note that this was my first real chance to look 
at the new packaging work folks have been doing to deal with the merged 
dev tree (ie: the solr source only packaging, and how we include the 
lucene jars in solr releases, etc...).  I also have to confess being 
woefully behind on reading the dev list and jira, so forgive me if any of 
this has already been discussed / dealt with...

The short summary, i'm a -1 for these RC0 packages....

I focused on the *.tgz files, assuming (for now) that the zip files are 

I started with apache-solr-3.1.0-src.tgz

The first thing that jumped out at me when i unpacked it was that the 
directory structure i got was definitely not what i was expecting.  As a 
committer familiar with how the tree works, i understand what these 
directories are, but i suspect this may confuse folks who have downloaded 
solr in the past, and those that haven't are going to be doubly confused.

At a minimum the fact that there is no top level README.txt file seems 
like a major oversight that we should definitely deal with.

It also seems like we should probably have a README.txt file in dev-tools 
(may not be needed if a top level README.txt explains it, but it still 
seems like a good idea)

looking in the "solr" directory, i look at its README.txt file, and see 
that the section "Files Included In Apache Solr Distributions" refers to a 
lot of stuff that is not actually included in the distribution (because 
this is the *src* only distribution).   it's one thing to assume people 
downloading the *src* package will understand why the war and jar aren't 
in solr/dist/ but we also refer to a "docs" directory that doesn't exist 
until they build it.
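
A rough way to catch this kind of mismatch before a release is to pull the path-like tokens out of a README and check them against the unpacked tree. This is only a sketch: `missing_readme_paths` and the regex are hypothetical, not part of the Lucene/Solr build, and the pattern will also flag things that merely look like paths (URLs, for instance), so the output needs a human eye.

```python
import re
from pathlib import Path

# Assumption: README paths look like "dist/" or "docs/tutorial.html",
# i.e. one or more "name/" segments followed by an optional file name.
PATH_TOKEN = re.compile(r"\b(?:[\w.-]+/)+[\w.-]*")

def missing_readme_paths(readme_text, root):
    """Return the path-like tokens in readme_text that do not exist under root."""
    root = Path(root)
    # Strip trailing slashes and sentence-ending dots from each token.
    mentioned = sorted(
        {m.group(0).rstrip("/.") for m in PATH_TOKEN.finditer(readme_text)}
    )
    return [p for p in mentioned if p and not (root / p).exists()]
```

Run against the unpacked src package with the text of solr/README.txt, anything it returns is a candidate for rewording or for a "this exists only after you build" note.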

If we really want to ship a Solr src only package (which i think is a 
great idea) we need to figure out a way to rework/structure this 
README.txt file to make sense for both cases (or ship two different ones 
-- but that seems like a bad idea) ... we could start by moving the 
"Instructions for Building Apache Solr from Source" up and rewording the 
"Files Included" section ... but even before that i notice we refer to 
docs/tutorial.html in the "Getting Started" section

looking only at the src release, i couldn't even find any documentation on 
how to build the tutorial and other files -- again: since this is a src 
only release, it may be ok to only include the "src" of the tutorial, but 
we should at least mention somewhere how to build it.

After running "ant test" from the top level, and "ant example" in the solr 
dir, i had a working example dir that seemed to be functioning fine, but i 
still had no javadocs for solr.  when i tried to run "ant javadoc-all" in 
the solr directory, javadoc actually failed with a DocletAbortException 
stack trace because of a FileNotFoundException when trying to copy 
./build/docs/api/prettify/stylesheet+prettify.css (i have a 
./build/docs/api/prettify/ but it's empty ... not sure what target was 
expected to populate it)

switching tracks, i looked at solr's CHANGES.txt I noticed a few small 

* under the "Versions of Major Components" we list "Apache Lucene trunk"
* below the 3.1 changes is a list of 1.4.0 changes -- but no 1.4.1 
* we had discussed on the dev list that we should stop including changes 
from older "major" versions (ie: 3.x CHANGES.txt would only list things 
from 3.0 on) ... i thought i remembered people agreeing that was a good 
idea, but maybe i was imagining that.

As i mentioned -- most of the issues were really about documentation and 
expectation setting for a "src" release.  the README and ant task 
descriptions we have were never really written with that idea in mind.

Next I looked at apache-solr-3.1.0.tgz

Other than the minor CHANGES.txt stuff mentioned above, this package made 
much more sense when i first unpacked it.  There is a top level 
README.txt, and most things mentioned in the README.txt file seemed to 
exist -- dist and docs.  Looking at the tutorial, i noticed the first 
glitch: it lists the Solr version as "" (which i 
know means it wasn't regenerated with the forrest properties set by ant).

Other than that, the tutorial looks good, the example seems to work, and 
the link from the tutorial to the javadocs works and i can browse them 
just fine.

Then I realized there was no "src" directory.  i'm not sure if this was 
intentional (don't all of our releases need to include the source, even 
the binary ones?) but at the very least we have a problem with the 
README.txt which says the src should be there and that you should be able 
to rebuild it with and (except that we also don't have a build.xml file 
even if we had the source)

This binary distribution also seems to be more redundant than it needs to 
be with the jars: everything in dist/ seems to be duplicated in dist/maven 
-- again, maybe this was intentional, but if so why? ... if i wanted the 
maven jars, wouldn't i have just downloaded them from maven instead of 
downloading the binary release? is there a value add to including them in 
both directory structures?
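
For anyone who wants to confirm the duplication rather than eyeball it, a small sketch along these lines would do. The function names are hypothetical, and it assumes the layout described above (jars directly in dist/, with copies somewhere under dist/maven); it compares file contents by hash, so renamed copies still count.

```python
import hashlib
from pathlib import Path

def file_digest(path):
    """SHA-1 of a file's bytes, read in chunks."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def duplicated_jars(dist_dir):
    """Map each jar directly in dist_dir to byte-identical jars under dist_dir/maven."""
    dist = Path(dist_dir)
    maven_by_digest = {}
    for jar in (dist / "maven").rglob("*.jar"):
        maven_by_digest.setdefault(file_digest(jar), []).append(jar)
    result = {}
    for jar in sorted(dist.glob("*.jar")):
        copies = maven_by_digest.get(file_digest(jar), [])
        if copies:
            # Report copies relative to dist/, with forward slashes.
            result[jar.name] = sorted(c.relative_to(dist).as_posix() for c in copies)
    return result
```

Jars that are missing from the result are the interesting ones: they exist in dist/ but have no identical twin under dist/maven.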

(at this point i also discovered that there are "sources" jars in the 
maven directory, so we are in fact including the source in the binary 
releases -- but if this is intentionally how we want to include them (and 
the maven jars aren't a mistake) then we should note that in the 

The number of "lucene" jars included in the release is also odd -- they 
are embedded in the solr.war obviously, but not included anywhere else.  
so people wanting to do something like use apache-solr-core-3.1.0.jar to 
embed solr in their app still need to get the lucene jars from a distinct 
release ... except that there does seem to be 3 lucene jars included in 
./contrib/analysis-extras/lucene-libs (i suspect this was a mistake in an 
intentional exclusion of those jars)
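
To see exactly which lucene jars ride along inside the war, something like the following works, since a war is just a zip archive with its libraries under WEB-INF/lib/. The helper name is hypothetical and the "lucene-" file name prefix is an assumption about how the bundled jars are named.

```python
import zipfile

def lucene_jars_in_war(war_path):
    """Return sorted file names of lucene-*.jar entries under WEB-INF/lib/ in a war."""
    with zipfile.ZipFile(war_path) as war:
        return sorted(
            name.rsplit("/", 1)[-1]
            for name in war.namelist()
            if name.startswith("WEB-INF/lib/lucene-") and name.endswith(".jar")
        )
```

Comparing that list with what actually ships outside the war (for example under contrib/analysis-extras/lucene-libs) shows which jars someone embedding apache-solr-core would still have to fetch separately.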

Moving on to lucene-3.1.0-src.tar.gz

Similar README.txt problem as solr ... jars, wars, and javadocs are 
mentioned that don't exist in the src release, and there is no verbiage 
mentioning that these files won't exist until you build them.

Similar question about CHANGES.txt still listing changes in old releases 
... thought we wanted to prune those down.

In general, i'm wondering if there is really any value anymore in having 
distinct src artifacts for "lucene-java" and solr ... my gut inclination 
is that while it certainly makes sense to have distinct binary releases 
(people using just the core library really don't care about the baggage of 
a full solr release) for the source releases, we should just have a 
single package containing a full checkout of the top level build dir, so 
people get all the source (and all the top level dev-tools) ... assuming 
we have a top level README.txt that is about building *all* of the source, 
then that would also help deal with some of the other issues about the 
individual solr/README.txt and lucene/README.txt files referring to things 
that aren't there until you build them.   ... but again, i assume there 
was some discussion about this that i missed, and there is a good reason 
for maintaining distinct src packages.

Lastly: lucene-3.1.0.tar.gz

Similar to my question about the solr binary release: aren't we required 
to include source in all binary releases?  in this case there aren't even 
any maven jars in this packaging (which makes me doubly question the 
maven jars in the solr package) which means we don't even have the source 
in jar form.

if exclusion of the src is intentional here, then we should probably 
exclude the BUILD.txt file as well.

one other thing that jumps out at me is that the README.txt refers to 
contrib/demo/luceneweb.war but even in this release that doesn't exist, 
just contrib/demo/lucene-demo-3.1.0.jar  (and looking back at the source 
release i think that's just blatantly out of date -- it looks like the war 
form of the demo was completely removed from the build.xml)

this got me looking at docs/demo.html to see if it still refers to the 
demo war (it doesn't) where i notice the "Setting your CLASSPATH" 
instructions suggest that lucene-demo-{version}.jar will be found in the 
main directory along with lucene-core-{version}.jar ... we should update 
that to be contrib/demo/lucene-demo-{version}.jar
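
As an illustration of the corrected locations, here is a sketch that builds a classpath string from an unpacked binary release. `demo_classpath` is a hypothetical helper, and it assumes lucene-core-{version}.jar sits in the top directory while the demo jar sits in contrib/demo/, as described above.

```python
import os
from pathlib import Path

def demo_classpath(release_root):
    """Join the core jar (top level) and demo jar (contrib/demo) into a classpath."""
    root = Path(release_root)
    core = sorted(root.glob("lucene-core-*.jar"))
    demo = sorted((root / "contrib" / "demo").glob("lucene-demo-*.jar"))
    # os.pathsep is ":" on Unix-like systems and ";" on Windows.
    return os.pathsep.join(str(p) for p in core + demo)
```

The result can be exported as CLASSPATH or passed to `java -cp`; an empty string means neither jar was found where expected.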

other than that -- things look good.  jars all seem to be there, manifests 
look good, and the demo runs correctly

What next from here?

I'd like to try and help fix some of this, but except for some of the 
really trivial stuff (version numbers in solr CHANGES.txt and tutorial;
demo war refs in lucene README) i'm not really sure how to proceed -- 
it really depends on the overall strategy of the "src" only packages (was 
there a thread or jira on this? ... i really can't find any discussion 
about it) to decide what to do about the README.txt files and other 

