lucene-dev mailing list archives

From "Steven A Rowe"
Subject RE: formatable changes log
Date Sat, 26 Jan 2008 04:36:00 GMT
On 01/25/2008 at 2:05 PM, Chris Hostetter wrote:
> > As it is becoming hard to browse/navigate CHANGES.txt, how about
> > maintaining it in a simple HTML file?
> personally, i'm a fan of simple, plain text files for the
> CHANGES.txt ... easy to edit, easy to read.

I don't know about easy to read (more than one page per section makes it hard to know where
you are), but easy to edit, sure.

> (even better in my mind would be if we could keep editing in
> plain text, and had some handy scripts to reformat into HTML

I was thinking the same thing, and I've done just that, stealing the folding Javascript verbatim
from Doron's original:


I added in auto-linkification of JIRA and Bugzilla issues.  IMHO, working links to issues
is the killer feature for an HTML version of CHANGES.txt.
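
For illustration only, here is a minimal Python sketch of this kind of linkification (not the Perl script mentioned in this message). The URL pattern, the JIRA_URL and JIRA_KEY names, and the linkify_jira function are assumptions made up for the example; Bugzilla references would need their own pattern, which is not attempted here.

```python
import re
from html import escape

# Assumed URL pattern, for illustration; adjust to the tracker's real address.
JIRA_URL = "https://issues.apache.org/jira/browse/{key}"

# Matches issue keys such as LUCENE-1010 (project prefix assumed to be LUCENE).
JIRA_KEY = re.compile(r"\bLUCENE-(\d+)\b")


def linkify_jira(text):
    """Escape text for HTML, then wrap each LUCENE-nnn key in a link."""
    escaped = escape(text)

    def to_link(match):
        key = match.group(0)
        return '<a href="{}">{}</a>'.format(JIRA_URL.format(key=key), key)

    return JIRA_KEY.sub(to_link, escaped)
```

Escaping happens before substitution so that the inserted anchor tags are not themselves escaped.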

Here's the Perl script I wrote to produce the above:


However, I noticed a problem: in CHANGES.txt under the 2.3.0 release in the "Bug fixes" section,
there is a gap in the sequence:

  17. LUCENE-1010: Fixed corruption case when document with no term
      vector fields is added after documents with term vector fields.
      This case is hit during merge and would cause an EOFException.
      This bug was introduced with LUCENE-984.  (Andi Vajda via Mike

  19. LUCENE-1009: Fix merge slowdown with LogByteSizeMergePolicy when
      autoCommit=false and documents are using stored fields and/or term
      vectors.  (Mark Miller via Mike McCandless)

But my script only notices that it's a numbered list, not the specific number on each item,
so it renumbers item #19 as #18, and every following item ends up misaligned
with CHANGES.txt.  Should we preserve incorrect sequencing in the HTML format?
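
One possible way to preserve the source numbering, sketched in Python under stated assumptions: read the number off each item and emit it through the value attribute of <li>, which browsers use to set the displayed number. The ITEM_START pattern, parse_items, and to_html are hypothetical names, and the pattern assumes items are indented by at most three spaces while continuation lines are indented further, as in the excerpt above.

```python
import re
from html import escape

# An item starts with at most three spaces of indent, digits, a period, a space.
# Assumption: continuation lines are indented more deeply than item starts.
ITEM_START = re.compile(r"^ {0,3}(\d+)\.\s+(.*)$")


def parse_items(lines):
    """Group lines into (number, text) pairs, keeping each item's own number."""
    items = []
    for line in lines:
        match = ITEM_START.match(line)
        if match:
            items.append([int(match.group(1)), match.group(2).strip()])
        elif line.strip() and items:
            # Non-blank continuation line: append to the current item.
            items[-1][1] += " " + line.strip()
    return [(number, text) for number, text in items]


def to_html(items):
    """Emit an ordered list whose items carry their source numbers."""
    rows = ['  <li value="{}">{}</li>'.format(n, escape(t)) for n, t in items]
    return "<ol>\n" + "\n".join(rows) + "\n</ol>"
```

With this approach a gap such as 17 followed by 19 shows up in the HTML exactly as it does in CHANGES.txt, rather than being silently closed.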

On 01/25/2008 at 7:01 AM, DM Smith wrote:
> And it will solve a charset problem I'm seeing in the file.
> Under Testing for 2.3.0, there is an accented character that
> looks like it is encoded in UTF-8 but it is coming across as
> multi-character.

I added a <META> tag inside the <head> element to set the charset to UTF-8; looks like
it did the trick.
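
A rough Python sketch of the same idea, assuming the generated page is written from a script: declare UTF-8 in the head and encode the file itself as UTF-8 so the two agree. The html_page and write_page names are made up for this example, and the exact tag used in the real page may differ.

```python
def html_page(title, body):
    """Wrap body in a page whose head declares the charset as UTF-8."""
    template = (
        "<html>\n<head>\n"
        '  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">\n'
        "  <title>{}</title>\n"
        "</head>\n<body>\n{}\n</body>\n</html>\n"
    )
    return template.format(title, body)


def write_page(path, title, body):
    """Write the page to disk, encoded to match the declared charset."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(html_page(title, body))
```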

