tika-dev mailing list archives

From "Jukka Zitting" <jukka.zitt...@gmail.com>
Subject Re: How is WriteOutContentHandler supposed to work?
Date Tue, 20 Nov 2007 13:27:54 GMT

On Nov 20, 2007 3:14 PM, Niall Pemberton <niall.pemberton@gmail.com> wrote:
> OK thanks - is the document's title supposed to be written then? If it
> is then why not the rest of the meta data?

Now that you've raised the issue, I think it was wrong for me to make
XHTMLContentHandler output the title as an <h1/> element within the
XHTML body. The title, as well as other document metadata, should go in
the XHTML head section.
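
A rough sketch of that placement, using only the plain JAXP/SAX classes
from the JDK and not the actual XHTMLContentHandler code. The class
HeadTitleSketch and its methods are made-up names for illustration; the
only point is that the title event is fired inside <head> before <body>
starts, so no <h1/> is needed in the body.

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical illustration only; not part of Tika.
public class HeadTitleSketch {
    private static final String XHTML = "http://www.w3.org/1999/xhtml";

    // Emit <name>text</name> as SAX events.
    static void element(ContentHandler h, String name, String text)
            throws SAXException {
        h.startElement(XHTML, name, name, new AttributesImpl());
        h.characters(text.toCharArray(), 0, text.length());
        h.endElement(XHTML, name, name);
    }

    // Fire the document events: title goes in <head>, content in <body>.
    static void emit(ContentHandler h, String title, String body)
            throws SAXException {
        h.startDocument();
        h.startPrefixMapping("", XHTML);
        h.startElement(XHTML, "html", "html", new AttributesImpl());
        h.startElement(XHTML, "head", "head", new AttributesImpl());
        element(h, "title", title);
        h.endElement(XHTML, "head", "head");
        h.startElement(XHTML, "body", "body", new AttributesImpl());
        element(h, "p", body);
        h.endElement(XHTML, "body", "body");
        h.endElement(XHTML, "html", "html");
        h.endPrefixMapping("");
        h.endDocument();
    }

    // Serialize the events to a string so the result can be inspected.
    public static String render(String title, String body) {
        try {
            SAXTransformerFactory f =
                (SAXTransformerFactory) SAXTransformerFactory.newInstance();
            TransformerHandler th = f.newTransformerHandler();
            th.getTransformer().setOutputProperty(OutputKeys.METHOD, "xml");
            th.getTransformer().setOutputProperty(
                OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter w = new StringWriter();
            th.setResult(new StreamResult(w));
            emit(th, title, body);
            return w.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render("Some title", "Some content"));
    }
}
```

A handler that only writes out character events could then decide for
itself whether to include the title, since it arrives as a separate
element in the head rather than mixed in with the body text.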

> Also there's no separation between the title and content start - which
> looks like a bug.

You're right, that's a bug.


Jukka Zitting
