tika-dev mailing list archives

From Nick Burch <nick.bu...@alfresco.com>
Subject Extracting dublin core metadata in HtmlParser?
Date Tue, 19 Jan 2010 13:41:45 GMT
Hi All

I've been taking a look at the HtmlParser, and I can't spot anything in
there that extracts any of the Dublin Core metadata that might be present.
It seems that only things like location and encoding get set on the
metadata object. Nothing like description, author, etc. seems to get set.

So, two questions: is that feature actually already there and I've just
been useless at finding it? And if not, do people think it's a
sufficiently useful feature that I should go and write a patch for it?
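
To make the question concrete, here is a rough sketch of the kind of extraction meant: pulling name/content pairs out of HTML meta tags (e.g. DC.creator, description) into a map. This is only an illustration, not the HtmlParser code. The class MetaTagSketch and the method extractMeta are hypothetical names, and the regex is a deliberate simplification that assumes double-quoted attributes with name appearing before content. A real patch would presumably hook into the parser's SAX events rather than use a regex.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of Tika.
public class MetaTagSketch {

    // Simplifying assumption: attributes are double-quoted and "name"
    // comes directly before "content" inside the meta tag.
    private static final Pattern META = Pattern.compile(
            "<meta\\s+name=\"([^\"]+)\"\\s+content=\"([^\"]*)\"\\s*/?>",
            Pattern.CASE_INSENSITIVE);

    // Collects every matching meta tag as name -> content, in document order.
    // If a name occurs more than once, the last value wins.
    public static Map<String, String> extractMeta(String html) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher m = META.matcher(html);
        while (m.find()) {
            result.put(m.group(1), m.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        String html = "<html><head>"
                + "<meta name=\"DC.creator\" content=\"Jane Doe\">"
                + "<meta name=\"description\" content=\"A test page\">"
                + "</head><body></body></html>";
        // Prints the extracted name/content pairs.
        System.out.println(extractMeta(html));
    }
}
```

The resulting pairs would then be copied onto the metadata object, in the same way location and encoding are set today.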

