lucene-java-user mailing list archives

From Otis Gospodnetic
Subject Re: Extracting Dates
Date Fri, 03 Oct 2008 16:49:57 GMT
David, this is not really a Lucene issue.
Here is some Perl code that you could either use or rewrite in Java if you need it in Java:

Tika won't help with this, and I believe UIMA itself will not help either, although there
may be components for date extraction that plug into UIMA.
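
For illustration only, here is a minimal sketch of what a regex-based date extractor might look like in Java. It is not a port of the Perl code mentioned above. The class name DateExtractor, the two date patterns, and the crude tag stripping are all assumptions made for this example, and it assumes Java 8 or later for java.time. Real pages will contain many more date formats than these two.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Lucene, Tika, or UIMA.
final class DateExtractor {
    // ISO-style dates such as 2008-10-03.
    private static final Pattern ISO =
        Pattern.compile("\\b\\d{4}-\\d{2}-\\d{2}\\b");

    // English long-form dates such as "October 2, 2008".
    private static final Pattern LONG = Pattern.compile(
        "\\b(January|February|March|April|May|June|July|August|September"
        + "|October|November|December)\\s+(\\d{1,2}),\\s*(\\d{4})\\b");

    // Strict resolution rejects impossible dates such as February 30.
    private static final DateTimeFormatter LONG_FORMAT =
        DateTimeFormatter.ofPattern("MMMM d, uuuu", Locale.ENGLISH)
            .withResolverStyle(ResolverStyle.STRICT);

    // Returns all ISO dates found, followed by all long-form dates found.
    // The result is therefore not in document order.
    static List<LocalDate> extract(String html) {
        // Crude tag stripping (an assumption); a real HTML parser is more robust.
        String text = html.replaceAll("<[^>]*>", " ");
        List<LocalDate> dates = new ArrayList<>();

        Matcher iso = ISO.matcher(text);
        while (iso.find()) {
            try {
                dates.add(LocalDate.parse(iso.group()));
            } catch (DateTimeParseException e) {
                // Looked like a date but is not valid (e.g. 2008-13-45); skip it.
            }
        }

        Matcher longForm = LONG.matcher(text);
        while (longForm.find()) {
            // Normalize whitespace so the text matches the formatter pattern.
            String normalized = longForm.group(1) + " "
                + Integer.parseInt(longForm.group(2)) + ", " + longForm.group(3);
            try {
                dates.add(LocalDate.parse(normalized, LONG_FORMAT));
            } catch (DateTimeParseException e) {
                // Not a real calendar date; skip it.
            }
        }
        return dates;
    }
}
```

Calling DateExtractor.extract on the HTML of a page returns a list of LocalDate values. Times, relative expressions ("next Friday"), and non-English dates are out of scope for this sketch.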

Sematext -- Lucene - Solr - Nutch

----- Original Message ----
> From: David Lee <>
> To:
> Sent: Thursday, October 2, 2008 7:18:22 PM
> Subject: Extracting Dates
> What should I use if I want to try to extract events (dates/times) out of an
> HTML page? I looked at Tika since it's a parsing project. Am I on the right
> track or is there something better to use? It also seems like Apache UIMA is
> kind of doing that, but I'm not sure. I thought since a lot of these
> projects are associated to lucene, someone might know.
> David Lee
