lucene-java-user mailing list archives

From "Ortelli, Gian Luca" <>
Subject Extracting contact data
Date Wed, 13 Jan 2010 16:39:40 GMT
Hi community,


I have a general understanding of Lucene concepts, and I'm wondering if
it's the right tool for my job:


- I need to extract data such as time intervals ("8am - 12pm") and street
addresses from a set of files. The common issue with these data units is
that they contain spaces and cannot always be defined by regexes.


- the extraction must take "proximity" into consideration: for
example, a mail address which is close to the word "Contacts" should
receive a higher rank, since I'm looking for contact data.
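
To make the proximity requirement concrete, here is a minimal sketch in plain Java. It does not use Lucene (whose own proximity features, such as phrase queries with slop or span queries, might or might not fit). It assumes "mail address" means an e-mail address, uses a deliberately naive e-mail pattern, and scores each candidate as 1 / (1 + distance in whitespace-separated tokens to the anchor word). The class name `ProximityRank`, the method names, and the scoring formula are all hypothetical and only illustrate the kind of ranking I have in mind.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ProximityRank {
    // Naive e-mail pattern, for illustration only (assumption: not a full RFC 5322 matcher).
    private static final Pattern EMAIL =
            Pattern.compile("[\\w.+-]+@[\\w-]+(\\.[\\w-]+)+");

    /**
     * Returns a score in (0, 1] that grows as the candidate token gets closer
     * (in tokens) to the nearest occurrence of the anchor word, or 0 if the
     * anchor word does not occur at all.
     */
    static double score(String[] tokens, int candidateIndex, String anchor) {
        int best = Integer.MAX_VALUE;
        for (int i = 0; i < tokens.length; i++) {
            // Strip punctuation so that "Contacts:" still matches "Contacts".
            String bare = tokens[i].replaceAll("\\W", "");
            if (bare.equalsIgnoreCase(anchor)) {
                best = Math.min(best, Math.abs(i - candidateIndex));
            }
        }
        return best == Integer.MAX_VALUE ? 0.0 : 1.0 / (1 + best);
    }

    /** Returns the e-mail addresses found in the text, closest to the anchor word first. */
    static List<String> rankEmails(String text, String anchor) {
        String[] tokens = text.trim().split("\\s+");

        // Collect the indices of tokens that contain an e-mail address.
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            if (EMAIL.matcher(tokens[i]).find()) {
                hits.add(i);
            }
        }

        // Sort by descending proximity score.
        hits.sort((a, b) -> Double.compare(
                score(tokens, b, anchor), score(tokens, a, anchor)));

        List<String> ranked = new ArrayList<>();
        for (int index : hits) {
            Matcher m = EMAIL.matcher(tokens[index]);
            if (m.find()) {
                ranked.add(m.group()); // drop surrounding punctuation
            }
        }
        return ranked;
    }

    public static void main(String[] args) {
        String text = "Write to sales@example.com for offers. "
                + "Our office is open daily. Contacts: bob@example.org";
        System.out.println(rankEmails(text, "Contacts"));
    }
}
```

In this made-up sample text, the address right after "Contacts:" is one token away from the anchor and so ranks above the address earlier in the text. A real solution would also need to handle multi-token units such as street addresses, which this sketch ignores.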


Do you think I can get any advantage from building a solution on Lucene?


