lucene-java-user mailing list archives

From "Beyer,Nathan" <NBE...@CERNER.COM>
Subject thoughts/suggestions for analyzing/tokenizing class names
Date Sat, 15 Dec 2007 23:14:50 GMT
I have a few fields that hold package names and class names, and I'm
looking for suggestions on how to analyze them.

A few examples:

Text (class name)
- "org.apache.lucene.document.Document"
Queries that would match
- "org.apache", "org.apache.lucene.document"

Text (class name + method signature)
- "org.apache.lucene.document.Document#add(Fieldable)"
Queries that would match
- "org.apache.lucene", "org.apache.lucene.document.Document#add"

Any thoughts on how to approach tokenizing these types of texts?
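
One possible direction, sketched here in plain Java with no Lucene calls, is to emit every hierarchical prefix of the text as its own token at index time, so that a query such as "org.apache" can be matched as a single exact term. The class name ClassNamePrefixTokens and the method prefixes are hypothetical names used only for illustration, and the choice to cut at '.', '#' and the first '(' is an assumption based on the examples above. Wiring this into an actual Analyzer is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: builds prefix tokens for a
// package name, class name, or "Class#method(Args)" signature.
class ClassNamePrefixTokens {

    // Returns each prefix that ends just before a '.', '#' or the first '(',
    // followed by the full text itself. Scanning stops at the first '(' so
    // that dots inside a parameter list do not produce extra prefixes.
    static List<String> prefixes(String text) {
        List<String> tokens = new ArrayList<>();
        for (int i = 1; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '.' || c == '#' || c == '(') {
                tokens.add(text.substring(0, i));
            }
            if (c == '(') {
                break;
            }
        }
        if (!text.isEmpty()) {
            tokens.add(text); // the complete text is always the last token
        }
        return tokens;
    }
}
```

With this sketch, prefixes("org.apache.lucene.document.Document#add(Fieldable)") includes both "org.apache.lucene" and "org.apache.lucene.document.Document#add", so those queries would match if the query string is kept as one untokenized term. The trade-off is a larger index, since each value contributes one token per level of nesting.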

