lucene-dev mailing list archives

Subject Re: Search Expansion - one step closer ... !
Date Mon, 05 Apr 2004 11:03:19 GMT
Hi Erik ;-)

Thanks for your quick reply. Basically I am using the
XML indexing example found on the web which first
parses an XML file (I have XML files) and then uses
From the XMLDocumentHandlerSax source I can see that it
is using 'text' fields which is fine for me because I
have many small XML files that do not contain too much
text and I would prefer to have all XML tags indexed as
well as stored for hit highlighting purposes.

Using StandardAnalyzer is fine for my domain vocabulary
and I used my 'old' code with QueryParser and
experiments confirmed that indeed searches for "host
defense", "host-defense", "host_defense", "host
Defense" etc... all find "host defense" which is as it
stands in the XML.

So the only thing for me to do now is obviously to
apply the StandardAnalyzer to the boolean query
building. I looked into your demo where you compared
different analysers and created a TokenStream through
But there is an error in my query.add() - expecting
String not Stream. I know it's a pain with these stupid
guys but ... any suitable code snippet ?
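
To make the missing step concrete: each raw query string has to be run through the analyzer first, and the text of every resulting token (a String, not the stream itself) is what goes into a term clause. The sketch below shows only that flow in plain Java. It assumes a hypothetical analyze() helper that lower-cases and splits on non-alphanumeric characters as a rough stand-in for StandardAnalyzer, and a hypothetical buildClauses() that returns "field:token" strings instead of real TermQuery objects. None of these names come from Lucene.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class QueryTermSketch {

    // Rough stand-in for an analyzer (assumption, NOT Lucene's StandardAnalyzer):
    // lower-cases the input and splits it on anything that is not a letter or digit.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    // Hypothetical clause builder: emits one "field:token" string per analyzed
    // token. In real code each string would become a term clause of the
    // boolean query; here it is kept as plain text so the flow is visible.
    static List<String> buildClauses(String field, String[] rawTerms) {
        List<String> clauses = new ArrayList<>();
        for (String raw : rawTerms) {
            for (String token : analyze(raw)) {
                clauses.add(field + ":" + token);
            }
        }
        return clauses;
    }

    public static void main(String[] args) {
        String[] myquery = {"host-defense", "host_defense", "host Defense"};
        System.out.println(buildClauses("subject", myquery));
    }
}
```

The point of the two nested loops is that one input string may yield several tokens, so the query ends up with one clause per token rather than one clause per input string.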

I attach my code below.

When will your book be published ?

Thanks again,


public class SearchFiles1D {
  String[] lucene_out;
  TokenStream stream;

  public String[] doSearchBQ(String index_path, String[] myquery) {
    // does query processing without QueryParser but by
    // constructing a boolean query
    try {
      Searcher searcher = new IndexSearcher(index_path);
      Analyzer analyzer = new StandardAnalyzer();
      BooleanQuery query = new BooleanQuery();
      // for each term to add:
      for (int j = 0; j < myquery.length; j++) {
        stream = analyzer.tokenStream("contents", new
        query.add(new TermQuery(new Term("subject", stream)), false, false);
      }
      Hits hits =;
      lucene_out = new String[hits.length()];
      for (int i = 0; i < hits.length(); i++) {
        Document doc = hits.doc(i);
        String name = doc.get("filename");
        lucene_out[i] = name + "|" + doc.get("subject") + "|" + doc.get("message");
      }
    } catch (Exception e) {
      System.out.println(" caught a " + e.getClass() +
                         "\n with message: " + e.getMessage());
    }
    return lucene_out;
  }
}
