lucene-dev mailing list archives

From mark harwood <>
Subject Query.extractTerms - a poor introspection API?
Date Thu, 06 Apr 2006 09:52:13 GMT
Having switched the highlighter over from lots of
Query-specific code to using the generic
Query.extractTerms API I realize I have both gained
something (support for all query types) and lost
something (detailed boost info for each term in the
tree, e.g. fuzzy spelling variants). The boost info was
useful for selecting snippets and grading highlight

This exercise has led me to the conclusion that
extractTerms is not the greatest way to provide
information about queries.

I see a clear analogy with the way exceptions are/were
implemented in Java - there used to be no standard way
of unravelling nested exceptions and this was solved
in JDK1.4 by adding a "getCause()" method to
exceptions to allow progressive unravelling of all
exception types.
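
To make the analogy concrete, here is a minimal sketch of that
progressive unravelling using the standard Throwable.getCause()
method. The class and method names (CauseChain, messages) are made up
for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

class CauseChain {
    // Walks the chain one level at a time via getCause(), outermost first,
    // collecting each throwable's message along the way.
    static List<String> messages(Throwable t) {
        List<String> out = new ArrayList<>();
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            out.add(cur.getMessage());
        }
        return out;
    }

    public static void main(String[] args) {
        Throwable nested = new RuntimeException("search failed",
                new IllegalStateException("index closed"));
        System.out.println(messages(nested));
    }
}
```

The caller decides how deep to go and what to read at each level,
which is the property a flat "give me all the strings" method loses.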

Unfortunately, Query.extractTerms(Set) is a bit like
solving the Java nested exceptions problem by
providing a method like
Throwable.getMessageStrings(Set) - it only gives part
of the information about the tree elements (i.e. no
boost info) and provides no indication of the nested

Maybe we should have as a standard part of Query:

  //immediate child queries only
  Query [] getNestedQueries();

  //immediate terms only
  Term [] getTerms();

A generic highlighter implementation could then:

a) work with any query type
b) more accurately assess the score contribution each
term provides based on its position in the stack and
the boosts applied to each parent query on that branch
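
A rough sketch of how such a generic walk might look, assuming the two
proposed methods plus a boost accessor. Everything here is a
hypothetical stand-in rather than Lucene's actual API:
IntrospectableQuery, SimpleQuery and TermWeights are invented names,
terms are plain Strings instead of Term objects, and multiplying
boosts down a branch is a deliberate simplification of real scoring.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the proposed additions to Query (not Lucene's API).
interface IntrospectableQuery {
    float getBoost();
    IntrospectableQuery[] getNestedQueries(); // immediate child queries only
    String[] getTerms();                      // immediate terms only
}

// Minimal illustrative implementation: a boost, some terms, some children.
class SimpleQuery implements IntrospectableQuery {
    private final float boost;
    private final String[] terms;
    private final IntrospectableQuery[] children;

    SimpleQuery(float boost, String[] terms, IntrospectableQuery... children) {
        this.boost = boost;
        this.terms = terms;
        this.children = children;
    }

    public float getBoost() { return boost; }
    public IntrospectableQuery[] getNestedQueries() { return children; }
    public String[] getTerms() { return terms; }
}

class TermWeights {
    // Returns each term mapped to the product of the boosts on its branch.
    static Map<String, Float> collect(IntrospectableQuery root) {
        Map<String, Float> weights = new LinkedHashMap<>();
        walk(root, 1.0f, weights);
        return weights;
    }

    private static void walk(IntrospectableQuery q, float parentBoost,
                             Map<String, Float> weights) {
        float boost = parentBoost * q.getBoost();
        for (String term : q.getTerms()) {
            // If a term appears on several branches, keep the largest weight.
            weights.merge(term, boost, Float::max);
        }
        for (IntrospectableQuery child : q.getNestedQueries()) {
            walk(child, boost, weights);
        }
    }

    public static void main(String[] args) {
        // A fuzzy-style query whose spelling variant carries a lower boost.
        IntrospectableQuery exact = new SimpleQuery(1.0f, new String[] {"lucene"});
        IntrospectableQuery variant = new SimpleQuery(0.5f, new String[] {"lucine"});
        IntrospectableQuery fuzzy = new SimpleQuery(1.0f, new String[0], exact, variant);
        IntrospectableQuery root =
                new SimpleQuery(2.0f, new String[] {"highlight"}, fuzzy);
        System.out.println(collect(root));
    }
}
```

The walker never needs to know which concrete query type it is
visiting, and the per-term weights it produces are the kind of
information a highlighter could use for snippet selection.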

This doesn't seem a particularly onerous API to
implement and a more feature-rich Query introspection
API may well enable other applications such as Query

