lucene-dev mailing list archives

From mark harwood <markharw...@yahoo.co.uk>
Subject Re: "Advanced" query language
Date Thu, 22 Dec 2005 15:26:06 GMT
Hi Chris,
Thanks for taking the time to review this.
> 1) I applaud the pluggable nature of your solution.

That's definitely a worthwhile objective.

> 2) Digging into what was involved in writing an
> ObjectBuilder, I found...
> don't really feel like
> the API has a very clean separation from SAX.


True. The efforts to remove state management were
entirely around the hand-off between one ObjectBuilder
and any "child" ObjectBuilders - i.e. thinking of the
processing chain like the cartoon where one big fish
is about to eat a smaller fish, which is about to eat
a smaller fish, and so on. The parser handles the
stack of individual ObjectBuilders and their
consumption, thus relieving one level of the SAX-based
state management that a "just one big fish" approach
to parsing would require. However, each individual
ObjectBuilder is still responsible for handling the
SAX APIs to configure itself. My assumption was that
SAX would be a familiar API, but I guess that may be
wrong.
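
To make the "fish eating fish" hand-off concrete, here is a
rough sketch of a parser-owned stack. It is not the code I
posted: NodeBuilder, StackingHandler and the string results
are hypothetical stand-ins for ObjectBuilders and Query
objects, and only the standard JDK SAX classes are assumed.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class BuilderStackSketch {

    /** Hypothetical per-element builder: collects child results, then builds its own. */
    static class NodeBuilder {
        final String name;
        final List<String> children = new ArrayList<>();

        NodeBuilder(String name) { this.name = name; }

        /** Receives the finished result of a child builder. */
        void consume(String childResult) { children.add(childResult); }

        /** Leaf elements render as their name; others append their children. */
        String build() { return children.isEmpty() ? name : name + children; }
    }

    /** The handler owns the stack, so builders never track nesting themselves. */
    static class StackingHandler extends DefaultHandler {
        private final Deque<NodeBuilder> stack = new ArrayDeque<>();
        String result;

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            stack.push(new NodeBuilder(qName));
        }

        @Override
        public void endElement(String uri, String local, String qName) {
            String built = stack.pop().build();
            if (stack.isEmpty()) {
                result = built;               // root element finished
            } else {
                stack.peek().consume(built);  // the bigger fish eats the smaller one
            }
        }
    }

    /** Parses the XML and returns the string built for the root element. */
    static String parse(String xml) throws Exception {
        StackingHandler handler = new StackingHandler();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.result;
    }
}
```

In this sketch the builders see no nesting at all; a real
ObjectBuilder would still read its own attributes through
SAX, which is the part Chris is objecting to.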

> I think that the ideal API wouldn't require people
> writing ObjectBuilders
> to know anything about sax, or to ever need to
> import anything from
> org.xml.** or javax.xml.**

Fair enough. I presume we want to maintain the
position that Lucene should not have any dependencies
other than JDK1.4?
I did look at Commons Digester, but that seemed to want
to suck in logging, beanutils etc., so I embarked on my
own lightweight SAX-based implementation.

> 
> 3) While the *need* to maintain/pass state
> information should be avoided.
> I can definitely think of uses for this framework
> that may *want* to pass
> state information -- both down to the ObjectBuilders
> that get used in
> inner nodes, as well as up to wrapping nodes, and
> there doesn't seem to be
> an easy way to do that. 

State "passed down" is something I saw as a potential
addition to the "Parser" object shared by all
ObjectBuilders, e.g. a Map associated with each stack
level.
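
As a minimal sketch of what a Map per stack level could
look like (ScopedState and its method names are made up
for illustration, nothing like this exists in the code I
posted): reads fall through to the enclosing level, writes
stay in the current one.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-stack-level state: reads delegate upwards, writes stay local. */
public class ScopedState {
    private final ScopedState parent;   // null for the root level
    private final Map<String, Object> local = new HashMap<>();

    public ScopedState() { this(null); }

    private ScopedState(ScopedState parent) { this.parent = parent; }

    /** Would be called when the parser pushes a new builder onto its stack. */
    public ScopedState push() { return new ScopedState(this); }

    /** Would be called when the parser pops a builder; returns the enclosing level. */
    public ScopedState pop() { return parent; }

    /** Stores a value in this level only; enclosing levels are never modified. */
    public void put(String key, Object value) { local.put(key, value); }

    /** Looks in this level first, then in each enclosing level in turn. */
    public Object get(String key) {
        if (local.containsKey(key)) {
            return local.get(key);
        }
        return parent == null ? null : parent.get(key);
    }
}
```

A builder several levels down could then read a default
set by an outer tag without the levels in between knowing
about it, and any override it stores disappears again when
its level is popped.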

> The best example i can give is if someone (ie: me)
> wanted to use this
> framework to allow boolean queries to be written
> like this...
> 
>    <BooleanQuery>
>       <TermQuery occurs="mustNot" field="contents" value="mustNot"/>
>       <UserInput occurs="must">"a phrase" fuzzy~</UserInput>
>    </BooleanQuery>

I did consider that my version of BooleanQuery could
be written slightly more succinctly as:

<BooleanQuery>
   <MustNot><TermQuery field="contents" value="foo"/></MustNot>
   <UserInput occurs="must">"a phrase" fuzzy~</UserInput>
</BooleanQuery>



> 
> ...i want to be able to write a
> "BooleanClauseWrapperObjectBuilder" that can be wrapped
> around any other ObjectBuilder and will return whatever
> object it does, but will also check for an "occurs"
> attribute, and put that in a state bucket somewhere that
> the BooleanQuery has access to it when adding the Query
> it gets back.
> 
> Going the opposite direction, I'd like to be able to
> have tags that set state which is accessible to
> descendant tags (even if the tags in the middle don't
> know anything about that bit of state). for example:
> specifying how much slop should be used by default
> in phrase queries...
> 
>    <StateModifier defaultPhraseSlop="100">
>       ...
>       <BooleanQuery>
>          <PhraseQuery occurs="mustNot" field="contents">
>             How Now Brown Cow?
>          </PhraseQuery>
>          ...
>       </BooleanQuery>
>    </StateModifier>
> 
> 
> I haven't had a chance to try implementing this, but
> at a high level, it seems like all of this should be
> possible and still easy to use.
> Here's a real rough cut at what i've had floating
> around in the back of my head (I'm doing this straight
> into email, pardon any typos or pseudo code) ...
> 
> 
> 
> /** could be implemented with SAX, or DOM, or Pull */
> public interface LuceneXmlParser {
>     /** this method will call setParser(this) on each handler */
>     public void registerHandler(String tag, LuceneXmlHandler h);
>     /**
>      primary method for clients, parses the xml and calls processNode
>      on the root node
>      */
>     public Query parse(InputStream xml);
>     /**
>      dispatches to the appropriate handler's process method based
>      on the Node name, may be called by handlers for recursion of
>      children nodes
>      */
>     public Query processNode(LuceneXmlNode n, State s);
> }
> public interface LuceneXmlHandler {
>     public void setParser(LuceneXmlParser p);
>     /**
>      should return a Query that corresponds to the specified node.
>      may read/modify state in any way it wants ... it is recommended
>      that all implementing methods wrap their state before passing it
>      on when processing children.
>      */
>     public Query process(LuceneXmlNode n, State s);
> }
> /**
>  A State is a stack frame that can delegate read operations to
>  another State it wraps (if there is one), but it cannot delegate
>  modifying operations.
>  Classes implementing State should provide a constructor that takes
>  another State to wrap.
> */
> public interface State extends Map<String,Object> {
>    /**
>     for callers that want to know what's in the immediate stack
>     frame without any delegation
>     */
>    public Map<String,Object> getOuterFrame();
>    /** should return a new state that wraps the current state */
>    public State wrapCurrentState();
> }
> /** a very simple api around the most basic xml concepts */
> public interface LuceneXmlNode {
>    public CharSequence getNodeName();
>    public Map<String,String> getAttributes();
>    public CharSequence getBodyText();
>    public Iterator<LuceneXmlNode> getChildren();
> }
> /** an example handler for TermQuery */
> public class TermQueryHandler implements LuceneXmlHandler {
>    LuceneXmlParser p;
>    public void setParser(LuceneXmlParser q) { p=q; }
>    public Query process(LuceneXmlNode n, State s) {
>      Map<String,String> attrs = n.getAttributes();
>      return new TermQuery(new Term(attrs.get("field"), attrs.get("value")));
>    }
> }
> /** an example handler for BooleanQuery */
> public class BooleanQueryHandler implements LuceneXmlHandler {
>    LuceneXmlParser p;
>    public void setParser(LuceneXmlParser q) { p=q; }
>    public Query process(LuceneXmlNode n, State s) {
>      BooleanQuery r = new BooleanQuery();
> 
=== message truncated ===



		