lucene-dev mailing list archives

From Paul Smith <psm...@aconex.com>
Subject Re: "Advanced" query language
Date Thu, 22 Dec 2005 00:25:12 GMT
Hey all,

I haven't been paying close attention to this thread, but if any
of you are looking for something that does _easy_ Object->XML->Object,
you should seriously try XStream (http://xstream.codehaus.org).
It has the simplest API I've seen, and it's BSD licensed (Apache friendly).
You can register a Converter class to handle anything the built-in
converters don't handle well. The Converter code is nice and elegant.

Just something to think about maybe?

cheers,

Paul
On 22/12/2005, at 11:20 AM, Chris Hostetter wrote:

>
> I finally got a chance to look at this code today (the best part about
> the last day before vacation is that no one expects you to get anything
> done, so you can ignore your "real work" and spend time on things that
> are more important in the long run), and while I still haven't wrapped
> my head around all of it, I wanted to share my thoughts so far on the
> API...
>
> 1) I applaud the pluggable nature of your solution. Looking at the test
> case, it is easy to see exactly how a service provider could do things
> like override the behavior of a <PhraseQuery> to be implemented as a
> SpanQuery without their clients being affected at all.  Kudos.
>
> 2) Digging into what was involved in writing an ObjectBuilder, I found
> the API somewhat confusing.  I was reminded of this exchange you had
> with Yonik...
>
> : > While SAX is fast, I've found callback interfaces
> : > more difficult to
> : > deal with while generating nested object graphs...
> : > it normally
> : > requires one to maintain state in stack(s).
> :
> : I've gone to some trouble to avoid the effects of this
> : on the programming model.
>
> As someone who feels very comfortable with Lucene, but has no
> practical experience with SAX, I have to say that I don't really feel
> like the API has a very clean separation from SAX.
>
> I think that the ideal API wouldn't require people writing
> ObjectBuilders to know anything about SAX, or to ever need to import
> anything from org.xml.** or javax.xml.**
>
>
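
For what it's worth, a DOM-backed adapter is one way to keep org.xml.*
and javax.xml.* out of handler code. Below is a minimal sketch using only
the JDK's DOM parser. DomXmlNode and its parse() helper are hypothetical
names, loosely mirroring the LuceneXmlNode idea further down in the quoted
message; none of this is from the actual patch.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/** Hypothetical DOM-backed node; handlers would only ever see these methods. */
class DomXmlNode {
    private final Element element;

    DomXmlNode(Element element) { this.element = element; }

    /** Parses a whole document and wraps its root element. */
    static DomXmlNode parse(String xml) {
        try {
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            return new DomXmlNode(DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes))
                    .getDocumentElement());
        } catch (Exception e) {
            // Parser exceptions are wrapped so callers never import org.xml.*
            throw new IllegalArgumentException("could not parse query xml", e);
        }
    }

    String getNodeName() { return element.getTagName(); }

    Map<String, String> getAttributes() {
        Map<String, String> out = new LinkedHashMap<>();
        NamedNodeMap attrs = element.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node a = attrs.item(i);
            out.put(a.getNodeName(), a.getNodeValue());
        }
        return out;
    }

    /** Text directly inside this element, trimmed; child elements are skipped. */
    String getBodyText() {
        StringBuilder sb = new StringBuilder();
        NodeList kids = element.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            Node k = kids.item(i);
            if (k.getNodeType() == Node.TEXT_NODE
                    || k.getNodeType() == Node.CDATA_SECTION_NODE) {
                sb.append(k.getNodeValue());
            }
        }
        return sb.toString().trim();
    }

    /** Child elements only, in document order. */
    List<DomXmlNode> getChildren() {
        List<DomXmlNode> out = new ArrayList<>();
        NodeList kids = element.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            if (kids.item(i) instanceof Element) {
                out.add(new DomXmlNode((Element) kids.item(i)));
            }
        }
        return out;
    }
}

class DomXmlNodeDemo {
    public static void main(String[] args) {
        DomXmlNode root = DomXmlNode.parse(
            "<BooleanQuery>"
          + "<TermQuery occurs=\"mustNot\" field=\"contents\" value=\"mustNot\"/>"
          + "<UserInput occurs=\"must\">\"a phrase\" fuzzy~</UserInput>"
          + "</BooleanQuery>");
        // Walk the children the way a BooleanQuery handler might
        for (DomXmlNode kid : root.getChildren()) {
            System.out.println(kid.getNodeName() + " " + kid.getAttributes()
                    + " [" + kid.getBodyText() + "]");
        }
    }
}
```

The adapter itself still touches org.w3c.dom, but only in one place; a
SAX or pull-based implementation could sit behind the same four methods.
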
> 3) While the *need* to maintain/pass state information should be
> avoided, I can definitely think of uses for this framework that may
> *want* to pass state information -- both down to the ObjectBuilders
> that get used in inner nodes, as well as up to wrapping nodes -- and
> there doesn't seem to be an easy way to do that.  (It could just be my
> lack of SAX knowledge, though.)
>
> The best example I can give is if someone (i.e., me) wanted to use this
> framework to allow boolean queries to be written like this...
>
>    <BooleanQuery>
>       <TermQuery occurs="mustNot" field="contents" value="mustNot"/>
>       <UserInput occurs="must">"a phrase" fuzzy~</UserInput>
>    </BooleanQuery>
>
> ...I want to be able to write a "BooleanClauseWrapperObjectBuilder"
> that can be wrapped around any other ObjectBuilder and will return
> whatever object it does, but will also check for an "occurs" attribute
> and put that in a state bucket somewhere that the BooleanQuery has
> access to when adding the Query it gets back.
>
> Going the opposite direction, I'd like to be able to have tags that set
> state which is accessible to descendant tags (even if the tags in the
> middle don't know anything about that bit of state).  For example:
> specifying how much slop should be used by default in phrase queries...
>
>    <StateModifier defaultPhraseSlop="100">
>       ...
>       <BooleanQuery>
>          <PhraseQuery occurs="mustNot" field="contents">
>             How Now Brown Cow?
>          </PhraseQuery>
>          ...
>       </BooleanQuery>
>    </StateModifier>
>
>
> I haven't had a chance to try implementing this, but at a high level,
> it seems like all of this should be possible and still easy to use.
> Here's a real rough cut at what I've had floating around in the back
> of my head (I'm doing this straight into email, pardon any typos or
> pseudocode) ...
>
>
>
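> import java.io.InputStream;
> import java.util.Iterator;
> import java.util.Map;
> import org.apache.lucene.index.Term;
> import org.apache.lucene.search.BooleanClause;
> import org.apache.lucene.search.BooleanQuery;
> import org.apache.lucene.search.Query;
> import org.apache.lucene.search.TermQuery;
>
> /** could be implemented with SAX, or DOM, or Pull */
> public interface LuceneXmlParser {
>     /** this method will call setParser(this) on each handler */
>     public void registerHandler(String tag, LuceneXmlHandler h);
>     /**
>      primary method for clients, parses the xml and calls processNode
>      on the root node
>      */
>     public Query parse(InputStream xml);
>     /**
>      dispatches to the appropriate handler's process method based
>      on the Node name, may be called by handlers for recursion of
>      children nodes
>      */
>     public Query processNode(LuceneXmlNode n, State s);
> }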
> public interface LuceneXmlHandler {
>     public void setParser(LuceneXmlParser p);
>     /**
>      should return a Query that corresponds to the specified node.
>      may read/modify state in any way it wants ... it is recommended
>      that all implementing methods wrap their state before passing it
>      on when processing children.
>      */
>     public Query process(LuceneXmlNode n, State s);
> }
> /**
>  A State is a stack frame that can delegate read operations to another
>  State it wraps (if there is one), but it cannot delegate modifying
>  operations.
>  Classes implementing State should provide a constructor that takes
>  another State to wrap.
> */
> public interface State extends Map<String,Object> {
>    /**
>     for callers that want to know what's in the immediate stack
>     frame without any delegation
>     */
>    public Map<String,Object> getOuterFrame();
>    /** should return a new state that wraps the current state */
>    public State wrapCurrentState();
> }
> /** a very simple api around the most basic xml concepts */
> public interface LuceneXmlNode {
>    public CharSequence getNodeName();
>    public Map<String,String> getAttributes();
>    public CharSequence getBodyText();
>    public Iterator<LuceneXmlNode> getChildren();
> }
> /** an example handler for TermQuery */
> public class TermQueryHandler implements LuceneXmlHandler {
>    LuceneXmlParser p;
>    public void setParser(LuceneXmlParser q) { p=q; }
>    public Query process(LuceneXmlNode n, State s) {
>      Map<String,String> attrs = n.getAttributes();
>      return new TermQuery(new Term(attrs.get("field"),
>                                    attrs.get("value")));
>    }
> }
> /** an example handler for BooleanQuery */
> public class BooleanQueryHandler implements LuceneXmlHandler {
>    LuceneXmlParser p;
>    public void setParser(LuceneXmlParser q) { p=q; }
>    public Query process(LuceneXmlNode n, State s) {
>      BooleanQuery r = new BooleanQuery();
>      /* minShouldMatch is optional, so only apply it when present */
>      String min = n.getAttributes().get("minShouldMatch");
>      if (min != null) {
>         r.setMinimumNumberShouldMatch(Integer.parseInt(min));
>      }
>      Iterator<LuceneXmlNode> kids = n.getChildren();
>      while (kids.hasNext()) {
>         /* a fresh frame per child keeps "occurs" from leaking to siblings */
>         State kidState = s.wrapCurrentState();
>         Query b = p.processNode(kids.next(), kidState);
>         BooleanClause.Occur o = BooleanClause.Occur.SHOULD;
>         if (kidState.getOuterFrame().containsKey("occurs")) {
>             /* assumes the wrapper stored an already-parsed Occur */
>             o = (BooleanClause.Occur) kidState.getOuterFrame().get("occurs");
>         }
>         r.add(b,o);
>      }
>      return r;
>    }
> }
> /**
>  an example handler that can wrap any other handler and give it
>  BooleanClause.Occur awareness
> */
> public class BooleanClauseWrapperHandler implements LuceneXmlHandler {
>    LuceneXmlParser p;
>    LuceneXmlHandler inner;
>    public BooleanClauseWrapperHandler(LuceneXmlHandler i) { inner = i; }
>    /* the wrapped handler needs the parser too, so pass it along */
>    public void setParser(LuceneXmlParser q) { p=q; inner.setParser(q); }
>    public Query process(LuceneXmlNode n, State s) {
>       Query q = inner.process(n, s);
>       if (n.getAttributes().containsKey("occurs")) {
>         /* glossing over string parsing to object construction here */
>         s.put("occurs",n.getAttributes().get("occurs"));
>       }
>       return q;
>    }
> }
>
>
> ...does that make sense?
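
It does to me. Here is a minimal sketch of how such a State could behave,
assuming one plain HashMap per frame. FrameState is a hypothetical name,
and unlike the interface proposed above it does not implement Map; it
only shows reads falling through to the wrapped frame while writes stay
local.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the proposed State: one frame per nesting level. */
class FrameState {
    private final FrameState parent;   // the wrapped state, or null at the root
    private final Map<String, Object> frame = new HashMap<>();

    FrameState() { this(null); }

    FrameState(FrameState parent) { this.parent = parent; }

    /** Reads fall through to the wrapped state when this frame has no entry. */
    Object get(String key) {
        if (frame.containsKey(key)) {
            return frame.get(key);
        }
        return parent == null ? null : parent.get(key);
    }

    /** Writes only ever touch this frame, never the wrapped one. */
    void put(String key, Object value) { frame.put(key, value); }

    /** The immediate frame, without any delegation. */
    Map<String, Object> getOuterFrame() { return frame; }

    /** A new frame whose reads delegate to this one. */
    FrameState wrapCurrentState() { return new FrameState(this); }
}

class FrameStateDemo {
    public static void main(String[] args) {
        FrameState root = new FrameState();
        root.put("defaultPhraseSlop", 100);   // as an outer <StateModifier> might

        FrameState kid = root.wrapCurrentState();
        kid.put("occurs", "mustNot");         // as a wrapper handler might

        // Inherited from the wrapped frame
        System.out.println(kid.get("defaultPhraseSlop"));
        // Visible to the parent handler that created the frame
        System.out.println(kid.getOuterFrame().containsKey("occurs"));
        // Not leaked into the wrapped frame
        System.out.println(root.get("occurs"));
    }
}
```

Passing state down is then just a put() before wrapping, and passing it
up is a put() by the child into the frame its parent handed it.
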
>
>
>
> -Hoss
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>



