lucene-solr-user mailing list archives

From "Vauthrin, Laurent" <Laurent.Vauth...@disney.com>
Subject RE: SolrPlugin Guidance
Date Fri, 11 Dec 2009 17:08:30 GMT
It looks like the SolrQueryParser constructor accepts an analyzer as a
parameter.  That seems to do the trick.  Although feel free to respond
anyway if you have a comment on the approach :)

-----Original Message-----
From:
solr-user-return-30215-Laurent.Vauthrin=disney.com@lucene.apache.org
[mailto:solr-user-return-30215-Laurent.Vauthrin=disney.com@lucene.apache
.org] On Behalf Of Vauthrin, Laurent
Sent: Thursday, December 10, 2009 11:44 AM
To: solr-user@lucene.apache.org
Subject: RE: SolrPlugin Guidance

Ok, looks like I may not be taking the right approach here.  I'm running
into a problem.

Let's say a user is looking for all files in any directory 'foo' with a
directory description 'bar' 

q:+directory_name:foo +directory_description:bar

Our QParser plugin will perform queries against directory documents and
return any file document that has the matching directory id(s).  So the
plugin transforms the query to something like 

q:+(directory_id:4 directory_id:10) +directory_id:(4)

Note: directory_id is only in file documents.  The query above assumes
that two directories had the name 'foo' but only one had the description
'bar'

Currently the parser plugin is doing the lookup queries via the standard
request handler.  The problem with this approach is that the lookup
queries are going to be analyzed twice.  This only seems to be a problem
because we're using stemming.  For example, stemming 'franchise' gives
'franchis' and stemming it again gives 'franchi'.  The second stemming
will cause the query not to match anymore.
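
The double-analysis effect can be reproduced outside Solr with a toy
stemmer.  The sketch below is an illustration only: `stem` is a
hypothetical suffix stripper, not the stemming filter configured in the
schema, and it is chosen simply because it reproduces the 'franchise'
example above.

```java
// Toy illustration only: a crude suffix stripper standing in for a real stemmer.
class DoubleStemDemo {

    /** Strips one trailing "e" or "s" (hypothetical stand-in for the stemming filter). */
    static String stem(String term) {
        if (term.length() > 3 && (term.endsWith("e") || term.endsWith("s"))) {
            return term.substring(0, term.length() - 1);
        }
        return term;
    }

    public static void main(String[] args) {
        String indexed = stem("franchise");      // term as stored at index time
        String analyzedOnce = stem("franchise"); // query analyzed once
        String analyzedTwice = stem(analyzedOnce); // query analyzed a second time

        System.out.println(indexed.equals(analyzedOnce));  // true: the query matches
        System.out.println(indexed.equals(analyzedTwice)); // false: the second pass breaks the match
    }
}
```

Because this stemmer is not idempotent, a term only matches if it passes
through analysis exactly as many times at query time as it did at index
time.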

So basically my questions are:
1. Should I not be passing my lookup queries back to the request
handler, but instead to some lower level component?  If so, which
component would be good to look at?
2. Is there a way to tell the SolrQueryParser not to analyze, or a
different way to run the query so that the query analysis won't happen?

Thanks again,
Laurent Vauthrin

-----Original Message-----
From:
solr-user-return-30170-Laurent.Vauthrin=disney.com@lucene.apache.org
[mailto:solr-user-return-30170-Laurent.Vauthrin=disney.com@lucene.apache
.org] On Behalf Of Vauthrin, Laurent
Sent: Wednesday, December 09, 2009 2:53 PM
To: solr-user@lucene.apache.org
Subject: RE: SolrPlugin Guidance

Thanks for the response.  I went ahead and gave it a shot.  In my case,
the directory name may not be unique so if I get multiple ids back then
I create a BooleanQuery (Occur.SHOULD) to substitute the directory name
query.  This seems to work at the moment so hopefully that's the right
approach. 
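
A rough sketch of that substitution, written with plain strings rather
than Lucene classes so that it runs on its own.  `DirectoryIdClause` and
`toClause` are hypothetical names, and the output assumes the query
parser's default operator is OR, so that space-separated clauses behave
like Occur.SHOULD.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: turns the ids returned by the lookup query into an
// optional-clause group that replaces the original directory_name clause.
class DirectoryIdClause {

    static String toClause(String field, List<Integer> ids) {
        if (ids.isEmpty()) {
            // No directory matched; the caller has to decide what "no match" means.
            throw new IllegalArgumentException("no ids to substitute");
        }
        if (ids.size() == 1) {
            return field + ":" + ids.get(0);
        }
        StringBuilder clause = new StringBuilder("(");
        for (int i = 0; i < ids.size(); i++) {
            if (i > 0) {
                clause.append(' ');
            }
            clause.append(field).append(':').append(ids.get(i));
        }
        return clause.append(')').toString();
    }

    public static void main(String[] args) {
        // Two directories were named 'foo' in the lookup (ids 4 and 10).
        System.out.println("+" + toClause("directory_id", Arrays.asList(4, 10)));
    }
}
```

With a single id the group collapses to one required clause, which is
the unique-directory case described in the reply quoted below.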

Thanks,
Laurent Vauthrin


-----Original Message-----
From:
solr-user-return-30054-Laurent.Vauthrin=disney.com@lucene.apache.org
[mailto:solr-user-return-30054-Laurent.Vauthrin=disney.com@lucene.apache
.org] On Behalf Of Chris Hostetter
Sent: Monday, December 07, 2009 12:17 PM
To: solr-user@lucene.apache.org
Subject: RE: SolrPlugin Guidance


: e.g. For the following query that looks for a file in a directory:
: q=+directory_name:"myDirectory" +file_name:"myFile"
: 
: We'd need to decompose the query into the following two queries:
: 1. q=+directory_name:"myDirectory"&fl=directory_id
: 2. q=+file_name:"myFile" +directory_id:(results from query #1)
: 
: I guess I'm looking for the following feedback:
: - Does this sound crazy?  

it's a little crazy, but not absurd.

: - Is the QParser the right place for this logic?  If so, can I get a 
: little more guidance on how to decompose the queries there (filter 
: queries maybe)?

a QParser could work. (and in general, if you can solve something with a
QParser that's probably for the best, since it allows the most reuse).
but exactly how to do it depends on how many results you expect from your
first query:  if you are going to structure things so they have to
uniquely id a directory, and you'll have a single ID, then this is
something that could easily make sense in a QParser (you are essentially
just rewriting part of the query from string to id -- you just happen to
be using solr as a lookup table for those strings).

but if you plan to support any arbitrary "N" directories, then you may
need something more complicated ... straight filter queries won't help
much because you'll want the union instead of the intersection, so for
every directoryId you find, use it as a query to get a DocSet and then
maintain a running union of all those DocSets to use as your final
filter (hmm... that may not actually be possible with the QParser API
... i haven't looked at it in a while, but for an approach like this you
may need to subclass QueryComponent instead)
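
The running-union idea can be sketched with java.util.BitSet standing in
for a DocSet, one bit per document id.  This is only an illustration
under that assumption: `RunningUnionDemo` and the document ids are made
up, and inside Solr the per-directory sets would come from the searcher
rather than being built by hand.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

// BitSet is used here as a stand-in for a set of matching document ids.
class RunningUnionDemo {

    /** ORs every per-directory match set into one filter without modifying the inputs. */
    static BitSet unionAll(List<BitSet> perDirectoryMatches) {
        BitSet union = new BitSet();
        for (BitSet matches : perDirectoryMatches) {
            union.or(matches); // running union
        }
        return union;
    }

    public static void main(String[] args) {
        BitSet filesInDir4 = new BitSet();
        filesInDir4.set(1);
        filesInDir4.set(3);

        BitSet filesInDir10 = new BitSet();
        filesInDir10.set(3);
        filesInDir10.set(7);

        BitSet filter = unionAll(Arrays.asList(filesInDir4, filesInDir10));
        System.out.println(filter); // prints {1, 3, 7}
    }
}
```

Intersecting the sets instead (BitSet.and) would keep only document 3,
which is why chaining the per-directory lookups as ordinary filter
queries gives the wrong result here.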




-Hoss

