lucene-java-user mailing list archives

From Eduardo Manrique <edua...@s1mbi0se.com.br>
Subject Re: MUST operator across grouped documents
Date Wed, 20 May 2015 21:10:02 GMT
I was searching for a solution to this cross-document AND and found this:

TwoPhaseIterator:
Description
Currently some scorers have to do a lot of per-document work to determine if a document is
a match. The simplest example is a phrase scorer, but there are others (spans, sloppy phrase,
geospatial, etc).
Imagine a conjunction with two MUST clauses, one that is a term that matches all odd documents,
another that is a phrase matching all even documents. Today this conjunction will be very
expensive, because the zig-zag intersection is reading a ton of useless positions.
The same problem happens with filteredQuery and anything else that acts like a conjunction.
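
The idea in the description above can be sketched in plain Java. This is only an illustration under stated assumptions, not Lucene's API: the `Clause` class, its `confirmCalls` counter, and the `conjunction` helper are hypothetical names. In Lucene itself, `TwoPhaseIterator` exposes an `approximation()` iterator and a `matches()` method; the sketch mimics that split with two predicates so that the costly check only runs on documents that pass every cheap check.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.IntPredicate;

/** Plain-Java illustration of two-phase matching; this is not Lucene's API. */
class TwoPhaseSketch {

    /** A clause split into a cheap candidate check and a costly confirmation. */
    static final class Clause {
        final IntPredicate approximation; // cheap, e.g. "all phrase terms occur in the doc"
        final IntPredicate confirm;       // costly, e.g. "the positions really form the phrase"
        int confirmCalls = 0;             // counts how often the costly check ran

        Clause(IntPredicate approximation, IntPredicate confirm) {
            this.approximation = approximation;
            this.confirm = confirm;
        }

        boolean matches(int doc) {
            confirmCalls++;
            return confirm.test(doc);
        }
    }

    /** Returns the doc IDs in [0, maxDoc) that every clause matches. */
    static List<Integer> conjunction(int maxDoc, List<Clause> clauses) {
        List<Integer> hits = new ArrayList<>();
        for (int doc = 0; doc < maxDoc; doc++) {
            final int d = doc;
            // Phase 1: intersect the cheap approximations first.
            if (!clauses.stream().allMatch(c -> c.approximation.test(d))) {
                continue;
            }
            // Phase 2: run the costly checks only on surviving candidates.
            if (clauses.stream().allMatch(c -> c.matches(d))) {
                hits.add(doc);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Term clause: matches all odd documents.
        Clause term = new Clause(doc -> doc % 2 == 1, doc -> true);
        // Phrase clause: its approximation passes all even documents.
        Clause phrase = new Clause(doc -> doc % 2 == 0, doc -> true);

        List<Integer> hits = conjunction(10, Arrays.asList(term, phrase));
        // prints: hits=[], phrase confirmations=0
        System.out.println("hits=" + hits + ", phrase confirmations=" + phrase.confirmCalls);
    }
}
```

With the odd/even example from the description, no document passes both approximations, so the costly confirmation is never invoked.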

——

But I couldn’t figure out how it works. I didn’t find any examples, and the docs don’t
show much. Do you guys know if it can help in my case? If so, where can I find an example showing
how to use it?
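
To restate the case concretely: the cross-document AND amounts to collecting, for each clause, the parentId values of the matching documents and then intersecting those sets. A minimal sketch in plain Java, not Lucene code; the `Doc` class and the `parentsMatching` and `crossDocAnd` helpers are hypothetical names used only for illustration, and the linear scan stands in for whatever index lookup would be used in practice.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Plain-Java illustration of an AND across documents sharing a parentId. */
class CrossDocAndSketch {

    /** Hypothetical stand-in for an indexed document holding one field. */
    static final class Doc {
        final String field;
        final String value;
        final int parentId;

        Doc(String field, String value, int parentId) {
            this.field = field;
            this.value = value;
            this.parentId = parentId;
        }
    }

    /** Collects the parentId of every document that satisfies one clause. */
    static Set<Integer> parentsMatching(List<Doc> docs, String field, String value) {
        Set<Integer> parents = new TreeSet<>();
        for (Doc doc : docs) {
            if (doc.field.equals(field) && doc.value.equals(value)) {
                parents.add(doc.parentId);
            }
        }
        return parents;
    }

    /** Each clause is a {field, value} pair; a group must satisfy all of them. */
    static Set<Integer> crossDocAnd(List<Doc> docs, List<String[]> clauses) {
        Set<Integer> result = null;
        for (String[] clause : clauses) {
            Set<Integer> parents = parentsMatching(docs, clause[0], clause[1]);
            if (result == null) {
                result = parents;          // first clause seeds the result
            } else {
                result.retainAll(parents); // later clauses narrow it
            }
        }
        return result == null ? new TreeSet<Integer>() : result;
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(
            new Doc("field1", "a", 1),
            new Doc("field2", "b", 1),
            new Doc("field1", "a", 2));
        List<String[]> query = Arrays.asList(
            new String[] {"field1", "a"},
            new String[] {"field2", "b"});
        System.out.println(crossDocAnd(docs, query)); // prints [1]
    }
}
```

Only the group with parentId 1 contains both a `field1=a` document and a `field2=b` document, so it is the only group returned.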

Thanks,
Eduardo Manrique

> On May 19, 2015, at 3:21 PM, Eduardo Manrique <eduardo@s1mbi0se.com.br> wrote:
> 
> I think it is not possible. There is another problem with putting everything in one document:
> we also need to filter by the number of times a user accessed a URL in a period of time. The
> document would be too big. There is other information we gather too.
> Do you think it is possible to solve this with a custom query?
> 
>> On May 19, 2015, at 2:57 PM, Erick Erickson <erickerickson@gmail.com> wrote:
>> 
>> Eduardo:
>> 
>> Just noticed that we got off the user's list, so replying to the
>> user's list to move the conversation back there.
>> 
>> In the case you outlined, I don't think there's much choice except to
>> index the URLs in a multiValued field. Otherwise the use-case of
>> asking for a user that's visited URL1 AND URL2 doesn't really work.
>> This has the downside that each time you add a URL to the sites a user
>> has visited, you have to re-index the entire document.
>> 
>> What update rate do you expect? And how hard is it to pull all the
>> URLs visited by a user out?
>> 
>> You can also use "Atomic updates" to add a URL to an existing field,
>> but that requires that you satisfy some special conditions, mainly that
>> all the original fields must have 'stored="true"'.
>> 
>> Best,
>> Erick
>> 
>> On Mon, May 18, 2015 at 4:33 PM, Eduardo Manrique
>> <eduardo@s1mbi0se.com.br> wrote:
>>> Hi,
>>> 
>>> I need to search using a group join with AND across fields in different documents.
>>> For example, I might have the documents:
>>>       doc1: field1=a, parentId=1
>>>       doc2: field2=b, parentId=1
>>> 
>>> What I need is to make a join using parentId and a search like:
>>>       field1 = a AND field2 = b grouping by parentId
>>> 
>>> In this case I should get the group with parentId 1.
>>> I noticed that BooleanQuery with MUST will only work if field1 and field2 were
>>> in the same document. Is it possible to do that?
>>> 
>>> Note: I am working with Lucene 5.1
>>> 
>>> Thanks,
>>> Eduardo Manrique
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>> 
>> 
> 

