lucene-solr-user mailing list archives

From dar...@ontrenet.com
Subject RE: seemingly impossible query
Date Thu, 20 May 2010 16:55:26 GMT
The problem here, I think, is that you only want 1 of many _results_ for a
particular ID. How would Solr know which result you want to keep? And
which to throw away?

However...

You can do this in two queries if you want. At index time, also write a
separate Solr document for each distinct listOfIds value, using that value
as the document's unique ID (so one such document per unique id).

On that _id document_, store a field holding the ID of the real document
being indexed.

Each time an _id document_ is rewritten with a new <document> id, it
overwrites the prior data for that unique ID, so it always points at the
most recently indexed real document.

Now, you first query the _id documents_ using the 100 ids you receive.
Each has a reference to a _single_ real document. Then you take the
<document> field of each of those and write a single query to get all the
"last indexed" real documents for those ids.

It would work.
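
A rough sketch of the two-query lookup in Python, in case it helps. The
field names `id` and `latest_doc_id` are made up for illustration (the
thread does not name them), the ids are assumed to contain no double
quotes, and the actual HTTP calls to Solr are left out; only the query
strings and the step in between are shown.

```python
from typing import Dict, Iterable, List


def or_query(field: str, values: Iterable[str]) -> str:
    """Build a Solr query matching any of the given values on one field."""
    # Values are quoted as phrases; assumes no embedded double quotes.
    return field + ":(" + " OR ".join(f'"{v}"' for v in values) + ")"


def extract_targets(pointer_docs: List[Dict[str, str]]) -> List[str]:
    """Collect the real-document ids stored on the pointer documents.

    'latest_doc_id' is a hypothetical field name. Duplicates are dropped
    (two ids may point at the same real document) and order is preserved.
    """
    seen = set()
    targets = []
    for doc in pointer_docs:
        target = doc.get("latest_doc_id")
        if target and target not in seen:
            seen.add(target)
            targets.append(target)
    return targets


# Query 1: fetch the pointer documents for the ids you were given.
first_query = or_query("id", ["1", "2", "56"])

# Pretend this is the 'docs' list from Solr's response to query 1.
pointer_docs = [
    {"id": "1", "latest_doc_id": "docA"},
    {"id": "2", "latest_doc_id": "docB"},
    {"id": "56", "latest_doc_id": "docA"},
]

# Query 2: fetch the real documents those pointers refer to.
second_query = or_query("id", extract_targets(pointer_docs))
```

On each request you would also set rows to at least the number of ids in
the query, so nothing gets cut off by the default page size.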

> Yeah I need something like:
> (id:1 and maxhits:1) OR (id:2 and maxhits:1).. something crazy like that..
>
> I'm not sure how I can hit solr once. If I do try and do them all in one
> big OR query then I'm probably not going to get a hit for each ID. I would
> need to request probably 1000 documents to find all 100 and even then
> there's no guarantee and no way of knowing how deep to go.
>
> -Kallin Nagelberg
>
> -----Original Message-----
> From: darren@ontrenet.com [mailto:darren@ontrenet.com]
> Sent: Thursday, May 20, 2010 12:27 PM
> To: solr-user@lucene.apache.org
> Subject: RE: seemingly impossible query
>
> I see. Well, now you're asking Solr to ignore its prime directive of
> returning hits that match a query. Hehe.
>
> I'm not sure if Solr has a "unique" attribute.
>
> But this sounds, to me, like you will have to filter the results yourself.
> But at least you hit Solr only once before doing so.
>
> Good luck!
>
>> Thanks Darren,
>>
>> The problem with that is that it may not return one document per id, which
>> is what I need. I.e., I could give 100 ids in that OR query and retrieve
>> 100 documents, all containing just 1 of the IDs.
>>
>> -Kallin Nagelberg
>>
>> -----Original Message-----
>> From: darren@ontrenet.com [mailto:darren@ontrenet.com]
>> Sent: Thursday, May 20, 2010 12:21 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: seemingly impossible query
>>
>> Ok. I think I understand. What's impossible about this?
>>
>> If you have a single field name called <id> that is multivalued
>> then you can retrieve the documents with something like:
>>
>> id:1 OR id:2 OR id:56 ... id:100
>>
>> then set rows=100 to limit the results.
>>
>> There's probably a more succinct way to do this, but I'll leave that to
>> the experts.
>>
>> If you also only want the documents within a certain time, then you also
>> create a <time> field and use a conjunction
>> (id:0 ...) AND time:[NOW-1HOUR TO NOW]
>> or something similar to this. Check the query syntax wiki for specifics.
>>
>> Darren
>>
>>
>>> Hey everyone,
>>>
>>> I've recently been given a requirement that is giving me some trouble.
>>> I need to retrieve up to 100 documents, but I can't see a way to do it
>>> without making 100 different queries.
>>>
>>> My schema has a multi-valued field like 'listOfIds'. Each document has
>>> between 0 and N of these ids associated with it.
>>>
>>> My input is up to 100 of these ids at random, and I need to retrieve
>>> the most recent document for each id (N Ids as input, N docs returned).
>>> I'm currently planning on doing a single query for each id, requesting
>>> 1 row, and caching the result. This could work OK since some of these
>>> ids should repeat quite often. Of course I would prefer to find a way
>>> to do this in Solr, but I'm not sure it's capable.
>>>
>>> Any ideas?
>>>
>>> Thanks,
>>> -Kallin Nagelberg
>>>
>>
>>
>
>

