lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Peronalized Search Results or Matching Documents to Users
Date Sat, 01 Aug 2015 13:02:36 GMT
How soon? It's pretty much done AFAIK, but the folks trying to work on
it have had their priorities re-arranged.

So I really don't have a date.

Erick

On Fri, Jul 31, 2015 at 4:59 PM, Upayavira <uv@odoko.co.uk> wrote:
> How soon? And will you be able to use them for querying, or just
> faceting/sorting/displaying?
>
> Thx!
>
> Upayavira
>
> On Fri, Jul 31, 2015, at 09:27 PM, Erick Erickson wrote:
>> And coming soon will be docvalues field updates that don't require
>> reindexing the whole doc.
>>
>> Best,
>> Erick
>> On Jul 31, 2015 6:51 AM, "Upayavira" <uv@odoko.co.uk> wrote:
>>
>> > On Thu, Jul 30, 2015, at 07:29 PM, Shawn Heisey wrote:
>> > > On 7/30/2015 10:46 AM, Robert Farrior wrote:
>> > > > We have a requirement to be able to have a master product catalog and
>> > > > to create a sub-catalog of products per user. This means I may have
>> > > > 10,000 users who each create their own list of documents. This is a
>> > > > simple mapping of user to documents. The full data about the documents
>> > > > would be in the main catalog.
>> > > >
>> > > > What approaches would allow Solr to only return the results that are
>> > > > in the user's list?  It seems like I would need a couple of steps in
>> > > > the process. In other words, the main catalog has 3 documents: A, B
>> > > > and C. I have 2 users. User 1 has access to documents A and C but not
>> > > > B. User 2 has access to documents C and B but not A.
>> > > >
>> > > > When a user searches, I want to only return documents that the user
>> > > > has access to.
>> > >
>> > > A common approach for Solr would be to have a multivalued "user" field
>> > > on each document, which has individual values for each user that can
>> > > access the document.  When you index the document, you include values
>> > > in this field listing all the users that can access that document.
>> > >
>> > > Then you simply filter by user:
>> > >
>> > > fq=user:joe
>> > >
>> > > This is EXTREMELY efficient at query time, especially when the number of
>> > > users is much smaller than the number of documents.  It may complicate
>> > > indexing somewhat, but indexing is an extremely custom operation that
>> > > users have to write themselves, so it probably won't be horrible.
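The multivalued "user" field and the fq filter described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the document layout, the user names "user1" and "user2", and the helper function names are hypothetical, and nothing here talks to a real Solr instance. It only builds the request parameters and mimics locally which documents the filter would be expected to match for the A/B/C example.

```python
import json
from urllib.parse import urlencode

# Hypothetical documents mirroring the A/B/C example from the thread.
# The multivalued "user" field lists every user allowed to see the document.
DOCS = [
    {"id": "A", "user": ["user1"]},
    {"id": "B", "user": ["user2"]},
    {"id": "C", "user": ["user1", "user2"]},
]


def escape_term(value):
    """Escape backslashes and double quotes so the value is safe inside a quoted term."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def build_user_filtered_query(text, user, rows=10):
    """Return URL-encoded parameters for a search restricted to one user's documents."""
    params = {
        "q": text,
        # The filter query from the thread, with the user value quoted.
        "fq": 'user:"{}"'.format(escape_term(user)),
        "rows": rows,
    }
    return urlencode(params)


def visible_ids(docs, user):
    """Local stand-in for what the filter should match; no Solr call is made."""
    return [d["id"] for d in docs if user in d["user"]]


if __name__ == "__main__":
    print(json.dumps(DOCS))                           # payload one might index
    print(build_user_filtered_query("*:*", "user1"))  # parameters for the search request
    print(visible_ids(DOCS, "user1"))                 # documents user1 should see
```

Because the restriction lives in fq rather than q, it does not affect scoring, and Solr can cache the filter per user.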
>> >
>> > Things to consider:
>> >
>> >  * How often are documents assigned to new users?
>> >  * How many documents does a user typically have?
>> >  * Do you have a 'trigger' in your app that tells you a user has been
>> >    assigned a new doc?
>> >
>> > You can use a pseudo join to implement this sort of thing - have a
>> > different core that contains the 'permissions', either a document that
>> > says "this document ID is accessible via these users" or "this user is
>> > allowed to see these document IDs". This keeps your fast-moving
>> > (authorization) data separate from your slow-moving data (the docs
>> > themselves).
>> >
>> > You can then say "find me all documents that are accessible via user X"
>> >
>> > Upayavira
>> >
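The pseudo-join approach mentioned in the thread might look something like the sketch below, which uses Solr's join query parser as a filter. The core name "permissions" and the field names "doc_id", "id" and "user" are assumptions made for illustration and are not taken from the thread. The sketch assumes the first layout Upayavira describes: one permissions document per catalog document, listing the users who may see it.

```python
from urllib.parse import urlencode


def escape_term(value):
    """Escape backslashes and double quotes so the value is safe inside a quoted term."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def build_join_params(text, user, perm_core="permissions",
                      from_field="doc_id", to_field="id"):
    """Build parameters that restrict results to documents joined from a permissions core.

    All core and field names are hypothetical defaults.
    """
    # Run user:"<user>" against the permissions core, collect the from_field
    # values, and keep only main-core documents whose to_field matches one.
    fq = '{{!join from={f} to={t} fromIndex={c}}}user:"{u}"'.format(
        f=from_field, t=to_field, c=perm_core, u=escape_term(user)
    )
    return {"q": text, "fq": fq}


if __name__ == "__main__":
    params = build_join_params("laptop", "joe")
    print(params["fq"])
    print(urlencode(params))  # query string for the main catalog core
```

With this layout, assigning a document to a new user only touches the small permissions core, and the catalog documents are not reindexed.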