lucene-solr-user mailing list archives

From Mikhail Khludnev <mkhlud...@griddynamics.com>
Subject Re: Personalized Search Results or Matching Documents to Users
Date Sat, 01 Aug 2015 20:30:43 GMT
On Sat, Aug 1, 2015 at 9:45 PM, Upayavira <uv@odoko.co.uk> wrote:

> ticket?
>
https://issues.apache.org/jira/browse/SOLR-5944


>
> On Sat, Aug 1, 2015, at 02:02 PM, Erick Erickson wrote:
> > How soon? It's pretty much done AFAIK, but the folks trying to work on
> > it have had their priorities re-arranged.
> >
> > So I really don't have a date.
> >
> > Erick
> >
> > On Fri, Jul 31, 2015 at 4:59 PM, Upayavira <uv@odoko.co.uk> wrote:
> > > How soon? And will you be able to use them for querying, or just
> > > faceting/sorting/displaying?
> > >
> > > Thx!
> > >
> > > Upayavira
> > >
> > > On Fri, Jul 31, 2015, at 09:27 PM, Erick Erickson wrote:
> > >> And coming soon will be docvalues field updates that don't require
> > >> reindexing the whole doc.
> > >>
> > >> Best,
> > >> Erick
> > >> On Jul 31, 2015 6:51 AM, "Upayavira" <uv@odoko.co.uk> wrote:
> > >>
> > >> > On Thu, Jul 30, 2015, at 07:29 PM, Shawn Heisey wrote:
> > >> > > On 7/30/2015 10:46 AM, Robert Farrior wrote:
> > >> > > > We have a requirement to be able to have a master product catalog and to
> > >> > > > create a sub-catalog of products per user. This means I may have 10,000
> > >> > > > users who each create their own list of documents. This is a simple mapping
> > >> > > > of user to documents. The full data about the documents would be in the main
> > >> > > > catalog.
> > >> > > >
> > >> > > > What approaches would allow Solr to only return the results that are in the
> > >> > > > user's list?  It seems like I would need a couple of steps in the process.
> > >> > > > In other words, the main catalog has 3 documents: A, B and C. I have 2
> > >> > > > users. User 1 has access to documents A and C but not B. User 2 has access
> > >> > > > to documents C and B but not A.
> > >> > > >
> > >> > > > When a user searches, I want to only return documents that the user has
> > >> > > > access to.
> > >> > >
> > >> > > A common approach for Solr would be to have a multivalued "user" field
> > >> > > on each document, which has individual values for each user that can
> > >> > > access the document.  When you index the document, you include values
> > >> > > in this field listing all the users that can access that document.
> > >> > >
> > >> > > Then you simply filter by user:
> > >> > >
> > >> > > fq=user:joe
> > >> > >
> > >> > > This is EXTREMELY efficient at query time, especially when the number of
> > >> > > users is much smaller than the number of documents.  It may complicate
> > >> > > indexing somewhat, but indexing is an extremely custom operation that
> > >> > > users have to write themselves, so it probably won't be horrible.
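
A rough sketch of the multivalued "user" field approach described above, using the A/B/C example from the original question. The user ids (`user1`, `user2`), the sample search term, and the idea of sending the JSON to the core's update handler are assumptions for illustration only; the "user" field is assumed to be declared multivalued in the schema.

```python
import json
from urllib.parse import urlencode

# Documents from the thread's example: user1 may see A and C, user2 may see C and B.
# Each document lists every user allowed to see it in the multivalued "user" field.
docs = [
    {"id": "A", "user": ["user1"]},
    {"id": "B", "user": ["user2"]},
    {"id": "C", "user": ["user1", "user2"]},
]
update_body = json.dumps(docs)  # JSON body one could send when indexing


def user_filter(user_id):
    """Build the fq value, quoting the id so it is treated as a single term."""
    escaped = user_id.replace("\\", "\\\\").replace('"', '\\"')
    return f'user:"{escaped}"'


def select_params(query, user_id):
    """URL-encode the main query plus the per-user filter query."""
    return urlencode([("q", query), ("fq", user_filter(user_id))])


query_string = select_params("laptop", "user1")
print(query_string)
```

Keeping the user restriction in `fq` rather than in `q` means it does not affect scoring, and Solr can cache the filter separately from the main query.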
> > >> >
> > >> > Things to consider:
> > >> >
> > >> >  * How often are documents assigned to new users?
> > >> >  * How many documents does a user typically have?
> > >> >  * Do you have a 'trigger' in your app that tells you a user has been
> > >> >    assigned a new doc?
> > >> >
> > >> > You can use a pseudo join to implement this sort of thing - have a
> > >> > different core that contains the 'permissions', either a document that
> > >> > says "this document ID is accessible via these users" or "this user is
> > >> > allowed to see these document IDs". You are keeping your fast moving
> > >> > (authorization) data separate from your slow moving (the docs
> > >> > themselves) data.
> > >> >
> > >> > You can then say "find me all documents that are accessible via user X"
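
A minimal sketch of how the pseudo join described above might be expressed as a filter query with Solr's join query parser. The core name `permissions` and the field names `doc_id` and `user` are hypothetical; the sketch assumes one small permission record per (user, document) pair in that core, and that the query itself is sent to the main catalog core.

```python
from urllib.parse import urlencode


def join_filter(user_id, from_index="permissions", from_field="doc_id", to_field="id"):
    """Build a join filter that keeps catalog documents whose id appears in
    the given user's permission records. Core and field names are hypothetical."""
    escaped = user_id.replace("\\", "\\\\").replace('"', '\\"')
    local_params = f"{{!join from={from_field} to={to_field} fromIndex={from_index}}}"
    return f'{local_params}user:"{escaped}"'


fq = join_filter("X")
query_string = urlencode([("q", "laptop"), ("fq", fq)])
print(fq)
```

With this layout, granting or revoking access only touches the small permission records, while the catalog documents themselves stay untouched.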
> > >> >
> > >> > Upayavira
> > >> >
>



-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

<http://www.griddynamics.com>
<mkhludnev@griddynamics.com>
