manifoldcf-user mailing list archives

From Peter Sturge <peter.stu...@googlemail.com>
Subject Re: FW: Solr and LCF security at query time
Date Thu, 22 Apr 2010 08:27:18 GMT
Hi Karl,

Thanks for the quick turnaround.
I'm in the middle of a product release for us, so I fear I won't be as quick
as you... :-)

I couldn't find a simple flow diagram or similar for LCF with regard to
security (probably looking in the wrong place).
Perhaps you could help on these questions...?

In SOLR-1872, the allows and denies are stored (in acl.xml) as sub-queries,
which are then used as filter queries in a user's search.

Are the ACCESS_TOKEN and DENY_TOKEN values whatever has been stored for a
particular user in the underlying acl store (e.g. Active Directory)?
How do AD and/or LCF handle storing such data in their schemas? (Does AD
need its schema extended?)
Presumably, any such AD fields would need to be queried for effective rights
in order to cater for group membership allows and denies.

I guess I'm just trying to understand the architectural
flow/storage/retrieval of data in the various parts of the system, but I
admit, I need to do more research on this.
After our product release, when I get a few more spare cycles, I can look at
it in more detail.

Many thanks!
Peter



On Thu, Apr 22, 2010 at 1:02 AM, <karl.wright@nokia.com> wrote:

>  Hi Peter,
>
> I just committed the promised changes to the LCF Solr output connector.
>
> ACL metadata will now be posted to the Solr Http interface along with the
> document as the two following fields:
>
> __ACCESS_TOKEN__document
> __DENY_TOKEN__document
>
> There will, of course, potentially be multiple values for each of these two
> fields.
>
> Hope this helps,
> Karl
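
For illustration, a minimal sketch of what a Solr XML update message carrying these two fields could look like when built by a generic client. This is not the LCF Solr output connector's code. The `id` field, the token values, and the helper name `build_add_xml` are assumptions made for the example; only the two ACL field names come from the message above. Since each field can hold several values, the Solr schema would presumably need to declare both as multi-valued.

```python
import xml.etree.ElementTree as ET

# Field names as announced in the message above.
ALLOW_FIELD = "__ACCESS_TOKEN__document"
DENY_FIELD = "__DENY_TOKEN__document"


def build_add_xml(doc_id, allow_tokens, deny_tokens):
    """Build a Solr <add> message with one <field> element per ACL token.

    Hypothetical helper: the "id" field name is an assumption about the
    target schema, not something stated in the thread.
    """
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    ET.SubElement(doc, "field", name="id").text = doc_id
    # A multi-valued field is expressed by repeating the element.
    for token in allow_tokens:
        ET.SubElement(doc, "field", name=ALLOW_FIELD).text = token
    for token in deny_tokens:
        ET.SubElement(doc, "field", name=DENY_FIELD).text = token
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    # Made-up token values, for demonstration only.
    print(build_add_xml("doc-1", ["A1", "A2"], ["D1"]))
```

The resulting string could then be posted to the Solr update handler by whatever HTTP client is in use.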
>
>  ------------------------------
> *From:* ext Peter Sturge [mailto:peter.sturge@googlemail.com]
> *Sent:* Tuesday, April 20, 2010 6:51 PM
>
> *To:* connectors-user@incubator.apache.org
> *Subject:* Re: FW: Solr and LCF security at query time
>
> Hi Karl,
>
> Thanks for the info. I'll have a look at the link and try to take in as
> much sugar as my insulin levels will handle...
> It sounds like the necessary interface(s) are already in LCF - just a
> matter of implementing them in the Solr 1872 plugin.
> I'll need to digest the LCF stuff to get to grips with it... please bear with
> me while I do that...
>
> When you say:
>    The LCF solr output connection doesn't yet do this, but it is trivial
> for me to make that happen.
> Do you mean a mechanism by which solr.war can get url et al info from its
> parent container (Tomcat, Jetty etc.), or have I misinterpreted this?
>
>
> Thanks,
> Peter
>
>
>
>
> On Tue, Apr 20, 2010 at 11:05 PM, <karl.wright@nokia.com> wrote:
>
>>  Hi Peter,
>>
>> I'm the principal committer for LCF, but I don't know as much about Solr
>> as I ought to, so it sounds like a potentially productive collaboration.
>>
>> LCF does exactly what you are looking for - the only issue at all is that
>> you need to fetch a URL from a webapp to get what you are looking for.  The
>> "plugs" are all inside LCF for different kinds of repositories.  Here's a
>> link that might help with drinking the LCF "koolaid", as it were:
>> https://cwiki.apache.org/confluence/display/CONNECTORS/Lucene+Connectors+Framework+concepts
>>
>> The url would be something like this (on a locally installed tomcat-based
>> LCF instance):
>>
>>
>> http://localhost:8080/lcf-authority-service/UserACLs?username=someusername@somedomain.com
>>
>> ... and this fetch returns something like:
>>
>> TOKEN:xxxxxxx
>> TOKEN:yyyyyyy
>> TOKEN:zzzzzzz
>> ....
>>
>> ... which represent the amalgamated tokens for all of the defined
>> authorities, and by some strange coincidence ( ;-) ) are compatible
>> with certain pieces of metadata that have been passed into Solr with each
>> document - one set of Allow tokens, and a second set of Deny tokens.  The
>> LCF solr output connection doesn't yet do this, but it is trivial for me to
>> make that happen.
>>
>> Does this sound plausible to you?
>>
>> Karl
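
A minimal sketch of how a client might call the authority service described above and read back the tokens. It assumes the response is plain text with one `TOKEN:` line per access token, as in the example, and that it is UTF-8 encoded; lines that do not start with `TOKEN:` are ignored. The function names `parse_tokens` and `fetch_tokens` are hypothetical, and the URL is the example one from the message, not a guaranteed endpoint.

```python
from typing import List
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the example URL from the message; adjust for a real install.
AUTHORITY_URL = "http://localhost:8080/lcf-authority-service/UserACLs"


def parse_tokens(body: str) -> List[str]:
    """Return the values of all 'TOKEN:' lines, ignoring anything else."""
    tokens = []
    for line in body.splitlines():
        line = line.strip()
        if line.startswith("TOKEN:"):
            tokens.append(line[len("TOKEN:"):])
    return tokens


def fetch_tokens(username: str, base_url: str = AUTHORITY_URL) -> List[str]:
    """Fetch and parse the access tokens for one user (needs a running service)."""
    url = base_url + "?" + urlencode({"username": username})
    with urlopen(url, timeout=10) as response:
        # Assumption: the service answers in UTF-8.
        return parse_tokens(response.read().decode("utf-8"))
```

Keeping the parsing separate from the HTTP call makes it possible to check the parser without a running LCF instance.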
>>
>>
>>  ------------------------------
>>  *From:* ext Peter Sturge [mailto:peter.sturge@googlemail.com]
>> *Sent:* Tuesday, April 20, 2010 5:41 PM
>> *To:* connectors-user@incubator.apache.org; dev@lucene.apache.org
>>
>> *Subject:* Re: FW: Solr and LCF security at query time
>>
>>   Hi Karl,
>>
>> Integrating LCF to get external token support for SOLR-1872 sounds very
>> interesting indeed. I don't know anything about LCF, but one of the things I
>> was planning for SOLR-1872 is to make acl.xml (or rather its behaviour)
>> 'pluggable' - i.e. it would just be one of a series of plugins that could be
>> used for obtaining back-end authentication information.
>>
>> If you're good with LCF, perhaps we could work together to build this in.
>> One of the first things would be defining an interface that would be as easy
>> as possible to plug LCF into. Have you any suggestions/insight on this
>> front?
>>
>> Many thanks,
>> Peter
>>
>>
>>
>> On Tue, Apr 20, 2010 at 4:08 PM, <karl.wright@nokia.com> wrote:
>>
>>>  SOLR-1872 looks exactly like what I was envisioning, from the search
>>> query perspective, although instead of the acl xml file you specify, LCF
>>> stipulates that you would dynamically query the lcf-authority-service servlet for
>>> the access tokens themselves.  That would get you support for AD,
>>> Documentum, LiveLink, Meridio, and Memex for free. It seems likely that this
>>> component could be modified to work with LCF with minor effort.
>>>
>>> The missing component still seems to be AD authentication, which needs a
>>> solution.
>>>
>>> Karl
>>>
>>>  ------------------------------
>>> *From:* ext Peter Sturge [mailto:peter.sturge@googlemail.com]
>>> *Sent:* Tuesday, April 20, 2010 10:44 AM
>>> *To:* dev@lucene.apache.org
>>> *Subject:* Re: FW: Solr and LCF security at query time
>>>
>>>   If you want to do this completely within Solr, have a look at:
>>> SOLR-1834 and SOLR-1872. These use a SearchComponent plugin for Solr.
>>>
>>> Thanks,
>>> Peter
>>>
>>>
>>>
>>> On Tue, Apr 20, 2010 at 1:25 PM, <karl.wright@nokia.com> wrote:
>>>
>>>>  FYI
>>>>
>>>>  ------------------------------
>>>> *From:* Wright Karl (Nokia-S/Cambridge)
>>>> *Sent:* Tuesday, April 20, 2010 8:16 AM
>>>> *To:* 'dominique.bejean@eolya.fr'
>>>> *Cc:* 'solr-dev@apache.org'; 'connectors-dev@incubator.apache.org'; '
>>>> connectors-user@incubator.apache.org'
>>>> *Subject:* RE: Solr and LCF security at query time
>>>>
>>>>   Dominique,
>>>>
>>>> Yes, I am aware of this ticket and contribution.  Luckily LCF
>>>> establishes a powerful multi-repository security model, even though it
>>>> doesn't yet do the final step of enforcing that model at the search end.
>>>> LCF allows you to define multiple authorities to operate against disparate
>>>> repositories, and use the appropriate authority to secure any given
>>>> document.  The solr people are aware of this design, which addresses the
>>>> issues raised by SOLR-1834 very nicely.  However, as I said before, time is
>>>> a problem, and the work still needs to be done.
>>>>
>>>> I suggest you read up on the actual security model of LCF, and perhaps
>>>> experiment with that and the SOLR-1834 contribution, to see if there is
>>>> common ground.  One thing we've learned at MetaCarta is that post-filtering
>>>> for security purposes is expensive, and it is better to modify the queries
>>>> themselves to restrict the results, if possible.  I'm not sure which
>>>> approach SOLR-1834 takes, although it sounds like it might be the filtering
>>>> approach.  Still, it would be better than nothing.
>>>>
>>>> Please let me know what you find out.
>>>>
>>>> Thanks,
>>>> Karl
>>>>
>>>>  ------------------------------
>>>> *From:* ext Dominique Bejean [mailto:dominique.bejean@eolya.fr]
>>>> *Sent:* Tuesday, April 20, 2010 8:03 AM
>>>> *To:* Wright Karl (Nokia-S/Cambridge)
>>>> *Cc:* connectors-user@incubator.apache.org;
>>>> connectors-dev@incubator.apache.org
>>>> *Subject:* Re: Solr and LCF security at query time
>>>>
>>>> Karl,
>>>>
>>>> Thank you for your reply.
>>>>
>>>> I did some research today and found this:
>>>> http://freesurf001.appspot.com/issues.apache.org/jira/browse/SOLR-1834
>>>> http://demo.findwise.se:8880/SolrSecurity/
>>>>
>>>> The Solr security model has to be able to filter a result list with items
>>>> coming from various sources at the same time (livelink, documentum, file
>>>> system, ...). Big subject :)
>>>>
>>>> Dominique
>>>>
>>>>
>>>> On 20/04/10 13:34, karl.wright@nokia.com wrote:
>>>>
>>>> Hi Dominique,
>>>>
>>>> At the moment, in order to enforce the LCF security model within
>>>> Lucene/Solr, you will need to build this functionality into whatever client
>>>> you are using to display the Lucene search results.  Specifically, you would
>>>> need to take the following steps:
>>>>
>>>> (1) Have your users access your search client through Apache.
>>>> (2) Use the Apache module mod_auth_kerb, combined with LCF's
>>>> mod_authz_annotate, to cause authorization HTTP headers to be transmitted to
>>>> the client webapp.
>>>> (3) Have your client webapp alter whatever queries it is doing, to add
>>>> an appropriate query clause for each of the access tokens transmitted in the
>>>> headers.
>>>>
>>>> (This is how it is done at MetaCarta.)
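
Step (3) can be sketched roughly as follows. This is not MetaCarta's implementation: it assumes the user's tokens have already been extracted from the request headers, that documents carry the `__ACCESS_TOKEN__document` and `__DENY_TOKEN__document` fields announced on 22 April in this thread, and that Lucene query syntax is used for the restriction. The helper names are hypothetical, and the sketch hides any document that carries no allow token, which may or may not be the desired policy.

```python
def quote(token):
    """Wrap a token in double quotes, escaping backslashes and quotes."""
    escaped = token.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def build_filter_query(tokens,
                       allow_field="__ACCESS_TOKEN__document",
                       deny_field="__DENY_TOKEN__document"):
    """Build a restriction clause: at least one allow token must match,
    and no deny token may match."""
    if not tokens:
        # How to treat a user without tokens is a policy decision
        # that this sketch leaves to the caller.
        raise ValueError("no access tokens for this user")
    clause = " OR ".join(quote(t) for t in tokens)
    return "+{}:({}) -{}:({})".format(allow_field, clause, deny_field, clause)


if __name__ == "__main__":
    # Made-up token values, for demonstration only.
    print(build_filter_query(["a", "b"]))
```

Attaching the clause to the query itself, for example as a Solr filter query, restricts the results up front rather than post-filtering them.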
>>>>
>>>> Alternatively, you may find a way to do this completely with a web
>>>> application under a Java app server such as Tomcat.  I have not yet done the
>>>> research to find out whether this is a feasible alternative.  Effectively,
>>>> what you need something like mod_auth_kerb to do is to authenticate your
>>>> user against Active Directory, or whomever the authenticator ought to be.
>>>> JAAS may be helpful here.
>>>>
>>>> There are, of course, intentions to fill out the missing pieces more
>>>> completely and transparently via a Solr search plugin and/or filter.  What
>>>> has been lacking is time.  If you are in a position to do development in
>>>> this area, we're happy to have any assistance you might provide.
>>>>
>>>> Thanks,
>>>> Karl
>>>>  ------------------------------
>>>> *From:* ext Dominique Bejean [mailto:dominique.bejean@eolya.fr]
>>>>
>>>> *Sent:* Tuesday, April 20, 2010 5:06 AM
>>>> *To:* connectors-user@incubator.apache.org
>>>> *Subject:* Solr and LCF security at query time
>>>>
>>>> Hi,
>>>>
>>>> I don't see in the LCF wiki how Solr and LCF work together at query time in
>>>> order to remove from the result list the items the user is not allowed to
>>>> access.
>>>>
>>>> In
>>>> http://cwiki.apache.org/CONNECTORS/lucene-connectors-framework-concepts.html,
>>>> I just see these sentences:
>>>>
>>>> " Once all these documents and their access tokens are handed to the
>>>> search engine, it is the search engine's job to enforce security by
>>>> excluding inappropriate documents from the search results. For *Lucene*,
>>>> this infrastructure is expected to be built on top of Lucene's generic
>>>> metadata abilities, but has not been implemented at this time."
>>>>
>>>> I am not sure I understand. Does this mean that for the moment, it is
>>>> not possible for Solr to apply security by using an Authority Connector?
>>>>
>>>> Dominique
>>>>
>>>>
>>>
>>
>
