manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: Sharepoint SID extraction for groups
Date Sun, 17 Nov 2013 16:54:53 GMT
Hi Will,

I looked at the pivot exception, but it seems like it *is* detecting that
it should retry the transaction, and is indeed retrying.  This is expected
behavior.  You did not include the beginning of the message; if it was
DEBUG or WARNING I would be comfortable that it was doing the right thing.
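
For readers unfamiliar with the retry pattern mentioned above, here is a minimal Java sketch of it. It is not ManifoldCF's actual code: the class `SerializationRetry`, the `SqlWork` interface, and the linear backoff are all hypothetical names and choices made for illustration. It assumes the JDBC driver reports PostgreSQL serialization failures with SQLSTATE 40001.

```java
import java.sql.SQLException;

// Hypothetical helper (not part of ManifoldCF): retries a unit of database
// work when the database reports a serialization failure.
final class SerializationRetry {
    // SQLSTATE for "could not serialize access" errors (assumed to be what
    // the driver reports for the pivot error quoted in this thread).
    static final String SERIALIZATION_FAILURE = "40001";

    // A unit of database work that may fail with SQLException.
    interface SqlWork<T> {
        T run() throws SQLException;
    }

    // Runs the work, retrying only on serialization failures, up to
    // maxAttempts attempts in total. Any other error is rethrown at once.
    static <T> T withRetry(SqlWork<T> work, int maxAttempts, long baseSleepMs)
            throws SQLException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return work.run();
            } catch (SQLException e) {
                boolean retryable = SERIALIZATION_FAILURE.equals(e.getSQLState());
                if (!retryable || attempt >= maxAttempts) {
                    throw e;
                }
                // Simple linear backoff before the next attempt.
                Thread.sleep(baseSleepMs * attempt);
            }
        }
    }
}
```

In a real caller, the work passed to `withRetry` would need to roll back and begin a fresh transaction on each attempt; the sketch only shows the control flow.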

Somewhere else, though, there may well be an actual database ERROR that is
causing the system to get hung.  This will show up as an ERROR in the log,
and when you do a thread dump, all the worker threads will be waiting on
something in WorkerResetManager.  Could you include more of the log so that
I can have a look at this?
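
One rough way to pull candidate lines out of a large log is sketched below. It assumes each log entry starts with its level, as in the WARN lines quoted further down; the class `LogErrorFilter` is a hypothetical name, and wrapped continuation lines are not handled.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: keeps only log lines whose leading token is ERROR.
final class LogErrorFilter {
    static List<String> errorLines(List<String> lines) {
        return lines.stream()
                // Levels may be left-padded (" WARN"), so trim before testing.
                .filter(line -> line.trim().startsWith("ERROR "))
                .collect(Collectors.toList());
    }
}
```

The lines it returns, together with a few lines of context around each, are the part of the log worth posting.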

Thanks,
Karl



On Sat, Nov 16, 2013 at 2:06 PM, Karl Wright <daddywri@gmail.com> wrote:

> Hi Will,
> The long running query is not fatal - it is just a warning.
>
> The very long SID list requires a SharePoint authority, as discussed.
>
> The pivot error sounds like it is something that can be addressed
> though.  Please create a ticket and put the full exception into it,
> and I will look at it either tomorrow or Monday.
>
> Thanks,
> Karl
>
> Sent from my Windows Phone
>
> -----Original Message-----
> From: Will Parkinson
> Sent: 11/16/2013 10:10 AM
> To: user@manifoldcf.apache.org
> Subject: Re: Sharepoint SID extraction for groups
>
> Hi Karl,
>
>
> Yeah, that seems to be the case.  To get ManifoldCF to work in my case, I
> just created a separate class to obtain all the user SIDs directly
> from AD if the group assigned in SharePoint is an AD group.  This
> seems to work fine for now, but it seems to be causing a few database
> issues.
>
> First of all, some of the SID lists are up to 1.5 MB, which seems to be
> causing the carrydown table to become huge.  I am also getting errors
> like:
>
> 1C159E0: ERROR: could not serialize access due to read/write dependencies among transactions
>   Detail: Reason code: Canceled on identification as a pivot, during conflict in checking.
>   Hint: The transaction might succeed if retried.; sleeping for 56816 ms
> org.apache.manifoldcf.core.interfaces.ManifoldCFException: ERROR: could not serialize access due to read/write dependencies among transactions
>   Detail: Reason code: Canceled on identification as a pivot, during conflict in checking.
>   Hint: The transaction might succeed if retried.
>         at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.reinterpretException(DBInterfacePostgreSQL.java:622)
>         at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performModification(DBInterfacePostgreSQL.java:651)
>         at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performInsert(DBInterfacePostgreSQL.java:187)
>         at org.apache.manifoldcf.core.database.BaseTable.performInsert(BaseTable.java:68)
>         at org.apache.manifoldcf.crawler.jobs.Carrydown.recordCarrydownDataMultiple(Carrydown.java:343)
>         at org.apache.manifoldcf.crawler.jobs.JobManager.addDocuments(JobManager.java:4174)
>         at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.processDocumentReferences(WorkerThread.java:2017)
>         at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.flush(WorkerThread.java:1948)
>         at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:562)
> Caused by: org.postgresql.util.PSQLException: ERROR: could not serialize access due to read/write dependencies among transactions
>   Detail: Reason code: Canceled on identification as a pivot, during conflict in checking.
>   Hint: The transaction might succeed if retried.
>         at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2102)
>         at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1835)
>         at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
>         at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:500)
>         at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:388)
>         at org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:334)
>         at org.apache.manifoldcf.core.database.Database.execute(Database.java:883)
>         at org.apache.manifoldcf.core.database.Database$ExecuteQueryThread.run(Database.java:683)
>
> And then I eventually get an error like this:
>
>  WARN 2013-11-17 00:41:09,058 (Finisher thread) - Found a long-running query (77260 ms): [SELECT id FROM jobs WHERE status IN (?,?,?,?,?) FOR UPDATE]
>  WARN 2013-11-17 00:41:09,059 (Finisher thread) -   Parameter 0: 'A'
>  WARN 2013-11-17 00:41:09,059 (Finisher thread) -   Parameter 1: 'W'
>  WARN 2013-11-17 00:41:09,059 (Finisher thread) -   Parameter 2: 'R'
>  WARN 2013-11-17 00:41:09,059 (Finisher thread) -   Parameter 3: 'O'
>  WARN 2013-11-17 00:41:09,059 (Finisher thread) -   Parameter 4: 'U'
>  WARN 2013-11-17 00:41:09,060 (Finisher thread) -  Plan: LockRows (cost=0.00..3.34 rows=5 width=14) (actual time=0.026..0.027 rows=1 loops=1)
>  WARN 2013-11-17 00:41:09,060 (Finisher thread) -  Plan:   ->  Seq Scan on jobs  (cost=0.00..3.29 rows=5 width=14) (actual time=0.024..0.024 rows=1 loops=1)
>  WARN 2013-11-17 00:41:09,060 (Finisher thread) -  Plan:         Filter: (status = ANY ('{A,W,R,O,U}'::bpchar[]))
>  WARN 2013-11-17 00:41:09,060 (Finisher thread) -  Plan:         Rows Removed by Filter: 17
>  WARN 2013-11-17 00:41:09,060 (Finisher thread) -  Plan: Total runtime: 0.058 ms
>  WARN 2013-11-17 00:41:09,060 (Finisher thread) -
>
> And then the update stops completely, even though the status on the
> "Status and job management" page is still set as "running".  Do you
> have any ideas on how I can fix this?
>
> I am doing some research at the moment on the best way to store
> permissions information without storing hundreds of SIDs.
>
> Cheers,
>
> Will
>
> On Wed, Nov 6, 2013 at 11:42 PM, Karl Wright <daddywri@gmail.com> wrote:
>
>
> I should also add that, as far as ActiveDirectory groups go, my
> understanding is that in non-Claim-Space versions of SharePoint,
> there's a SharePoint group created for each AD group.  So a SharePoint
> user will belong to some native SharePoint groups, but also to some
> "mirrored" SharePoint groups that are created because of the user's
> group relationships in AD.
>
> Claim Space seems to change this in the following way: SharePoint
> groups no longer mirror AD groups.  Instead, the Claim Space
> authorization tokens implicitly describe the relationships.  So you
> have to talk to both SharePoint AND AD in order to fully understand
> what documents in SharePoint are authorized for what users.
>
> Karl
>
> On Wed, Nov 6, 2013 at 8:37 AM, Karl Wright <daddywri@gmail.com> wrote:
>
> Hi Will,
>
>
>
> The current connector indeed maps SharePoint groups to individual user
> SIDs.  That is not terribly scalable, and it is one reason why I've
> created dedicated SharePoint authorities in the CONNECTORS-754-2
> branch, so that we can authorize documents at a group level.
>
>
> I've also done considerable research on the ClaimSpace security model.
>  Supporting it fully has required some modifications to the basic
> authorization model that ManifoldCF uses to tie documents to
> authorities.  This basic work is done and is now part of trunk as
> well.  And the documentation has been updated to describe the revised
> authorization model.
>
> If you want to try working with the CONNECTORS-754-2 branch, I'd be
> very happy to interact with you to iron out any problems.  What you
> will need to do if you try it out is the following:
>
> (1) Create an authority group for your SharePoint instance
> (2) Create a "SharePoint/Native" authority tied to that authority group
> (3) If this is a claim-space SharePoint instance, then also create a
> "SharePoint/Active Directory" authority tied to the same authority
> group
> (4) Create your SharePoint repository connection, making sure to
> select "Native" mode
>
> The implementation is currently the best I can do in the absence of a
> full-blown Claim Space instance.  Even so, there are still questions
> in my mind that, if I could solve them, would help clarify the
> implementation.  For example, what do "Role Definitions" do?  Are they
> essentially just another form of group?  And is it better to use the
> name of a user, group, or role definition for an access token, or
> the ID?  Perhaps you can clarify a bit; I don't know...
>
>
> Thanks,
> Karl
>
> On Wed, Nov 6, 2013 at 8:14 AM, Will Parkinson <parkinson.will@gmail.com>
> wrote:
>
> Hello,
>
>
> I am just wondering how the extraction of group permissions works
> for the SharePoint connector.  From what I can see, it seems that the
> group is determined via the MCPermissions.asmx web service, and then
> each user in that group is iterated over and the SIDs for those users
> are extracted.
>
> Is this the case?  If so, are groups created in SharePoint and AD
> groups treated the same way?
>
> Cheers,
>
> Will
>
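
The difference between per-user SID expansion and the group-level authorization described for the CONNECTORS-754-2 branch can be sketched in Java as follows. This is a simplified illustration, not ManifoldCF's API: the class `TokenCheck` and the token strings used with it are hypothetical, and deny tokens are ignored. The idea is that a document carries a single group token instead of one SID per group member, while the authority resolves each user to a token set that includes the user's groups.

```java
import java.util.Collections;
import java.util.Set;

// Hypothetical illustration of token-based authorization.
final class TokenCheck {
    // A user may see a document if at least one of the user's tokens
    // (own identity plus group memberships) is among the document's
    // allow tokens.
    static boolean isAuthorized(Set<String> documentAllowTokens, Set<String> userTokens) {
        return !Collections.disjoint(documentAllowTokens, userTokens);
    }
}
```

With this shape, a document shared with a large AD group stores one token regardless of the group's size, so the per-document data no longer grows with group membership.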
