manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: metadata problem for subsite libraries
Date Wed, 12 Mar 2014 13:21:06 GMT
Hi Ahmet,

I misspoke; the rules for metadata pay attention only to a path.

The only way we can make progress here is to do some debugging.  In your
trunk checkout, have a look at SharePointRepository.java starting at line
993:

>>>>>>
            // == Document path ==
            // Convert the modified document path to an unmodified one, plus a library path.
            String decodedLibPath = documentIdentifier.substring(0,dLibSeparatorIndex);
            String decodedDocumentPath = decodedLibPath + documentIdentifier.substring(dLibSeparatorIndex+1);
            if (checkIncludeFile(decodedDocumentPath,spec))
            {
              // This file is included, so calculate a version string.  This will include metadata info, so get that first.
              MetadataInformation metadataInfo = getMetadataSpecification(decodedDocumentPath,spec);

<<<<<<

The class MetadataInformation describes the metadata that will be included
given the document path.  Later, at line 1023, specified fields that are
also part of the library the document is in are found:

>>>>>>
                String[] sortedMetadataFields = getInterestingFieldSetSorted(metadataInfo,libFields);
<<<<<<
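
As a rough mental model only (this is not the connector's actual code), that
step can be thought of as intersecting the fields your rules specify with the
fields the library really has, then sorting the result.  The class and method
names below are hypothetical, and the sketch ignores any special cases the
real method may handle:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration; NOT the ManifoldCF implementation.
class InterestingFieldsSketch {
  // Keep only the specified fields that also exist in the library, in sorted order.
  static String[] interestingFieldsSorted(Set<String> specifiedFields, Set<String> libraryFields) {
    TreeSet<String> result = new TreeSet<>(specifiedFields); // TreeSet keeps natural (sorted) order
    result.retainAll(libraryFields);                         // drop fields the library does not have
    return result.toArray(new String[0]);
  }

  public static void main(String[] args) {
    Set<String> specified = new HashSet<>(Arrays.asList("Title", "Author", "NotInLibrary"));
    Set<String> library = new HashSet<>(Arrays.asList("Author", "Title", "Created"));
    System.out.println(Arrays.toString(interestingFieldsSorted(specified, library)));
  }
}
```

Under this model, an empty result means either that no rule matched the
document path or that the library's field list did not contain the selected
names.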

I suggest modifying the connector to print the contents of
sortedMetadataFields for each document that comes along.  You will need to
do whatever is necessary to force a recrawl of just those documents whose
metadata you are not getting.  If sortedMetadataFields does not contain the
fields you expect, that means that there is something wrong with how the
rules are being interpreted, or in how the fields for the library are being
discovered.  If it contains the right fields, then the problem must be in
how the field names are getting packed and unpacked from the version
string.  Either way, please let me know.
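
A minimal sketch of the kind of debug output I mean is below.  The helper
class and its name are made up for illustration; in the connector you would
call something like it right after sortedMetadataFields is computed and send
the string to whatever log or stream you prefer (System.err is used here only
to keep the example self-contained):

```java
import java.util.Arrays;

// Hypothetical debugging helper; not part of ManifoldCF.
class MetadataFieldDebug {
  // Build one log line showing the document path and the fields chosen for it.
  static String describe(String documentPath, String[] sortedMetadataFields) {
    String fields = (sortedMetadataFields == null)
        ? "<null>"
        : Arrays.toString(sortedMetadataFields);
    return "SharePoint debug: path='" + documentPath + "' sortedMetadataFields=" + fields;
  }

  public static void main(String[] args) {
    // Example paths and field names are invented for the demonstration.
    System.err.println(describe("/site1/site2/Documents/report.docx",
        new String[] {"Author", "Title"}));
    System.err.println(describe("/site1/site2/Documents/empty.docx", new String[0]));
  }
}
```

An empty list ("[]") for a subsite document would point at the rules or the
library field discovery; a populated list would point at the version string
handling.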

Karl



On Wed, Mar 12, 2014 at 9:10 AM, Ahmet Arslan <iorixxx@yahoo.com> wrote:

> Hi Karl,
>
> I am sorry but I don't follow. I assume, in my config, Paths/PathRule is
> correct since it fetches documents (with no metadata).
>
> In meta data section, there is no place for 'entity type'.
>
> Can you please elaborate?
>
> Thanks,
> Ahmet
>
> On Wednesday, March 12, 2014 2:57 PM, Karl Wright <daddywri@gmail.com>
> wrote:
>
> To clarify: Rules you define must match both the entity type (e.g. site,
> list, lib, or document), as well as the path.  So the example you provided,
> since it does not specify the entity type, is incomplete.
>
> Karl
>
>
>
>
>
> On Wed, Mar 12, 2014 at 8:44 AM, Karl Wright <daddywri@gmail.com> wrote:
>
> Hi Ahmet,
> >
> >All I can remember about this coming up before involved people not having
> appropriate metadata rules.  So if you include a screen shot of your
> metadata rules, that ought to help clarify what is happening.
> >
> >FWIW, metadata for a library will require you to have an explicit
> matching library rule on your metadata tab.  Since this is a subsite, you
> will also need a site rule.
> >
> >Thanks,
> >Karl
> >
> >
> >
> >
> >
> >On Wed, Mar 12, 2014 at 8:35 AM, Ahmet Arslan <iorixxx@yahoo.com> wrote:
> >
> >Hi,
> >>
> >>I am connecting to a SharePoint 2010 instance with both trunk and
> ManifoldCF 1.5.1.
> >>
> >>When I define a job to crawl a document library by "add site", no
> metadata is sent to the output connector. I can see the list of metadata
> fields and select them, but only GUID is sent (although I did not select
> GUID, nor is it in the list). Documents are indexed, but without metadata.
> >>
> >>There is no metadata problem with Lists.
> >>
> >>
> >>'Document Library' Example
> >>/site1/site2/Documents/* does not honour selected MetaData.
> >>/Documents/* honours selected MetaData.
> >>
> >>I think someone has reported a similar problem (for a document library
> under a {sub}site) in the past, but I couldn't find the e-mail or JIRA.
> >>
> >>Thanks,
> >>Ahmet
> >>
> >
>
