manifoldcf-user mailing list archives

From "Delapasse, Deanna" <ddelapa...@oceaneering.com>
Subject Re: Question about obtaining metadata values via CMIS connector => ElasticSearch
Date Wed, 06 May 2015 17:22:54 GMT
I will check into the Alfresco connector.  Thanks for your help!

Deanna




On Wed, May 6, 2015 at 11:19 AM, Karl Wright <daddywri@gmail.com> wrote:

> Here's the key finding:
>
> "Ok, the problem is because you only get to write the seeding query. The
> query that fetches individual documents is hardwired.  I believe it is set
> in opencmis in fact."
>
>
> So basically, for the CMIS connector, you aren't writing the query that
> finds the document data and metadata; you are writing the query that finds
> the set of documents to index.  And the query you *need* to modify is in
> fact baked into some jar in Apache Chemistry, which greatly limits the CMIS
> connector's utility for indexing metadata.
>
>
> Is there any way you can use one of the two native Alfresco connectors we supply?
>
>
> Karl
>
>
>
> On Wed, May 6, 2015 at 12:10 PM, Karl Wright <daddywri@gmail.com> wrote:
>
>> Hi Deanna,
>>
>> I vaguely recall that Apache Chemistry (which the CMIS connector relies
>> on) running against Alfresco has some limitations where metadata is
>> concerned.  I'm pretty sure there was an email exchange posted somewhere,
>> so you might be able to dig it up here:
>>
>> http://www.mail-archive.com/user@manifoldcf.apache.org/index.html
>>
>> I'll look around and see.
>>
>> The other potential problem is your ElasticSearch configuration.  I don't
>> know a lot about this myself.  I think it makes sense to try to figure out
>> on which end the problem lies; if you can see in some log what actually
>> gets posted to ElasticSearch for each document, that would help.
>>
>> Karl
>>
>>
>> On Wed, May 6, 2015 at 11:42 AM, Delapasse, Deanna <
>> ddelapasse@oceaneering.com> wrote:
>>
>>> Hi,
>>>
>>> I'm trying to use ManifoldCF to crawl my Alfresco repo (via the CMIS
>>> connector) and push the results into ElasticSearch.  My users want to
>>> search metadata (including custom) and content. I followed some tutorials
>>> and got it running quickly BUT...regardless of my ElasticSearch mapping the
>>> only CMIS metadata entity I can find in my indexed results is cmis:objectId.
>>>
>>> I have tried using various CMIS queries (with 'select * ...' and with
>>> 'select cmis:name, cmis:lastModifiedBy, ...').  I have verified my queries
>>> and they definitely return metadata, but the data doesn't appear in
>>> ElasticSearch.   I tried a simple attachment mapping and also a mapping
>>> where I specifically list some of the cmis properties. Regardless of
>>> mapping, my indexes look like this:
>>>
>>>
>>> {
>>>    "_index":"test",
>>>    "_type":"file",
>>>    "_id":"http://localhost:8080/alfresco/api/-default-/public/cmis/versions/1.1/atom/content/10.0.txt?id=2555a540-a5b3-4c27-90f6-c89b6742bd4f%3B1.0",
>>>    "_version":2,
>>>    "_score":1,
>>>    "_source":{
>>>       "cmis:objectId":"2555a540-a5b3-4c27-90f6-c89b6742bd4f;1.0",
>>>       "allow_token_document":"__nosecurity__",
>>>       "deny_token_document":"__nosecurity__",
>>>       "allow_token_share":"__nosecurity__",
>>>       "deny_token_share":"__nosecurity__",
>>>       "allow_token_parent":"__nosecurity__",
>>>       "deny_token_parent":"__nosecurity__",
>>>       "file":{
>>>          "_content_type":"text/plain",
>>>          "_name":"10.0.txt",
>>>          "_content":"DQpJIGFtIGFuIEFsZnJlc2NvIGZpbGUuDQo="
>>>       }
>>>    }
>>> }
>>>
>>> The ES results are good and I can search perfectly by content &
>>> cmis:objectId.  I have enabled debugging and no errors appear in the log.  *What
>>> do I have to DO to get cmis:name, cmis:lastModifiedBy and other properties
>>> to appear?*
>>>
>>> Thanks in advance!  This product is very simple to use and has potential
>>> to be a HUGE help to us!!!
>>>
>>> Deanna
>>>
>>
>>
>
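
To help work out on which end the metadata goes missing, one option is to
inspect an indexed document directly. The Python sketch below is only an
illustration, not part of ManifoldCF or ElasticSearch: it takes a trimmed copy
of the sample hit quoted in this thread, lists which cmis: properties actually
reached the index, reports which expected ones are absent, and decodes the
base64 file content. The function names (cmis_fields, missing_fields,
decoded_content) are hypothetical helpers introduced here.

```python
import base64

# Trimmed copy of the sample hit quoted in the thread.
hit = {
    "_index": "test",
    "_type": "file",
    "_source": {
        "cmis:objectId": "2555a540-a5b3-4c27-90f6-c89b6742bd4f;1.0",
        "allow_token_document": "__nosecurity__",
        "deny_token_document": "__nosecurity__",
        "file": {
            "_content_type": "text/plain",
            "_name": "10.0.txt",
            "_content": "DQpJIGFtIGFuIEFsZnJlc2NvIGZpbGUuDQo=",
        },
    },
}


def cmis_fields(hit):
    """Return the sorted names of CMIS properties present in _source."""
    return sorted(k for k in hit["_source"] if k.startswith("cmis:"))


def missing_fields(hit, expected):
    """Return the expected property names that are absent from _source."""
    present = set(hit["_source"])
    return [name for name in expected if name not in present]


def decoded_content(hit):
    """Decode the base64 attachment body stored under file._content."""
    raw = hit["_source"]["file"]["_content"]
    return base64.b64decode(raw).decode("utf-8")


print(cmis_fields(hit))
print(missing_fields(hit, ["cmis:objectId", "cmis:name", "cmis:lastModifiedBy"]))
print(repr(decoded_content(hit)))
```

If the expected properties are missing from _source, as they are in this
sample, the ElasticSearch mapping is unlikely to be the cause, since the
mapping only governs how fields that were posted get indexed.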
