manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: Question about obtaining metadata values via CMIS connector => ElasticSearch
Date Wed, 06 May 2015 16:19:23 GMT
Here's the key finding:

"Ok, the problem is because you only get to write the seeding query. The
query that fetches individual documents is hardwired.  I believe it is set
in opencmis in fact."


So basically, for the CMIS connector, you aren't writing the query
that finds the document data and metadata; you are writing the query
that finds the set of documents to index.  And the query you *need* to
modify is in fact baked into some jar in Apache Chemistry, which
greatly limits the CMIS connector's utility for indexing metadata.
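
To make the distinction concrete, here is a toy model of the two phases.
It is not ManifoldCF or OpenCMIS code: the repository dict, the function
names, and the assumption that the fixed fetch returns only cmis:objectId
are all made up for illustration, chosen to match what Deanna reports
seeing in her index.

```python
# Toy model only. REPOSITORY, seed, fetch, crawl and FIXED_FETCH_FIELDS are
# hypothetical names, not part of ManifoldCF or Apache Chemistry.
REPOSITORY = {
    "doc-1": {
        "cmis:objectId": "doc-1",
        "cmis:name": "10.0.txt",
        "cmis:lastModifiedBy": "admin",
    },
    "doc-2": {
        "cmis:objectId": "doc-2",
        "cmis:name": "notes.txt",
        "cmis:lastModifiedBy": "deanna",
    },
}

# Assumption: stands in for the hardwired per-document fetch.
FIXED_FETCH_FIELDS = ("cmis:objectId",)


def seed(select_columns):
    """Phase 1: the user-written seeding query.

    Whatever columns the query selects, only the object IDs are carried
    forward to the next phase; the rest of each row is dropped.
    """
    ids = []
    for props in REPOSITORY.values():
        row = {col: props[col] for col in select_columns}  # query result row
        del row  # the row's columns are not used again
        ids.append(props["cmis:objectId"])
    return ids


def fetch(object_id):
    """Phase 2: the fixed fetch, which the user cannot edit."""
    props = REPOSITORY[object_id]
    return {field: props[field] for field in FIXED_FETCH_FIELDS}


def crawl(select_columns):
    """Run both phases and return the documents that would be indexed."""
    return [fetch(object_id) for object_id in seed(select_columns)]


narrow = crawl(["cmis:objectId"])
wide = crawl(["cmis:objectId", "cmis:name", "cmis:lastModifiedBy"])
print(narrow == wide)
```

In this model, widening the select list changes nothing downstream, because
the indexed fields are decided entirely by the second phase.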


Is there any way you can use one of the two native Alfresco
connectors we supply?


Karl



On Wed, May 6, 2015 at 12:10 PM, Karl Wright <daddywri@gmail.com> wrote:

> Hi Deanna,
>
> I vaguely recall that Apache Chemistry (which the CMIS connector relies
> on) running against Alfresco has some limitations where metadata is
> concerned.  I'm pretty sure there was an email exchange posted somewhere,
> so you might be able to dig it up here:
>
> http://www.mail-archive.com/user@manifoldcf.apache.org/index.html
>
> I'll look around and see.
>
> The other potential problem is your ElasticSearch configuration.  I don't
> know a lot about this myself.  I think it makes sense to try to figure out
> on which end the problem lies; if you can see in some log what actually
> gets posted to ElasticSearch for each document, that would help.
>
> Karl
>
>
> On Wed, May 6, 2015 at 11:42 AM, Delapasse, Deanna <
> ddelapasse@oceaneering.com> wrote:
>
>> Hi,
>>
>> I'm trying to use ManifoldCF to crawl my Alfresco repo (via the CMIS
>> connector) and push the results into ElasticSearch.  My users want to
>> search metadata (including custom) and content. I followed some tutorials
>> and got it running quickly BUT...regardless of my ElasticSearch mapping the
>> only CMIS metadata entity I can find in my indexed results is cmis:objectId.
>>
>> I have tried using various CMIS queries (with 'select * ...' and with
>> 'select cmis:name, cmis:lastModifiedBy, ...').  I have verified my queries
>> and they definitely return metadata, but the data doesn't appear in
>> ElasticSearch.   I tried a simple attachment mapping and also a mapping
>> where I specifically list some of the cmis properties. Regardless of
>> mapping, my indexes look like this:
>>
>>
>> {
>>    "_index":"test",
>>    "_type":"file",
>>    "_id":"http://localhost:8080/alfresco/api/-default-/public/cmis/versions/1.1/atom/content/10.0.txt?id=2555a540-a5b3-4c27-90f6-c89b6742bd4f%3B1.0",
>>    "_version":2,
>>    "_score":1,
>>    "_source":{
>>       "cmis:objectId":"2555a540-a5b3-4c27-90f6-c89b6742bd4f;1.0",
>>       "allow_token_document":"__nosecurity__",
>>       "deny_token_document":"__nosecurity__",
>>       "allow_token_share":"__nosecurity__",
>>       "deny_token_share":"__nosecurity__",
>>       "allow_token_parent":"__nosecurity__",
>>       "deny_token_parent":"__nosecurity__",
>>       "file":{
>>          "_content_type":"text/plain",
>>          "_name":"10.0.txt",
>>          "_content":"DQpJIGFtIGFuIEFsZnJlc2NvIGZpbGUuDQo="
>>       }
>>    }
>> }
>>
>> The ES results are good and I can search perfectly by content &
>> cmis:objectId.  I have enabled debugging and no errors appear in the log.  *What
>> do I have to DO to get cmis:name, cmis:lastModifiedBy and other properties
>> to appear?*
>>
>> Thanks in advance!  This product is very simple to use and has potential
>> to be a HUGE help to us!!!
>>
>> Deanna
>>
>
>
