manifoldcf-user mailing list archives

From Piergiorgio Lucidi <piergior...@apache.org>
Subject Re: Alfresco + ManifoldCF + ElasticSearch index mapping
Date Fri, 31 Jul 2015 14:13:31 GMT
Hi guys,

I have checked the code and can confirm that on the CMIS side the
connector correctly handles all the document properties. So the problem
is probably on the ElasticSearch side.
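
One quick way to see the symptom is to compare the field names in the mapping with the fields that actually arrive in an indexed document. The following is only a rough sketch: the helper name missing_fields is made up for illustration, and the sample data is copied from Francesco's message quoted below (the document content is elided there, so a placeholder is used).

```python
def missing_fields(mapped_fields, source):
    """Return the mapped field names that are absent from an indexed document."""
    return sorted(name for name in mapped_fields if name not in source)


# Custom field names from the mapping shown in the quoted message.
mapped = {"mcf:numeromittente", "mcf:pagine"}

# Shape of the document returned by _search in the quoted message.
source = {
    "file": {
        "_content_type": "text/plain",
        "_name": "v2.5.6_ReleaseNotes.txt",
        "_content": "....",  # placeholder, the real content is elided in the thread
    }
}

print(missing_fields(mapped, source))
```

If both mcf fields are reported as missing, the values never reached the index, which would mean the mapping itself is not the cause.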

I'm checking that code now and hope to have some news for you soon :-P
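
On Deanna's p.s. quoted below about detecting changed files, the difference between the two checks can be sketched roughly as follows. This is not the connector's actual code: the function names are hypothetical, and the dictionaries simply stand in for the standard CMIS properties cmis:versionLabel and cmis:lastModificationDate of a document at two crawl times.

```python
from datetime import datetime, timezone


def changed_by_version(previous, current):
    """Version-based check: misses edits that do not create a new version."""
    return previous["cmis:versionLabel"] != current["cmis:versionLabel"]


def changed_by_last_modified(previous, current):
    """Date-based check: also catches metadata-only edits."""
    return current["cmis:lastModificationDate"] > previous["cmis:lastModificationDate"]


# Illustrative values: a metadata edit that keeps the same version label.
previous = {
    "cmis:versionLabel": "1.0",
    "cmis:lastModificationDate": datetime(2015, 7, 30, 9, 0, tzinfo=timezone.utc),
}
current = {
    "cmis:versionLabel": "1.0",
    "cmis:lastModificationDate": datetime(2015, 7, 31, 9, 0, tzinfo=timezone.utc),
}

print(changed_by_version(previous, current))
print(changed_by_last_modified(previous, current))
```

With these sample values only the date-based check reports a change, which matches the situation Deanna describes.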

Regards,
Piergiorgio

2015-07-31 15:32 GMT+02:00 Delapasse, Deanna <ddelapasse@oceaneering.com>:

> My memory is a bit fuzzy, but I think this is accurate!
>
> Several months back I got a copy of the CMIS connector.  It worked, but
> the query results were restricted: even though you gave "select
> a,b,c,d...", it ignored all the fields except cmis:name and
> cmis:objectId, so none of the attributes actually made it to ES.  Maybe
> that has been fixed, but you should check!  Add some debugging to
> CmisRepositoryConnector to confirm.
>
> Install a tool called Elasticsearch Head so you can browse and see exactly
> what is being pushed into ES.  That was what helped me figure it out.
>
> Good luck!
> Deanna
>
> p.s.  I also made a change to the CMIS check for "changed" files.  Editing
> metadata doesn't always create a new version, so I modified it to check
> LastModifiedDate instead of comparing version numbers.
>
>
> On Fri, Jul 31, 2015 at 8:02 AM, Francesco Fornasari <f.fornasari@tai.it>
> wrote:
>
>>
>> Dear All,
>> we're new to ManifoldCF and ElasticSearch.
>>
>> We successfully configured the ManifoldCF + ElasticSearch + Alfresco
>> 5.0.1 architecture, but we aren't able to get metadata from the
>> ElasticSearch index.
>>
>> In detail:
>>
>> 1) we configured a CMIS query in ManifoldCF, "SELECT mcf:numeromittente,
>> mcf:pagine FROM mcf:fax", for the crawler job.
>> 2) the job runs correctly and finds documents.
>> 3) we configured the ElasticSearch index mapping,
>> http://manifoldcf-es.tainet:9200/index/generictype/_mapping :
>>
>> {
>>   "generictype" : {
>>     "properties" : {
>>       "mcf:numeromittente" : {
>>         "type" : "string"
>>       },
>>       "mcf:pagine" : {
>>         "type" : "integer"
>>       }
>>     }
>>   }
>> }
>>
>> 4) the http://manifoldcf-es.tainet:9200/index/generictype/_mapping API
>> returns JSON that includes all of these properties:
>>
>> {"properties":{"_content":{"type":"string"},"_content_type":{"type":"string"},"_name":{"type":"string"}}},"mcf:numeromittente":{"type":"string"},"mcf:pagine":{"type":"integer"}}}}
>>
>>
>> 5) we call the http://manifoldcf-es.tainet:9200/index/generictype/_search
>> API :
>>
>> {
>>     "query": {
>>         "query_string": {
>>             "query": "v2.5.6_ReleaseNotes.txt",
>>             "fields": []
>>         }
>>     }
>> }
>>
>> but the API returns JSON that doesn't contain the mcf property values:
>>
>> "file" : {"_content_type" : "text\/plain","_name" :
>> "v2.5.6_ReleaseNotes.txt", "_content" : "...." }
>>
>> So, could you explain how we can also include the mcf:numeromittente and
>> mcf:pagine properties in the ElasticSearch query result?
>>
>> Is there something wrong in our ManifoldCF configuration?
>>
>> Regards,
>> Francesco.
>>
>> --
>> Piergiorgio Lucidi
>> Open Source ECM Specialist
>> http://www.open4dev.com
>>
>
