manifoldcf-user mailing list archives

From: Maurizio Pillitu <mauri...@session.it>
Subject: Re: Question about obtaining metadata values via CMIS connector => ElasticSearch
Date: Mon, 11 May 2015 14:16:41 GMT

Hi Deanna,
sorry for the late reply.

The source code of the AMP can be found at
https://github.com/maoo/alfresco-indexer ; my first advice would be to
check if the new webscripts are accessible on Alfresco; you can access them via
http://localhost:8090/alfresco/service and "browse all webscripts".

If you find the Alfresco Indexer webscripts, you can try to invoke them
(for example,
http://localhost:8090/alfresco/service/node/changes/workspace/SpacesStore).

If this works, it means Alfresco Indexer is responding correctly, therefore
the issue lies on the Manifold side; as soon as you validate the mentioned
steps, we can move forward with the debugging.

Thanks,
  mao
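
The two checks above could be scripted roughly as follows. This is only a
minimal sketch: it assumes Alfresco is reachable at localhost:8090 as in the
URLs above, and the helper names (indexer_urls, probe) are made up for
illustration. Alfresco webscripts normally require authentication, so a 401
most likely means the webscript is deployed but needs credentials, while a
404 suggests it is not deployed at all.

```python
import urllib.error
import urllib.request

# Assumption: host and port are taken from the URLs quoted in this message.
BASE_URL = "http://localhost:8090/alfresco/service"


def indexer_urls(base_url=BASE_URL):
    """Return the webscript index URL and the 'changes' webscript URL."""
    base = base_url.rstrip("/")
    return [base, base + "/node/changes/workspace/SpacesStore"]


def probe(url, timeout=10):
    """Return the HTTP status code for url, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (e.g. 401 or 404).
        return err.code
    except (urllib.error.URLError, OSError):
        # No answer at all: wrong host or port, or Alfresco is not running.
        return None
```

Calling probe(url) for each entry of indexer_urls() and printing the result
gives a quick picture of whether the AMP's webscripts are reachable.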


On Mon, May 11, 2015 at 3:43 PM Karl Wright <daddywri@gmail.com> wrote:

> Hi Deanna,
>
> I have contacted the author of the plugin, who works for Alfresco.  In
> ManifoldCF we distribute only the AMP binary, so Maurizio would be the
> right guy to answer any source questions.
>
> Thanks,
> Karl
>
>
> On Mon, May 11, 2015 at 9:27 AM, Delapasse, Deanna <
> ddelapasse@oceaneering.com> wrote:
>
>> The Alfresco Webscripts connector requires an AMP installed into the
>> Alfresco server to provide the webscripts the connector calls.  The
>> connector's author pointed me to his GitHub source code, but it isn't
>> working for me as-is (installs ok, but the included webscripts aren't
>> accessible).  Are the AMP sources available from MCF?  And do you know the
>> last Alfresco version that anyone used it with? Possibly I will need to
>> tweak it to work with my Alfresco 4.2.f.
>>
>> thanks!
>> Deanna
>>
>>
>> On Wed, May 6, 2015 at 11:19 AM, Karl Wright <daddywri@gmail.com> wrote:
>>
>>> Here's the key finding:
>>>
>>> "Ok, the problem is because you only get to write the seeding query. The
>>> query that fetches individual documents is hardwired.  I believe it is set
>>> in opencmis in fact."
>>>
>>>
>>> So basically, for the CMIS connector, you aren't writing the query that
>>> finds the document data and metadata; you are writing the query that finds
>>> the set of documents to index.  And the query you *need* to modify is in
>>> fact baked into some jar in Apache Chemistry, which greatly limits the CMIS
>>> connector's utility for indexing metadata.
>>>
>>>
>>> Is there any way you can use one of the two native Alfresco connectors we
>>> supply?
>>>
>>>
>>> Karl
>>>
>>>
>>>
>>> On Wed, May 6, 2015 at 12:10 PM, Karl Wright <daddywri@gmail.com> wrote:
>>>
>>>> Hi Deanna,
>>>>
>>>> I vaguely recall that Apache Chemistry (which the CMIS connector relies
>>>> on) running against Alfresco has some limitations where metadata is
>>>> concerned.  I'm pretty sure there was an email exchange posted somewhere,
>>>> so you might be able to dig it up here:
>>>>
>>>> http://www.mail-archive.com/user@manifoldcf.apache.org/index.html
>>>>
>>>> I'll look around and see.
>>>>
>>>> The other potential problem is your ElasticSearch configuration.  I
>>>> don't know a lot about this myself.  I think it makes sense to try to
>>>> figure out on which end the problem lies; if you can see in some log what
>>>> actually gets posted to ElasticSearch for each document, that would help.
>>>>
>>>> Karl
>>>>
>>>>
>>>> On Wed, May 6, 2015 at 11:42 AM, Delapasse, Deanna <
>>>> ddelapasse@oceaneering.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I'm trying to use ManifoldCF to crawl my Alfresco repo (via the CMIS
>>>>> connector) and push the results into ElasticSearch.  My users want to
>>>>> search metadata (including custom) and content. I followed some tutorials
>>>>> and got it running quickly BUT...regardless of my ElasticSearch mapping
>>>>> the only CMIS metadata entity I can find in my indexed results is
>>>>> cmis:objectId.
>>>>>
>>>>> I have tried using various CMIS queries (with 'select * ...' and with
>>>>> 'select cmis:name, cmis:lastModifiedBy, ...').  I have verified my queries
>>>>> and they definitely return metadata, but the data doesn't appear in
>>>>> ElasticSearch.   I tried a simple attachment mapping and also a mapping
>>>>> where I specifically list some of the cmis properties. Regardless of
>>>>> mapping, my indexes look like this:
>>>>>
>>>>>
>>>>> {
>>>>>    "_index":"test",
>>>>>    "_type":"file",
>>>>>    "_id":"http://localhost:8080/alfresco/api/-default-/public/cmis/versions/1.1/atom/content/10.0.txt?id=2555a540-a5b3-4c27-90f6-c89b6742bd4f%3B1.0",
>>>>>    "_version":2,
>>>>>    "_score":1,
>>>>>    "_source":{
>>>>>       "cmis:objectId":"2555a540-a5b3-4c27-90f6-c89b6742bd4f;1.0",
>>>>>       "allow_token_document":"__nosecurity__",
>>>>>       "deny_token_document":"__nosecurity__",
>>>>>       "allow_token_share":"__nosecurity__",
>>>>>       "deny_token_share":"__nosecurity__",
>>>>>       "allow_token_parent":"__nosecurity__",
>>>>>       "deny_token_parent":"__nosecurity__",
>>>>>       "file":{
>>>>>          "_content_type":"text/plain",
>>>>>          "_name":"10.0.txt",
>>>>>          "_content":"DQpJIGFtIGFuIEFsZnJlc2NvIGZpbGUuDQo="
>>>>>       }
>>>>>    }
>>>>> }
>>>>>
>>>>> The ES results are good and I can search perfectly by content &
>>>>> cmis:objectId.  I have enabled debugging and no errors appear in the
>>>>> log.  *What do I have to DO to get cmis:name, cmis:lastModifiedBy and other
>>>>> properties to appear?*
>>>>>
>>>>> Thanks in advance!  This product is very simple to use and has
>>>>> potential to be a HUGE help to us!!!
>>>>>
>>>>> Deanna
>>>>>
>>>>
>>>>
>>>
>>
>
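
To help narrow down which end the problem lies on, the indexed document
quoted in the thread can be inspected offline. The following is a minimal
sketch, not part of ManifoldCF or ElasticSearch: the helper names are
hypothetical, and the dictionary is a trimmed copy of the "_source" object
from the original question. It lists which cmis: properties actually reached
the index, reports which expected ones are missing, and decodes the base64
attachment content.

```python
import base64

# Trimmed copy of the "_source" object quoted in the original question.
source = {
    "cmis:objectId": "2555a540-a5b3-4c27-90f6-c89b6742bd4f;1.0",
    "allow_token_document": "__nosecurity__",
    "deny_token_document": "__nosecurity__",
    "file": {
        "_content_type": "text/plain",
        "_name": "10.0.txt",
        "_content": "DQpJIGFtIGFuIEFsZnJlc2NvIGZpbGUuDQo=",
    },
}


def cmis_fields(doc_source):
    """Return the sorted names of all top-level fields that start with 'cmis:'."""
    return sorted(name for name in doc_source if name.startswith("cmis:"))


def missing_fields(doc_source, expected):
    """Return the expected field names that are absent from the document."""
    return [name for name in expected if name not in doc_source]


def attachment_text(doc_source, encoding="utf-8"):
    """Decode the base64 attachment body stored under file._content."""
    return base64.b64decode(doc_source["file"]["_content"]).decode(encoding)


print(cmis_fields(source))
print(missing_fields(source, ["cmis:objectId", "cmis:name", "cmis:lastModifiedBy"]))
print(repr(attachment_text(source)))
```

For the document above, only cmis:objectId is present, which matches the
behaviour described in the question: the file content arrives intact, but
the other CMIS properties are never sent to the index.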
