manifoldcf-user mailing list archives

From "Delapasse, Deanna" <ddelapa...@oceaneering.com>
Subject Re: Question about obtaining metadata values via CMIS connector => ElasticSearch
Date Tue, 12 May 2015 02:22:24 GMT
Mao,  Sorry for the delay!  Today just did NOT go as planned :-(.  Happy to
supply anything else that might help.  The gist is that I am able to invoke
http://localhost:8080/alfresco/service/auth/resolve/admin manually (it
has me log in and then returns credentials), but it seems like Manifold is
unable to reach Alfresco successfully.

---------------------------------------------------------- status
--------------------------------------------------------------------------------------
Installed the AMP and verified that it is installed.
http://localhost:8080/alfresco/service IS returning the maoo namespace
methods! (Response pasted at end of email.)

I created the repo connection.  Selected Connection type: "Alfresco
Webscript" and no authority group.  On the server page I left the defaults:
http
localhost
8080
/alfresco/service
workspace
SpacesStore  <=== I tried leaving this and also adding SpacesStore/nodeID,
but it didn't help.
and then user/password

But as soon as I clicked save I saw these errors in the command window:

Starting crawler...
============
http
localhost
8080
/alfresco/service
workspace
SpacesStore
admin
XXXXX
============
[qtp52962163-616] WARN org.eclipse.jetty.servlet.ServletHandler -
org.apache.jasper.JasperException: An exception occurred processing JSP page /execute.jsp at line 169
166: connManager.save(connection);
167: variableContext.setParameter("connname",connectionName);
168: %>
169:                                            <jsp:forward page="viewconnection.jsp"/>
170: <%
171:                                    }
172:                            }

Stacktrace:
        at org.apache.jasper.servlet.JspServletWrapper.handleJspException(JspServletWrapper.java:521)
        at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:412)
        at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:313)
        at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:260)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
        at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:769)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1125)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1059)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
        at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
        at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
        at org.eclipse.jetty.server.Server.handle(Server.java:497)
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:311)
        at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:248)
        at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:610)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:539)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.jasper.JasperException: An exception occurred processing JSP page /viewconnection.jsp at line 121


==============manifoldCF log
DEBUG 2015-05-11 21:04:22,603 (qtp380224087-322) - Opening connection {}->http://localhost:8080
DEBUG 2015-05-11 21:04:22,605 (qtp380224087-322) - Connecting to localhost/127.0.0.1:8080
DEBUG 2015-05-11 21:04:22,607 (qtp380224087-322) - Connection established 127.0.0.1:60824<->127.0.0.1:8080
DEBUG 2015-05-11 21:04:22,607 (qtp380224087-322) - Executing request GET /alfresco/service/api/node/auth/resolve/admin HTTP/1.1
DEBUG 2015-05-11 21:04:22,607 (qtp380224087-322) - Proxy auth state: UNCHALLENGED
DEBUG 2015-05-11 21:04:22,608 (qtp380224087-322) - http-outgoing-0 >> GET /alfresco/service/api/node/auth/resolve/admin HTTP/1.1
DEBUG 2015-05-11 21:04:22,608 (qtp380224087-322) - http-outgoing-0 >> Accept: application/json
DEBUG 2015-05-11 21:04:22,608 (qtp380224087-322) - http-outgoing-0 >> Authorization: Basic YWRtaW46YWRtaW4=
DEBUG 2015-05-11 21:04:22,608 (qtp380224087-322) - http-outgoing-0 >> Host: localhost:8080
DEBUG 2015-05-11 21:04:22,608 (qtp380224087-322) - http-outgoing-0 >> Connection: Keep-Alive
DEBUG 2015-05-11 21:04:22,608 (qtp380224087-322) - http-outgoing-0 >> User-Agent: Apache-HttpClient/4.3.5 (java 1.5)
DEBUG 2015-05-11 21:04:22,608 (qtp380224087-322) - http-outgoing-0 >> Accept-Encoding: gzip,deflate
DEBUG 2015-05-11 21:04:22,608 (qtp380224087-322) - http-outgoing-0 >> "GET /alfresco/service/api/node/auth/resolve/admin HTTP/1.1[\r][\n]"
DEBUG 2015-05-11 21:04:22,608 (qtp380224087-322) - http-outgoing-0 >> "Accept: application/json[\r][\n]"
DEBUG 2015-05-11 21:04:22,608 (qtp380224087-322) - http-outgoing-0 >> "Authorization: Basic YWRtaW46YWRtaW4=[\r][\n]"
DEBUG 2015-05-11 21:04:22,608 (qtp380224087-322) - http-outgoing-0 >> "Host: localhost:8080[\r][\n]"
DEBUG 2015-05-11 21:04:22,608 (qtp380224087-322) - http-outgoing-0 >> "Connection: Keep-Alive[\r][\n]"
DEBUG 2015-05-11 21:04:22,608 (qtp380224087-322) - http-outgoing-0 >> "User-Agent: Apache-HttpClient/4.3.5 (java 1.5)[\r][\n]"
DEBUG 2015-05-11 21:04:22,608 (qtp380224087-322) - http-outgoing-0 >> "Accept-Encoding: gzip,deflate[\r][\n]"
DEBUG 2015-05-11 21:04:22,608 (qtp380224087-322) - http-outgoing-0 >> "[\r][\n]"
DEBUG 2015-05-11 21:04:22,661 (qtp380224087-322) - http-outgoing-0 << "HTTP/1.1 404 Not Found[\r][\n]"
DEBUG 2015-05-11 21:04:22,661 (qtp380224087-322) - http-outgoing-0 << "Server: Apache-Coyote/1.1[\r][\n]"
DEBUG 2015-05-11 21:04:22,661 (qtp380224087-322) - http-outgoing-0 << "Cache-Control: no-cache[\r][\n]"
DEBUG 2015-05-11 21:04:22,661 (qtp380224087-322) - http-outgoing-0 << "Expires: Thu, 01 Jan 1970 00:00:00 GMT[\r][\n]"
DEBUG 2015-05-11 21:04:22,661 (qtp380224087-322) - http-outgoing-0 << "Pragma: no-cache[\r][\n]"
DEBUG 2015-05-11 21:04:22,661 (qtp380224087-322) - http-outgoing-0 << "Content-Type: text/html;charset=UTF-8[\r][\n]"
DEBUG 2015-05-11 21:04:22,661 (qtp380224087-322) - http-outgoing-0 << "Transfer-Encoding: chunked[\r][\n]"
DEBUG 2015-05-11 21:04:22,661 (qtp380224087-322) - http-outgoing-0 << "Date: Tue, 12 May 2015 02:04:22 GMT[\r][\n]"
DEBUG 2015-05-11 21:04:22,661 (qtp380224087-322) - http-outgoing-0 << "[\r][\n]"
DEBUG 2015-05-11 21:04:22,661 (qtp380224087-322) - http-outgoing-0 << "630[\r][\n]"
DEBUG 2015-05-11 21:04:22,661 (qtp380224087-322) - http-outgoing-0 << "<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">[\n]"
DEBUG 2015-05-11 21:04:22,661 (qtp380224087-322) - http-outgoing-0 << "<html xmlns="http://www.w3.org/1999/xhtml">[\n]"
DEBUG 2015-05-11 21:04:22,661 (qtp380224087-322) - http-outgoing-0 << " <head>[\n]"
DEBUG 2015-05-11 21:04:22,661 (qtp380224087-322) - http-outgoing-0 << "      <title>Web Script Status 404 - Not Found</title>[\n]"
DEBUG 2015-05-11 21:04:22,661 (qtp380224087-322) - http-outgoing-0 << "      <link rel="stylesheet" href="/alfresco/css/webscripts.css" type="text/css" />[\n]"
DEBUG 2015-05-11 21:04:22,661 (qtp380224087-322) - http-outgoing-0 << " </head>[\n]"
DEBUG 2015-05-11 21:04:22,662 (qtp380224087-322) - http-outgoing-0 << " <body>[\n]"
DEBUG 2015-05-11 21:04:22,662 (qtp380224087-322) - http-outgoing-0 << "      <div>[\n]"
DEBUG 2015-05-11 21:04:22,662 (qtp380224087-322) - http-outgoing-0 << "         <table>[\n]"
DEBUG 2015-05-11 21:04:22,662 (qtp380224087-322) - http-outgoing-0 << "            <tr>[\n]"
DEBUG 2015-05-11 21:04:22,662 (qtp380224087-322) - http-outgoing-0 << "               <td><img src="/alfresco/images/logo/AlfrescoLogo32.png" alt="Alfresco" /></td>[\n]"
DEBUG 2015-05-11 21:04:22,662 (qtp380224087-322) - http-outgoing-0 << "               <td><span class="title">Web Script Status 404 - Not Found</span></td>[\n]"
DEBUG 2015-05-11 21:04:22,662 (qtp380224087-322) - http-outgoing-0 << "            </tr>[\n]"
DEBUG 2015-05-11 21:04:22,662 (qtp380224087-322) - http-outgoing-0 << "         </table>[\n]"
DEBUG 2015-05-11 21:04:22,662 (qtp380224087-322) - http-outgoing-0 << "         <br/>[\n]"
DEBUG 2015-05-11 21:04:22,662 (qtp380224087-322) - http-outgoing-0 << "         <table>[\n]"
DEBUG 2015-05-11 21:04:22,662 (qtp380224087-322) - http-outgoing-0 << "            <tr><td>The Web Script <a href="%2Falfresco%2Fservice%2Fapi%2Fnode%2Fauth%2Fresolve%2Fadmin">/alfresco/service/api/node/auth/resolve/admin</a> has responded with a status of 404 - Not Found.</td></tr>[\n]"
DEBUG 2015-05-11 21:04:22,662 (qtp380224087-322) - http-outgoing-0 << "         </table>[\n]"
DEBUG 2015-05-11 21:04:22,662 (qtp380224087-322) - http-outgoing-0 << "         <br/>[\n]"
DEBUG 2015-05-11 21:04:22,662 (qtp380224087-322) - http-outgoing-0 << "         <table>[\n]"
DEBUG 2015-05-11 21:04:22,662 (qtp380224087-322) - http-outgoing-0 << "            <tr><td><b>404 Description:</b></td><td>Requested resource is not available.</td></tr>[\n]"
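
A wire log this long is easier to read when reduced to the request line and the response status. The sketch below does that for HttpClient-style debug lines like the ones above; the function name `summarize` and the regular expression are illustrative assumptions, and each log entry must be on a single line (entries wrapped by the mail client have to be joined first).

```python
import re

# Matches the quoted wire lines of the debug log, for example:
#   ... - http-outgoing-0 >> "GET /path HTTP/1.1[\r][\n]"
# Group 1 is the direction, group 2 the payload without the line-end markers.
WIRE = re.compile(r'http-outgoing-\d+ (>>|<<)\s*"(.*?)(?:\[\\r\])?\[\\n\]"\s*$')


def summarize(log_lines):
    """Return (first request line sent, first status line received)."""
    request = status = None
    for line in log_lines:
        m = WIRE.search(line)
        if not m:
            continue
        direction, payload = m.groups()
        if direction == ">>" and request is None and re.match(r"[A-Z]+ \S+ HTTP/", payload):
            request = payload
        elif direction == "<<" and status is None and payload.startswith("HTTP/"):
            status = payload
    return request, status


# Three entries copied from the log above, each joined onto one line.
sample = [
    r'DEBUG 2015-05-11 21:04:22,608 (qtp380224087-322) - http-outgoing-0 >> "GET /alfresco/service/api/node/auth/resolve/admin HTTP/1.1[\r][\n]"',
    r'DEBUG 2015-05-11 21:04:22,608 (qtp380224087-322) - http-outgoing-0 >> "Accept: application/json[\r][\n]"',
    r'DEBUG 2015-05-11 21:04:22,661 (qtp380224087-322) - http-outgoing-0 << "HTTP/1.1 404 Not Found[\r][\n]"',
]
request, status = summarize(sample)
print(request)
print(status)
```

For this log the summary pairs the GET on /alfresco/service/api/node/auth/resolve/admin with the 404 response.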





****** I tried manually invoking:
http://localhost:8080/alfresco/service/auth/resolve/admin.  It asked for
login/pswd and then returned:
[
  {
    "username" : "admin",
    "authorities" : [
        "GROUP_ALFRESCO_ADMINISTRATORS"
        ,
        "GROUP_EMAIL_CONTRIBUTORS"
        ,
        "GROUP_EVERYONE"
        ,
        "GROUP_site_swsdp"
        ,
        "GROUP_site_swsdp_SiteManager"
        ,
        "ROLE_ADMINISTRATOR"

    ]
  }

]



************ text from the http://localhost:8080/alfresco/service request

Clear dependency caches
POST /alfresco/service/caches/dependency/clear
Clears all the caches from the various configured dependency handlers.
Authentication: admin
Transaction: required
Format Style: any
Default Format: html
Lifecycle: internal
Id: caching/clearDependencies.post
Descriptor: classpath:webscripts/caching/clearDependencies.post.desc.xml

Package: /com/github/maoo/indexer/webscripts

Node Actions
GET /alfresco/service/node/actions/{storeProtocol}/{storeId}/{uuid}
Node Actions
Authentication: user
Transaction: required
Format Style: argument
Default Format: json
Id: com/github/maoo/indexer/webscripts/actions.get
Descriptor: classpath:alfresco/extension/templates/webscripts/com/github/maoo/indexer/webscripts/actions.get.desc.xml

Authority Resolve
GET /alfresco/service/auth/resolve/{username}
Renders out all authorities related with the given user(name)
Authentication: user
Transaction: required
Format Style: argument
Default Format: json
Id: com/github/maoo/indexer/webscripts/authresolve.get
Descriptor: classpath:alfresco/extension/templates/webscripts/com/github/maoo/indexer/webscripts/authresolve.get.desc.xml

Node Changes
GET /alfresco/service/node/changes/{storeProtocol}/{storeId}?lastTxnId={lastTxnId?}&lastAclChangesetId=${lastAclChangesetId}&indexingFilters=${indexingFilters?}&maxTxns=${maxTxns?}&maxAclChangesets=${maxAclChangesets?}
Node Changes
Authentication: user
Transaction: required
Format Style: argument
Default Format: json
Id: com/github/maoo/indexer/webscripts/changes.get
Descriptor: classpath:alfresco/extension/templates/webscripts/com/github/maoo/indexer/webscripts/changes.get.desc.xml

Node Details
GET /alfresco/service/node/details/{storeProtocol}/{storeId}/{uuid}
Node Details, including list of authorities with READ access on the node
Authentication: user
Transaction: required
Format Style: argument
Default Format: json
Id: com/github/maoo/indexer/webscripts/details.get
Descriptor: classpath:alfresco/extension/templates/webscripts/com/github/maoo/indexer/webscripts/details.get.desc.xml
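
As a rough cross-check, the path that was invoked manually and the path seen in the ManifoldCF log can be compared against the four indexer URI templates listed above. This is only a sketch: the matching is a simplification that treats each {placeholder} as one path segment, it is not how Alfresco itself resolves webscripts, and the names `TEMPLATES` and `matching_template` are hypothetical.

```python
import re

# URI templates of the indexer webscripts from the listing above,
# relative to the /alfresco/service root and without query strings.
TEMPLATES = [
    "/node/actions/{storeProtocol}/{storeId}/{uuid}",
    "/auth/resolve/{username}",
    "/node/changes/{storeProtocol}/{storeId}",
    "/node/details/{storeProtocol}/{storeId}/{uuid}",
]


def template_to_regex(template):
    """Turn each {placeholder} into a pattern for exactly one path segment."""
    literal_parts = re.split(r"\{[^}]+\}", template)
    return re.compile("^" + "[^/]+".join(re.escape(p) for p in literal_parts) + "$")


def matching_template(path, service_root="/alfresco/service"):
    """Return the first listed template that fits the path, or None."""
    if not path.startswith(service_root):
        return None
    rest = path[len(service_root):]
    for template in TEMPLATES:
        if template_to_regex(template).match(rest):
            return template
    return None


# Path invoked manually in the browser.
print(matching_template("/alfresco/service/auth/resolve/admin"))
# Path requested according to the ManifoldCF log.
print(matching_template("/alfresco/service/api/node/auth/resolve/admin"))
```

Only the first path fits one of the listed templates; the second, which carries an extra /api/node prefix, fits none of them, which is consistent with the 404 in the log.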


On Mon, May 11, 2015 at 9:16 AM, Maurizio Pillitu <maurizio@session.it>
wrote:

> Hi Deanna,
> sorry for the late reply.
>
> The source code of the AMP can be found at
> https://github.com/maoo/alfresco-indexer ; my first advice would be to
> check whether the new webscripts are accessible on Alfresco; you can access them via
> http://localhost:8090/alfresco/service and "browse all webscripts".
>
> If you find the Alfresco Indexer webscripts, you can try to invoke them
> (for example,
> http://localhost:8090/alfresco/service/node/changes/workspace/SpacesStore)
>
> If this works, it means Alfresco Indexer is responding correctly,
> therefore the issue lies on the Manifold side; as soon as you validate the
> mentioned steps, we can move forward with the debugging.
>
> Thanks,
>   mao
>
>
> On Mon, May 11, 2015 at 3:43 PM Karl Wright <daddywri@gmail.com> wrote:
>
>> Hi Deanna,
>>
>> I have contacted the author of the plugin, who works for Alfresco.  In
>> ManifoldCF we distribute only the AMP binary, so Maurizio would be the
>> right guy to answer any source questions.
>>
>> Thanks,
>> Karl
>>
>>
>> On Mon, May 11, 2015 at 9:27 AM, Delapasse, Deanna <
>> ddelapasse@oceaneering.com> wrote:
>>
>>> The Alfresco Webscripts connector requires an AMP installed into the
>>> Alfresco server to provide the webscripts the connector calls.  The
>>> connector's author pointed me to his GitHub source code, but it isn't
>>> working for me as-is (installs ok, but the included webscripts aren't
>>> accessible).  Are the AMP sources available from MCF?  And do you know the
>>> last Alfresco version that anyone used it with? Possibly I will need to
>>> tweak it to work with my Alfresco 4.2.f.
>>>
>>> thanks!
>>> Deanna
>>>
>>>
>>> On Wed, May 6, 2015 at 11:19 AM, Karl Wright <daddywri@gmail.com> wrote:
>>>
>>>> Here's the key finding:
>>>>
>>>> "Ok, the problem is because you only get to write the seeding query.
>>>> The query that fetches individual documents is hardwired.  I believe it is set
>>>> in opencmis in fact."
>>>>
>>>>
>>>> So basically, for the CMIS connector, you aren't writing the query that finds
>>>> the document data and metadata; you are writing the query that finds the set of documents
>>>> to index.  And the query you *need* to modify is in fact baked into some jar in Apache Chemistry,
>>>> which greatly limits the CMIS connector's utility for indexing metadata.
>>>>
>>>>
>>>> Is there any way you can use one of the two native Alfresco connectors
>>>> we supply?
>>>>
>>>>
>>>> Karl
>>>>
>>>>
>>>>
>>>> On Wed, May 6, 2015 at 12:10 PM, Karl Wright <daddywri@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Deanna,
>>>>>
>>>>> I vaguely recall that Apache Chemistry (which the CMIS connector
>>>>> relies on) running against Alfresco has some limitations where metadata is
>>>>> concerned.  I'm pretty sure there was an email exchange posted somewhere,
>>>>> so you might be able to dig it up here:
>>>>>
>>>>> http://www.mail-archive.com/user@manifoldcf.apache.org/index.html
>>>>>
>>>>> I'll look around and see.
>>>>>
>>>>> The other potential problem is your ElasticSearch configuration.  I
>>>>> don't know a lot about this myself.  I think it makes sense to try to
>>>>> figure out on which end the problem lies; if you can see in some log what
>>>>> actually gets posted to ElasticSearch for each document, that would help.
>>>>>
>>>>> Karl
>>>>>
>>>>>
>>>>> On Wed, May 6, 2015 at 11:42 AM, Delapasse, Deanna <
>>>>> ddelapasse@oceaneering.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I'm trying to use ManifoldCF to crawl my Alfresco repo (via the CMIS
>>>>>> connector) and push the results into ElasticSearch.  My users want to
>>>>>> search metadata (including custom) and content. I followed some tutorials
>>>>>> and got it running quickly BUT...regardless of my ElasticSearch mapping the
>>>>>> only CMIS metadata entity I can find in my indexed results is cmis:objectId.
>>>>>>
>>>>>> I have tried using various cmis queries (with 'select * ...' and with
>>>>>> 'select cmis:name, cmis:lastModifiedBy, ...').  I have verified my queries
>>>>>> and they definitely return metadata, but the data doesn't appear in
>>>>>> ElasticSearch.   I tried a simple attachment mapping and also a mapping
>>>>>> where I specifically list some of the cmis properties. Regardless of
>>>>>> mapping, my indexes look like this:
>>>>>>
>>>>>>
>>>>>> {
>>>>>>    "_index":"test",
>>>>>>    "_type":"file",
>>>>>>    "_id":"http://localhost:8080/alfresco/api/-default-/public/cmis/versions/1.1/atom/content/10.0.txt?id=2555a540-a5b3-4c27-90f6-c89b6742bd4f%3B1.0",
>>>>>>    "_version":2,
>>>>>>    "_score":1,
>>>>>>    "_source":{
>>>>>>       "cmis:objectId":"2555a540-a5b3-4c27-90f6-c89b6742bd4f;1.0",
>>>>>>       "allow_token_document":"__nosecurity__",
>>>>>>       "deny_token_document":"__nosecurity__",
>>>>>>       "allow_token_share":"__nosecurity__",
>>>>>>       "deny_token_share":"__nosecurity__",
>>>>>>       "allow_token_parent":"__nosecurity__",
>>>>>>       "deny_token_parent":"__nosecurity__",
>>>>>>       "file":{
>>>>>>          "_content_type":"text/plain",
>>>>>>          "_name":"10.0.txt",
>>>>>>          "_content":"DQpJIGFtIGFuIEFsZnJlc2NvIGZpbGUuDQo="
>>>>>>       }
>>>>>>    }
>>>>>> }
>>>>>>
>>>>>> The ES results are good and I can search perfectly by content &
>>>>>> cmis:objectId.  I have enabled debugging and no errors appear in the log.
>>>>>> *What do I have to DO to get cmis:name, cmis:lastModifiedBy and other properties
>>>>>> to appear?*
>>>>>>
>>>>>> Thanks in advance!  This product is very simple to use and has
>>>>>> potential to be a HUGE help to us!!!
>>>>>>
>>>>>> Deanna
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
