aries-dev mailing list archives

From "Alex Weirig (JIRA)" <>
Subject [jira] [Commented] (ARIES-1804) Timeout due to connection loss in RSA fastbin provider?
Date Sat, 02 Jun 2018 13:26:10 GMT


Alex Weirig commented on ARIES-1804:

I think I finally found the source of the problem...

My service that does the LDAP processing uses the Apache LDAP API, and I'm creating a
pool to get the LDAP connections. When I was done with an LDAP connection I was unbinding it
and closing it ... but it seems the LDAP connection also needs to be released back to the
pool.

The default maximum number of LDAP connections in the pool is 8, so after 8 calls no LDAP
connection was left in the pool. From what I see, no stacktrace is produced and, if I read
correctly, the default wait time for an LDAP connection is infinite.
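
The exhaustion described above can be reproduced with a small stand-in pool. This is a
minimal sketch and not the Apache LDAP API itself: TinyPool, Conn, runLeaky and runFixed are
hypothetical names, and the borrow uses a short timeout only so the demonstration terminates
instead of waiting forever. In the Apache LDAP API the corresponding calls would presumably be
LdapConnectionPool.getConnection() and LdapConnectionPool.releaseConnection(LdapConnection).

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

public class PoolExhaustionSketch {

    /** Hypothetical stand-in for a pooled LDAP connection. */
    static final class Conn {
        final int id;
        Conn(int id) { this.id = id; }
    }

    /** Minimal fixed-size pool (an assumption for illustration, not the real LDAP pool). */
    static final class TinyPool {
        private final BlockingQueue<Conn> idle;

        TinyPool(int size) {
            idle = new ArrayBlockingQueue<>(size);
            for (int i = 0; i < size; i++) {
                idle.add(new Conn(i));
            }
        }

        /** Borrows a connection, waiting at most timeoutMillis; returns null on timeout. */
        Conn borrow(long timeoutMillis) {
            try {
                return idle.poll(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            }
        }

        /** Hands a borrowed connection back so later callers can reuse it. */
        void release(Conn c) {
            idle.offer(c);
        }

        int idleCount() {
            return idle.size();
        }
    }

    /** Borrows on every call and never releases; returns how many calls got a connection. */
    static int runLeaky(TinyPool pool, int calls) {
        int served = 0;
        for (int i = 0; i < calls; i++) {
            Conn c = pool.borrow(50);
            if (c == null) {
                break; // pool exhausted; with an infinite wait this call would hang
            }
            served++; // connection is used and "closed", but never returned to the pool
        }
        return served;
    }

    /** Same loop, but every borrowed connection is released in a finally block. */
    static int runFixed(TinyPool pool, int calls) {
        int served = 0;
        for (int i = 0; i < calls; i++) {
            Conn c = pool.borrow(50);
            if (c == null) {
                break;
            }
            try {
                served++;
            } finally {
                pool.release(c);
            }
        }
        return served;
    }

    public static void main(String[] args) {
        System.out.println("leaky: " + runLeaky(new TinyPool(8), 20)); // prints "leaky: 8"
        System.out.println("fixed: " + runFixed(new TinyPool(8), 20)); // prints "fixed: 20"
    }
}
```

With a pool of 8, the leaky loop serves exactly 8 calls and then stalls, which matches the
behaviour seen after the eighth run of the scheduled job. Releasing in a finally block keeps
the pool full regardless of how the work in between ends.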

I was able to reproduce the problem by calling my scheduled job 8 times. I've now added a
call to release the connection before I unbind and close it, and so far I haven't seen the
problem show up.

I'll let my job run over the weekend, but I'm confident that's the cause of the problem and
obviously not Karaf, ZooKeeper or fastbin. The timeout that occurred further down the line
was just a consequence of the LDAP API not receiving a connection from the pool and sitting
there waiting forever. I will close the ticket on Monday if no surprise shows up over
the weekend.

I suppose I will also fine-tune the pool's default values with a little more attention.

Thank you very much for your support.


> Timeout due to connection loss in RSA fastbin provider?
> -------------------------------------------------------
>                 Key: ARIES-1804
>                 URL:
>             Project: Aries
>          Issue Type: Bug
>          Components: Remote Service Admin
>    Affects Versions: rsa-1.12.0
>         Environment: Karaf 4.2.0
> RSA 1.12.0
> zookeeper 3.4.12
> java 1.8.0_172-b11
> RHEL 7.5
>            Reporter: Alex Weirig
>            Priority: Critical
>         Attachments: AuthenticationServiceImpl.txt, LoginView.txt, stacktrace.txt, zoo.cfg.txt
> Hello,
> I'm running two karaf (4.2.0) servers, one is running the frontend of my application,
> the second one is running the backend.
> The backend services are published to 3 clustered zookeeper (3.4.12) servers. In karaf
> I have deployed the following RSA features:
> karaf@appsrvtlk()> feature:list | grep rsa
> aries-rsa-core │ 1.12.0 │ │ Started │ aries-rsa-1.12.0 │
> aries-rsa-provider-tcp │ 1.12.0 │ │ Uninstalled │ aries-rsa-1.12.0 │
> aries-rsa-provider-fastbin │ 1.12.0 │ x │ Started │ aries-rsa-1.12.0 │
> aries-rsa-discovery-local │ 1.12.0 │ │ Uninstalled │ aries-rsa-1.12.0 │
> aries-rsa-discovery-config │ 1.12.0 │ │ Uninstalled │ aries-rsa-1.12.0 │
> aries-rsa-discovery-zookeeper │ 1.12.0 │ x │ Started │ aries-rsa-1.12.0 │
> aries-rsa-discovery-zookeeper-server │ 1.12.0 │ │ Uninstalled │ aries-rsa-1.12.0 │
> When I start my karaf servers everything is working fine and my frontend can call my
> backend service and gets the result. But after some time (I can't figure out when) it seems
> that the connection between karaf and zookeeper gets lost and I'm getting a timeout when
> I call my remote service, even though all the servers (karaf and zookeepers) are still
> available and responding. Exhibitor shows no apparent issues with the zookeepers.
> I have attached:
>  * relevant parts of my LoginView UI where I declared the @Reference to my service and
>    where I call the remote service
>  * relevant parts of my AuthenticationService implementation that should be called on
>    the remote karaf
>  * the stacktrace that I'm getting on the frontend karaf when the timeout occurs
>  * my zoo.cfg file
> From the stacktrace one can see that the LoginView has a non-null fastbin proxy handler
> for the authentication service, but that after 5 minutes a timeout occurs and there is no
> line in the log that shows that the remote service was actually called.
> Many thanks in advance for your support.
> Kind regards,
> Alex
