tomee-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Romain Manni-Bucau <rmannibu...@gmail.com>
Subject Re: CXF and WebSocket messages are merged together
Date Thu, 16 Jun 2016 07:14:56 GMT
Hmm, you should have onDisconnect() or onThrowable() hook called. Question
is: is it an atmosphere bug or was tomcat missing an event (there was a
rare case where onClose() of websocket can be missed but atmosphere should
protect itself against it wrapping the calls to ensure it detects it lazily
too.


Romain Manni-Bucau
@rmannibucau <https://twitter.com/rmannibucau> |  Blog
<https://blog-rmannibucau.rhcloud.com> | Old Wordpress Blog
<http://rmannibucau.wordpress.com> | Github <https://github.com/rmannibucau> |
LinkedIn <https://www.linkedin.com/in/rmannibucau> | Tomitriber
<http://www.tomitribe.com> | JavaEE Factory
<https://javaeefactory-rmannibucau.rhcloud.com>

2016-06-16 9:05 GMT+02:00 Ihsan Ecemis <miecemis@gmail.com>:

>
> > On Jun 16, 2016, at 2:37 AM, Romain Manni-Bucau <rmannibucau@gmail.com>
> wrote:
> >
> > 2016-06-16 4:09 GMT+02:00 Ihsan Ecemis <miecemis@gmail.com <mailto:
> miecemis@gmail.com>>:
> >
> >>
> >> Thanks for your answer Romain.  I would appreciate any links to inspire
> me
> >> how to do that, especially for Twilio API calls.  I don’t know how to
> use
> >> sessionIds to track connections.
> >>
> >>
> > Idea was to use websocket ot http session, I don't know if you activated
> it
> > (atmosphere can need session support for long polling).
>
> We have this in our web.xml:
>
>   <listener>
>     <listener-class>org.atmosphere.cpr.SessionSupport</listener-class>
>   </listener>
>
>   <context-param>
>     <param-name>org.atmosphere.cpr.sessionSupport</param-name>
>     <param-value>true</param-value>
>   </context-param>
>
> as documented here:
> https://github.com/Atmosphere/atmosphere/wiki/Enabling-HttpSession-Support
> <
> https://github.com/Atmosphere/atmosphere/wiki/Enabling-HttpSession-Support
> >
>
> And we can get sessionId with AtmosphereResource.session().
>
> But apparently since Atmosphere doesn’t notify us about the disconnect we
> can still write to the underlying resource, which is not cleared when
> another producer starts using it.
>
> BTW, we are using @ManagedService (
> https://github.com/Atmosphere/atmosphere/wiki/Getting-Started-with-the-ManagedService-Annotation#managedservice
> <
> https://github.com/Atmosphere/atmosphere/wiki/Getting-Started-with-the-ManagedService-Annotation#managedservice>)
> which comes with HeartbeatInterceptor, so I don’t understand how the Server
> doesn’t detect the disconnect for 10 minutes...
>
>
> >>
> >> Here are 2 new pieces of information:
> >>
> >> (1) WebSocket connections were shaky for some users and Atmosphere
> >> javascript client had a failover to long-polling.
> >>
> >> So yesterday, we disabled long-polling failover and since then we didn't
> >> have any errors from Twilio.  Probably it is too early to claim victory
> but
> >> I think it is an interesting observation.
> >>
> >> If we don’t see any other errors, I will think that the client was
> >> switching from websocket to long-polling, and then getting disconnected.
> >> The server (Atmosphere?) didn’t realize this disconnect and kept on
> writing
> >> to the same connection. And at some point that same connection was
> handed
> >> over to the Twilio API call…  (which should not happen!  must be
> prevented
> >> by Tomcat?  nginx?  CXF?  Atmosphere?)
> >>
> >> I don’t know how the Socket Connection Pool works at low-level so this
> is
> >> only my speculation, both from my observation there is something really
> >> wrong here.
> >>
> >>
> >> (2) I looked at Twilio error messages more carefully,  and each error I
> >> looked so far contained WebSockets messages addressed to the same user
> >> (This is very manual process, I check our logs for each task_id posted
> to
> >> Twilio in JSON).
> >>
> >> In one case there were 5 WebSockets messages addressed to the same user,
> >> prepended to Twilio XML.   These 5 messages spanned 576 seconds, i.e.
> >> almost 10 minutes!  (Each WebSockets messages has a  timestamp).   So we
> >> kept on writing to the same connection for 10 minutes after the
> WebSocket
> >> was disconnected   ;-(
> >>
> >>
> > Can it be a bad mobile connection?
>
> The user who had lots of disconnects the other day was working in a dorm
> room over WiFi.  Not mobile, but not great connection either.
>
> Maybe the user was continuously connecting and disconnecting,  and
> Atmosphere was trying to keep the underlying resource alive?
>
>
>
>
> >>
> >>
> >>> On Jun 15, 2016, at 7:26 PM, Romain Manni-Bucau <rmannibucau@gmail.com
> >
> >> wrote:
> >>>
> >>> Is debugging using the sessionId to track connections an option?
> >>> Le 16 juin 2016 01:12, "Ihsan Ecemis" <miecemis@gmail.com> a écrit
:
> >>>
> >>>>
> >>>> TL;DR:  We are having this weird, non-reproducable problem in a
> complex
> >>>> system with lots of moving parts. I posted this question to Atmosphere
> >>>> mailing list but didn’t get useful answers.  I would appreciate if
> >> anyone
> >>>> can tell me whether there is any chance what we observe can be a bug
> >> with
> >>>> Tomcat/TomEE.  Any other help/tips would be greatly appreciated.
> Thanks
> >>>>
> >>>>
> >>>> Problem:
> >>>>
> >>>> We experience some weird problems with WebSockets under Atmosphere,
> >>>> running under TomEE 7.0.0 — the whole stack is listed below.
> >>>>
> >>>>
> >>>> One of our JAX-RS API endpoints (something like /twilio/xxxx) is
> called
> >> by
> >>>> Twilio.  Twilio expects  the following XML response:
> >>>>
> >>>> <?xml version="1.0" encoding="UTF-8" standalone="yes"?><Response/>
> >>>>
> >>>> That system was working fine for months.  We were receiving Text
> >> Messages
> >>>> with no errors from Twilio.
> >>>>
> >>>>
> >>>> Then, we added WebSockets at a different endpoint (something like
> >>>> /ws/yyyy).  Using our AngularJS app, our users started communicating
> >> with
> >>>> our backend in JSON.  A sample message our backend wrote to a
> WebSocket
> >>>> client:
> >>>>
> >>>>
> >>>>
> >>
> {"thread_id":"Server-2702","session_id":null,"command":"cancel_caleasy_item","data":{"text_message_id":1126,"task_id":10134},"time":1465409434004}
> >>>>
> >>>>
> >>>>
> >>>> After a while, Twilio started sending us "Schema validation warning”
> >>>> emails.  When I went to Twilio Dashboard,  I saw that they received
> the
> >>>> following response to their API call:
> >>>>
> >>>>
> >>
> {"thread_id":”Server-1732","session_id":null,"command":"cancel_caleasy_item","data":{"text_message_id”:734,"task_id”:9345},"time":
> >>>> 1465409122781}<?xml version="1.0" encoding="UTF-8"
> >>>> standalone="yes"?><Response/>
> >>>>
> >>>>
> >>>> So Twilio received a message that was meant to be sent to a WebSocket
> >>>> user, prepended to the usual Twilio response!  This looks crazy, and
> not
> >>>> unacceptable of course.
> >>>>
> >>>> This happens a few times, sometimes a dozen times a day.  We put
> >> together
> >>>> a test system but could not reproduce the problem. (Our test system
> was
> >> not
> >>>> exactly the same system as our production in many respects
> >> unfortunately)
> >>>>
> >>>> Sometimes Twilio receives more than one WebSocket message, again an
> >> actual
> >>>> example (redacted just a little for brevity):
> >>>>
> >>>>
> >>
> {"thread_id":"Server-2722","session_id":null,"command":"cancel_caleasy_item","data":{"text_message_id":1966,"task_id":11112},"time":1465409571004}{"thread_id":"Server-2723","session_id":null,"command":"new_caleasy_item","data":{"text_message_id":1453,"text_message_created_at":null,"meal_type_id":6,"body":"Edamame","report_date":"2016-06-03","caleasy_media_list":[],"task_id":11114,"expiration_millis":180000},"time":1465409582010}<?xml
> >>>> version="1.0" encoding="UTF-8" standalone="yes"?><Response/>
> >>>>
> >>>>
> >>>> In this example, our Backend wrote 2 WebSocket messages, probably
> from 2
> >>>> separate threads, intended for 2 separate WebSocket users, and then
it
> >>>> wrote the response to Twilio.  It looks like we are writing to a
> >> connection
> >>>> nobody is reading from, and appending more messages to it, and then
> the
> >>>> whole thing is returned to Twilio when they make the API call…
> >>>>
> >>>>
> >>>> I don’t understand how this could happening.  In nginx?  (I really
> doubt
> >>>> nginx can have such a huge bug)   In Tomcat?   A problem between nginx
> >> and
> >>>> Tomcat that only affects WebSockets when regular API calls are also
> >> used?
> >>>> I have no idea and I don’t know where to start to investigate   ;-(
> >>>>
> >>>>
> >>>> Here is our stack:
> >>>>
> >>>> *) AWS Linux 4.1.10-17.31, TomEE 7.0.0, nginx as proxy server on the
> >> same
> >>>> AWS instance
> >>>> *) SSL ends on nginx. nginx is configured for WebSockets following
> >> online
> >>>> instructions.
> >>>> *) We serve JAX-RS API calls via Apache CXF (version 3.1.6), using
> >>>> standard @Path("/....") annotations.
> >>>> *) We have just one WebSocket service using Atmosphere (version
> 2.4.2),
> >>>> defined as @ManagedService(path = "/caleasy”)
> >>>> *) WebSocket service is used by our AngularJS based UI
> >>>> *) No other specific setup with CXF and Atmosphere
> >>>>
> >>>> In our web.xml, we configure Atmosphere as follows:
> >>>>
> >>>> <servlet>
> >>>>   <description>AtmosphereServlet</description>
> >>>>   <servlet-name>AtmosphereServlet</servlet-name>
> >>>>   <servlet-class>org.atmosphere.cpr.AtmosphereServlet</servlet-class>
> >>>> </servlet>
> >>>>
> >>>> <servlet-mapping>
> >>>>   <servlet-name>AtmosphereServlet</servlet-name>
> >>>>   <url-pattern>/ws/*</url-pattern>
> >>>> </servlet-mapping>
> >>>>
> >>>> <listener>
> >>>>   <listener-class>org.atmosphere.cpr.SessionSupport</listener-class>
> >>>> </listener>
> >>>>
> >>>>
> >>>>
> >>>> PS: Our TomEE logs have the following Exceptions, but I am not sure
> >>>> whether they are normal or suspicious:
> >>>>
> >>>> 2016-06-09 13:52:41,832 ERROR [nio-8080-exec-9] o.a.c.JSR356Endpoint
> >>>>  -
> >>>> java.io.EOFException: null
> >>>>       at
> >>>>
> >>
> org.apache.tomcat.util.net.NioEndpoint$NioSocketWrapper.fillReadBuffer(NioEndpoint.java:1267)
> >>>>       at
> >>>>
> >>
> org.apache.tomcat.util.net.NioEndpoint$NioSocketWrapper.isReadyForRead(NioEndpoint.java:1176)
> >>>>       at
> >>>>
> >>
> org.apache.tomcat.websocket.server.WsFrameServer.onDataAvailable(WsFrameServer.java:58)
> >>>>       at
> >>>>
> >>
> org.apache.tomcat.websocket.server.WsHttpUpgradeHandler.upgradeDispatch(WsHttpUpgradeHandler.java:148)
> >>>>       at
> >>>>
> >>
> org.apache.coyote.http11.upgrade.UpgradeProcessorInternal.dispatch(UpgradeProcessorInternal.java:54)
> >>>>       at
> >>>>
> >>
> org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:53)
> >>>>       at
> >>>>
> >>
> org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:788)
> >>>>       at
> >>>>
> >>
> org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1485)
> >>>>       at
> >>>>
> >>
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> >>>>       at
> >>>>
> >>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> >>>>       at
> >>>>
> >>
> org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
> >>>>       at java.lang.Thread.run(Thread.java:745)
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message