drill-user mailing list archives

From Sudheesh Katkam <skat...@maprtech.com>
Subject Re: [Drill 1.9.0] : [CONNECTION ERROR] :- (user client) closed unexpectedly. Drillbit down?
Date Wed, 21 Dec 2016 18:43:12 GMT
Two more questions:

(1) How many nodes are in your cluster?
(2) How many queries are running when the failure is seen?

If you have multiple large queries running at the same time, the load on the system could
cause those failures (which are heartbeat related).

The two options I suggested decrease the parallelism of stages in a query. This means less
load on the system, but slower execution.

System-level options affect all queries, while session-level options affect only queries on a
specific connection. I am not sure which is preferable in your environment.
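
For reference, here is a minimal sketch of how the two options could be set at either scope
over JDBC. The class and method names (DrillParallelismOptions, buildStatements, apply) are
hypothetical, and the values 4 and 1000000 are placeholders rather than recommendations, so
pick values that suit your cluster. A session-level setting only helps if it is issued on the
same connection that later runs the queries.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

public class DrillParallelismOptions {

    // Builds the two ALTER statements for the given scope ("SESSION" or "SYSTEM").
    static List<String> buildStatements(String scope, long maxWidthPerNode, long sliceTarget) {
        if (!scope.equals("SESSION") && !scope.equals("SYSTEM")) {
            throw new IllegalArgumentException("scope must be SESSION or SYSTEM");
        }
        List<String> statements = new ArrayList<>();
        // Lower value = fewer parallel fragments per node.
        statements.add(String.format(
            "ALTER %s SET `planner.width.max_per_node` = %d", scope, maxWidthPerNode));
        // Higher value = more rows required before a stage is parallelized further.
        statements.add(String.format(
            "ALTER %s SET `planner.slice_target` = %d", scope, sliceTarget));
        return statements;
    }

    // Applies the options on an already-open connection (assumed to be a Drill JDBC connection).
    static void apply(Connection conn, String scope, long maxWidthPerNode, long sliceTarget)
            throws SQLException {
        try (Statement st = conn.createStatement()) {
            for (String sql : buildStatements(scope, maxWidthPerNode, sliceTarget)) {
                st.execute(sql);
            }
        }
    }

    public static void main(String[] args) {
        // Placeholder values, printed in a form that can be pasted into sqlline.
        for (String sql : buildStatements("SESSION", 4, 1000000)) {
            System.out.println(sql + ";");
        }
    }
}
```

In a sqlline-driven script like yours, the printed statements can simply be placed at the top
of the script, before the queries.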

Also, you may be interested in metrics. More info here:

http://drill.apache.org/docs/monitoring-metrics/

Thank you,
Sudheesh

> On Dec 21, 2016, at 4:31 AM, Anup Tiwari <anup.tiwari@games24x7.com> wrote:
> 
> @sudheesh, yes, the drillbit is running (on datanodeN/10.*.*.5:31010).
> 
> Can you tell me how this will impact queries, and whether I have to set this
> at the session level or the system level?
> 
> 
> 
> Regards,
> *Anup Tiwari*
> 
> On Tue, Dec 20, 2016 at 11:59 PM, Chun Chang <cchang@maprtech.com> wrote:
> 
>> I am pretty sure this is the same as DRILL-4708.
>> 
>> On Tue, Dec 20, 2016 at 10:27 AM, Sudheesh Katkam <skatkam@maprtech.com> wrote:
>> 
>>> Is the drillbit service (running on datanodeN/10.*.*.5:31010) actually
>>> down when the error is seen?
>>> 
>>> If not, try lowering parallelism using these two session options, before
>>> running the queries:
>>> 
>>> planner.width.max_per_node (decrease this)
>>> planner.slice_target (increase this)
>>> 
>>> Thank you,
>>> Sudheesh
>>> 
>>>> On Dec 20, 2016, at 12:28 AM, Anup Tiwari <anup.tiwari@games24x7.com> wrote:
>>>> 
>>>> Hi Team,
>>>> 
>>>> We run some Drill automation scripts on a daily basis, and we often
>>>> see queries fail with the error below. I also came across DRILL-4708
>>>> <https://issues.apache.org/jira/browse/DRILL-4708>, which seems similar.
>>>> Can anyone give me an update on that, or a workaround to avoid this
>>>> issue?
>>>> 
>>>> *Stack Trace :-*
>>>> 
>>>> Error: CONNECTION ERROR: Connection /10.*.*.1:41613 <--> datanodeN/10.*.*.5:31010 (user client) closed unexpectedly. Drillbit down?
>>>> 
>>>> 
>>>> [Error Id: 5089f2f1-0dfd-40f8-9fa0-8276c08be53f ] (state=,code=0)
>>>> java.sql.SQLException: CONNECTION ERROR: Connection /10.*.*.1:41613 <--> datanodeN/10.*.*.5:31010 (user client) closed unexpectedly. Drillbit down?
>>>> 
>>>> 
>>>> [Error Id: 5089f2f1-0dfd-40f8-9fa0-8276c08be53f ]
>>>>       at org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:232)
>>>>       at org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:275)
>>>>       at org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:1943)
>>>>       at org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:76)
>>>>       at org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:473)
>>>>       at org.apache.drill.jdbc.impl.DrillMetaImpl.prepareAndExecute(DrillMetaImpl.java:465)
>>>>       at org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:477)
>>>>       at org.apache.drill.jdbc.impl.DrillConnectionImpl.prepareAndExecuteInternal(DrillConnectionImpl.java:169)
>>>>       at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:109)
>>>>       at org.apache.calcite.avatica.AvaticaStatement.execute(AvaticaStatement.java:121)
>>>>       at org.apache.drill.jdbc.impl.DrillStatementImpl.execute(DrillStatementImpl.java:101)
>>>>       at sqlline.Commands.execute(Commands.java:841)
>>>>       at sqlline.Commands.sql(Commands.java:751)
>>>>       at sqlline.SqlLine.dispatch(SqlLine.java:746)
>>>>       at sqlline.SqlLine.runCommands(SqlLine.java:1651)
>>>>       at sqlline.Commands.run(Commands.java:1304)
>>>>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>>>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>       at java.lang.reflect.Method.invoke(Method.java:498)
>>>>       at sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:36)
>>>>       at sqlline.SqlLine.dispatch(SqlLine.java:742)
>>>>       at sqlline.SqlLine.initArgs(SqlLine.java:553)
>>>>       at sqlline.SqlLine.begin(SqlLine.java:596)
>>>>       at sqlline.SqlLine.start(SqlLine.java:375)
>>>>       at sqlline.SqlLine.main(SqlLine.java:268)
>>>> Caused by: org.apache.drill.common.exceptions.UserException: CONNECTION ERROR: Connection /10.*.*.1:41613 <--> datanodeN/10.*.*.5:31010 (user client) closed unexpectedly. Drillbit down?
>>>> 
>>>> 
>>>> [Error Id: 5089f2f1-0dfd-40f8-9fa0-8276c08be53f ]
>>>>       at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:543)
>>>>       at org.apache.drill.exec.rpc.user.QueryResultHandler$ChannelClosedHandler$1.operationComplete(QueryResultHandler.java:373)
>>>>       at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
>>>>       at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603)
>>>>       at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563)
>>>>       at io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:406)
>>>>       at io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:82)
>>>>       at io.netty.channel.AbstractChannel$CloseFuture.setClosed(AbstractChannel.java:943)
>>>>       at io.netty.channel.AbstractChannel$AbstractUnsafe.doClose0(AbstractChannel.java:592)
>>>>       at io.netty.channel.AbstractChannel$AbstractUnsafe.close(AbstractChannel.java:584)
>>>>       at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.closeOnRead(AbstractNioByteChannel.java:71)
>>>>       at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.handleReadException(AbstractNioByteChannel.java:89)
>>>>       at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:162)
>>>>       at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>>>>       at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>>>>       at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>>>>       at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>>>>       at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>>>>       at java.lang.Thread.run(Thread.java:745)
>>>> 
>>>> 
>>>> Regards,
>>>> *Anup Tiwari*
>>> 
>>> 
>> 

