drill-user mailing list archives

From Anup Tiwari <anup.tiw...@games24x7.com>
Subject Re: [Drill 1.9.0] : [CONNECTION ERROR] :- (user client) closed unexpectedly. Drillbit down?
Date Sat, 24 Dec 2016 04:50:54 GMT
Hi Sudheesh,

Please find the answers below:

1. Four nodes in total (3 datanodes, 1 namenode).
2. Only one query; it is part of the daily dump and runs early in the
morning.

As @chun mentioned, this seems similar to DRILL-4708. Is there any update on
the progress of that ticket?

On 22-Dec-2016 12:13 AM, "Sudheesh Katkam" <skatkam@maprtech.com> wrote:

Two more questions:

(1) How many nodes in your cluster?
(2) How many queries are running when the failure is seen?

If you have multiple large queries running at the same time, the load on
the system could cause those failures (which are heartbeat related).

The two options I suggested decrease the parallelism of stages in a query.
This means less load on the system, but slower execution.

System-level options affect all queries; session-level options affect only
queries on a specific connection. Not sure what is preferred in your
environment.
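
For illustration, this is roughly how the two scopes would look when set from sqlline before running the daily query. It is a minimal sketch: the numeric values are placeholders chosen for the example, not recommendations from this thread, and suitable values depend on the cluster.

```sql
-- Session level: affects only queries on the current connection.
-- The values 2 and 1000000 are illustrative placeholders.
ALTER SESSION SET `planner.width.max_per_node` = 2;   -- decrease to lower parallelism
ALTER SESSION SET `planner.slice_target` = 1000000;   -- increase to lower parallelism

-- System level: affects all queries, on every connection.
ALTER SYSTEM SET `planner.width.max_per_node` = 2;
ALTER SYSTEM SET `planner.slice_target` = 1000000;

-- Inspect the values currently in effect.
SELECT * FROM sys.options
WHERE name IN ('planner.width.max_per_node', 'planner.slice_target');
```

For a single scheduled query such as a daily dump, setting the options at session level in the same script keeps the change from slowing down other workloads.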

Also, you may be interested in metrics. More info here:

http://drill.apache.org/docs/monitoring-metrics/

Thank you,
Sudheesh

> On Dec 21, 2016, at 4:31 AM, Anup Tiwari <anup.tiwari@games24x7.com> wrote:
>
> @sudheesh, yes, the drillbit is running (on datanodeN/10.*.*.5:31010).
>
> Can you tell me how this will impact the query, and whether I have to set
> it at session level or system level?
>
>
>
> Regards,
> *Anup Tiwari*
>
> On Tue, Dec 20, 2016 at 11:59 PM, Chun Chang <cchang@maprtech.com> wrote:
>
>> I am pretty sure this is the same as DRILL-4708.
>>
>> On Tue, Dec 20, 2016 at 10:27 AM, Sudheesh Katkam <skatkam@maprtech.com>
>> wrote:
>>
>>> Is the drillbit service (running on datanodeN/10.*.*.5:31010) actually
>>> down when the error is seen?
>>>
>>> If not, try lowering parallelism using these two session options, before
>>> running the queries:
>>>
>>> planner.width.max_per_node (decrease this)
>>> planner.slice_target (increase this)
>>>
>>> Thank you,
>>> Sudheesh
>>>
>>>> On Dec 20, 2016, at 12:28 AM, Anup Tiwari <anup.tiwari@games24x7.com> wrote:
>>>>
>>>> Hi Team,
>>>>
>>>> We run Drill automation scripts on a daily basis, and we often see
>>>> queries fail with the error below. I also came across DRILL-4708
>>>> <https://issues.apache.org/jira/browse/DRILL-4708>, which seems similar.
>>>> Can anyone give me an update on it, or a workaround to avoid this issue?
>>>>
>>>> *Stack trace:*
>>>>
>>>> Error: CONNECTION ERROR: Connection /10.*.*.1:41613 <--> datanodeN/10.*.*.5:31010 (user client) closed unexpectedly. Drillbit down?
>>>>
>>>>
>>>> [Error Id: 5089f2f1-0dfd-40f8-9fa0-8276c08be53f ] (state=,code=0)
>>>> java.sql.SQLException: CONNECTION ERROR: Connection /10.*.*.1:41613 <--> datanodeN/10.*.*.5:31010 (user client) closed unexpectedly. Drillbit down?
>>>>
>>>>
>>>> [Error Id: 5089f2f1-0dfd-40f8-9fa0-8276c08be53f ]
>>>>       at org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:232)
>>>>       at org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:275)
>>>>       at org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:1943)
>>>>       at org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:76)
>>>>       at org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:473)
>>>>       at org.apache.drill.jdbc.impl.DrillMetaImpl.prepareAndExecute(DrillMetaImpl.java:465)
>>>>       at org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:477)
>>>>       at org.apache.drill.jdbc.impl.DrillConnectionImpl.prepareAndExecuteInternal(DrillConnectionImpl.java:169)
>>>>       at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:109)
>>>>       at org.apache.calcite.avatica.AvaticaStatement.execute(AvaticaStatement.java:121)
>>>>       at org.apache.drill.jdbc.impl.DrillStatementImpl.execute(DrillStatementImpl.java:101)
>>>>       at sqlline.Commands.execute(Commands.java:841)
>>>>       at sqlline.Commands.sql(Commands.java:751)
>>>>       at sqlline.SqlLine.dispatch(SqlLine.java:746)
>>>>       at sqlline.SqlLine.runCommands(SqlLine.java:1651)
>>>>       at sqlline.Commands.run(Commands.java:1304)
>>>>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>>>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>       at java.lang.reflect.Method.invoke(Method.java:498)
>>>>       at sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:36)
>>>>       at sqlline.SqlLine.dispatch(SqlLine.java:742)
>>>>       at sqlline.SqlLine.initArgs(SqlLine.java:553)
>>>>       at sqlline.SqlLine.begin(SqlLine.java:596)
>>>>       at sqlline.SqlLine.start(SqlLine.java:375)
>>>>       at sqlline.SqlLine.main(SqlLine.java:268)
>>>> Caused by: org.apache.drill.common.exceptions.UserException: CONNECTION ERROR: Connection /10.*.*.1:41613 <--> datanodeN/10.*.*.5:31010 (user client) closed unexpectedly. Drillbit down?
>>>>
>>>>
>>>> [Error Id: 5089f2f1-0dfd-40f8-9fa0-8276c08be53f ]
>>>>       at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:543)
>>>>       at org.apache.drill.exec.rpc.user.QueryResultHandler$ChannelClosedHandler$1.operationComplete(QueryResultHandler.java:373)
>>>>       at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
>>>>       at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603)
>>>>       at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563)
>>>>       at io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:406)
>>>>       at io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:82)
>>>>       at io.netty.channel.AbstractChannel$CloseFuture.setClosed(AbstractChannel.java:943)
>>>>       at io.netty.channel.AbstractChannel$AbstractUnsafe.doClose0(AbstractChannel.java:592)
>>>>       at io.netty.channel.AbstractChannel$AbstractUnsafe.close(AbstractChannel.java:584)
>>>>       at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.closeOnRead(AbstractNioByteChannel.java:71)
>>>>       at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.handleReadException(AbstractNioByteChannel.java:89)
>>>>       at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:162)
>>>>       at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>>>>       at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>>>>       at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>>>>       at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>>>>       at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>>>>       at java.lang.Thread.run(Thread.java:745)
>>>>
>>>>
>>>> Regards,
>>>> *Anup Tiwari*
>>>
>>>
>>
