drill-user mailing list archives

From Robert Hou <r...@mapr.com>
Subject Re: ***UNCHECKED*** Re: Query Error on PCAP over MapR FS
Date Tue, 12 Sep 2017 22:52:03 GMT
I asked a couple of Drill developers.  We don't have much experience with PCAP yet.  Takeo,
can you file a Jira for this, and include the information below?  The error message mentions
a bad magic number, which Drill sometimes uses to help determine the file format.


Also, it appears that you have tried querying your data as many small files
rather than as one large file.  This is the preferred approach, and it seems
to work for you.  Please let me know if you think otherwise, i.e. if you
really need to access your data in one large PCAP file.


Thanks.


--Robert


________________________________
From: Ted Dunning <ted.dunning@gmail.com>
Sent: Monday, September 11, 2017 8:15 PM
To: user; jni@apache.org
Subject: Re: ***UNCHECKED*** Re: Query Error on PCAP over MapR FS

This stack trace makes it clear that this is a bug in the PCAP decoder,
caused by a misunderstanding of how to force large files to be read in one
batch on a single Drillbit.

Are there some real Drill experts out there who can provide hints about how
to avoid this?



On Tue, Sep 12, 2017 at 5:03 AM, Takeo Ogawara <ta-ogawara@kddi-research.jp>
wrote:

> Sorry,
>
> I'll paste the plain text.
>
> > 2017-09-11 15:06:52,390 [BitServer-2] WARN  o.a.d.exec.rpc.control.WorkEventBus
> - A fragment message arrived but there was no registered listener for that
> message: profile {
> >   state: FAILED
> >   error {
> >     error_id: "bbf284b6-9da4-4869-ac20-fa100eed11b9"
> >     endpoint {
> >       address: "node22"
> >       user_port: 31010
> >       control_port: 31011
> >       data_port: 31012
> >       version: "1.11.0"
> >     }
> >     error_type: SYSTEM
> >     message: "SYSTEM ERROR: IllegalStateException: Bad magic number = 0a0d0d0a\n\nFragment 1:200\n\n[Error Id: bbf284b6-9da4-4869-ac20-fa100eed11b9 on node22:31010]"
> >     exception {
> >       exception_class: "java.lang.IllegalStateException"
> >       message: "Bad magic number = 0a0d0d0a"
> >       stack_trace {
> >         class_name: "com.google.common.base.Preconditions"
> >         file_name: "Preconditions.java"
> >         line_number: 173
> >         method_name: "checkState"
> >         is_native_method: false
> >       }
> >       stack_trace {
> >         class_name: "org.apache.drill.exec.store.pcap.decoder.PacketDecoder"
> >         file_name: "PacketDecoder.java"
> >         line_number: 84
> >         method_name: "<init>"
> >         is_native_method: false
> >       }
> >       stack_trace {
> >         class_name: "org.apache.drill.exec.store.pcap.PcapRecordReader"
> >         file_name: "PcapRecordReader.java"
> >         line_number: 104
> >         method_name: "setup"
> >         is_native_method: false
> >       }
> >       stack_trace {
> >         class_name: "org.apache.drill.exec.physical.impl.ScanBatch"
> >         file_name: "ScanBatch.java"
> >         line_number: 104
> >         method_name: "<init>"
> >         is_native_method: false
> >       }
> >       stack_trace {
> >         class_name: "org.apache.drill.exec.store.dfs.easy.EasyFormatPlugin"
> >         file_name: "EasyFormatPlugin.java"
> >         line_number: 166
> >         method_name: "getReaderBatch"
> >         is_native_method: false
> >       }
> >       stack_trace {
> >         class_name: "org.apache.drill.exec.store.dfs.easy.EasyReaderBatchCreator"
> >         file_name: "EasyReaderBatchCreator.java"
> >         line_number: 35
> >         method_name: "getBatch"
> >         is_native_method: false
> >       }
> >       stack_trace {
> >         class_name: "org.apache.drill.exec.store.dfs.easy.EasyReaderBatchCreator"
> >         file_name: "EasyReaderBatchCreator.java"
> >         line_number: 28
> >         method_name: "getBatch"
> >         is_native_method: false
> >       }
> >       stack_trace {
> >         class_name: "org.apache.drill.exec.physical.impl.ImplCreator"
> >         file_name: "ImplCreator.java"
> >         line_number: 156
> >         method_name: "getRecordBatch"
> >         is_native_method: false
> >       }
> >       stack_trace {
> >         class_name: "org.apache.drill.exec.physical.impl.ImplCreator"
> >         file_name: "ImplCreator.java"
> >         line_number: 179
> >         method_name: "getChildren"
> >         is_native_method: false
> >       }
> >       stack_trace {
> >         class_name: "org.apache.drill.exec.physical.impl.ImplCreator"
> >         file_name: "ImplCreator.java"
> >         line_number: 136
> >         method_name: "getRecordBatch"
> >         is_native_method: false
> >       }
> >       stack_trace {
> >         class_name: "org.apache.drill.exec.physical.impl.ImplCreator"
> >         file_name: "ImplCreator.java"
> >         line_number: 179
> >         method_name: "getChildren"
> >         is_native_method: false
> >       }
> >       stack_trace {
> >         class_name: "org.apache.drill.exec.physical.impl.ImplCreator"
> >         file_name: "ImplCreator.java"
> >         line_number: 136
> >         method_name: "getRecordBatch"
> >         is_native_method: false
> >       }
> >       stack_trace {
> >         class_name: "org.apache.drill.exec.physical.impl.ImplCreator"
> >         file_name: "ImplCreator.java"
> >         line_number: 179
> >         method_name: "getChildren"
> >         is_native_method: false
> >       }
> >       stack_trace {
> >         class_name: "org.apache.drill.exec.physical.impl.ImplCreator"
> >         file_name: "ImplCreator.java"
> >         line_number: 109
> >         method_name: "getRootExec"
> >         is_native_method: false
> >       }
> >       stack_trace {
> >         class_name: "org.apache.drill.exec.physical.impl.ImplCreator"
> >         file_name: "ImplCreator.java"
> >         line_number: 87
> >         method_name: "getExec"
> >         is_native_method: false
> >       }
> >       stack_trace {
> >         class_name: "org.apache.drill.exec.work.fragment.FragmentExecutor"
> >         file_name: "FragmentExecutor.java"
> >         line_number: 207
> >         method_name: "run"
> >         is_native_method: false
> >       }
> >       stack_trace {
> >         class_name: "org.apache.drill.common.SelfCleaningRunnable"
> >         file_name: "SelfCleaningRunnable.java"
> >         line_number: 38
> >         method_name: "run"
> >         is_native_method: false
> >       }
> >       stack_trace {
> >         class_name: "..."
> >         line_number: 0
> >         method_name: "..."
> >         is_native_method: false
> >       }
> >     }
> >   }
> >   minor_fragment_id: 200
> >   operator_profile {
> >     input_profile {
> >       records: 0
> >       batches: 0
> >       schemas: 0
> >     }
> >     operator_id: 0
> >     operator_type: 37
> >     setup_nanos: 0
> >     process_nanos: 29498572
> >     peak_local_memory_allocated: 0
> >     wait_nanos: 0
> >   }
> >   start_time: 1505110011975
> >   end_time: 1505110012320
> >   memory_used: 0
> >   max_memory_used: 1000000
> >   endpoint {
> >     address: "node22"
> >     user_port: 31010
> >     control_port: 31011
> >     data_port: 31012
> >     version: "1.11.0"
> >   }
> > }
> > handle {
> >   query_id {
> >     part1: 2758973773160297386
> >     part2: -412723615757922113
> >   }
> >   major_fragment_id: 1
> >   minor_fragment_id: 200
> > }
> > .
>
>
> > [Error Id: c737dd8b-78e4-40c6-89b0-d53260770b11 on node21:31010]
> >         at org.apache.drill.exec.rpc.user.QueryResultHandler.
> resultArrived(QueryResultHandler.java:123) [drill-java-exec-1.11.0.jar:1.
> 11.0]
> >         at org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:368)
> [drill-java-exec-1.11.0.jar:1.11.0]
> >         at org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:90)
> [drill-java-exec-1.11.0.jar:1.11.0]
> >         at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:274)
> [drill-rpc-1.11.0.jar:1.11.0]
> >         at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:244)
> [drill-rpc-1.11.0.jar:1.11.0]
> >         at io.netty.handler.codec.MessageToMessageDecoder.channelRead(
> MessageToMessageDecoder.java:89) [netty-codec-4.0.27.Final.jar:
> 4.0.27.Final]
> >         at io.netty.channel.AbstractChannelHandlerContext.
> invokeChannelRead(AbstractChannelHandlerContext.java:339)
> [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >         at io.netty.channel.AbstractChannelHandlerContext.
> fireChannelRead(AbstractChannelHandlerContext.java:324)
> [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >         at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:254)
> [netty-handler-4.0.27.Final.jar:4.0.27.Final]
> >         at io.netty.channel.AbstractChannelHandlerContext.
> invokeChannelRead(AbstractChannelHandlerContext.java:339)
> [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >         at io.netty.channel.AbstractChannelHandlerContext.
> fireChannelRead(AbstractChannelHandlerContext.java:324)
> [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >         at io.netty.handler.codec.MessageToMessageDecoder.channelRead(
> MessageToMessageDecoder.java:103) [netty-codec-4.0.27.Final.jar:
> 4.0.27.Final]
> >         at io.netty.channel.AbstractChannelHandlerContext.
> invokeChannelRead(AbstractChannelHandlerContext.java:339)
> [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >         at io.netty.channel.AbstractChannelHandlerContext.
> fireChannelRead(AbstractChannelHandlerContext.java:324)
> [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >         at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)
> [netty-codec-4.0.27.Final.jar:4.0.27.Final]
> >         at io.netty.channel.AbstractChannelHandlerContext.
> invokeChannelRead(AbstractChannelHandlerContext.java:339)
> [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >         at io.netty.channel.AbstractChannelHandlerContext.
> fireChannelRead(AbstractChannelHandlerContext.java:324)
> [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >         at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(
> ChannelInboundHandlerAdapter.java:86) [netty-transport-4.0.27.Final.
> jar:4.0.27.Final]
> >         at io.netty.channel.AbstractChannelHandlerContext.
> invokeChannelRead(AbstractChannelHandlerContext.java:339)
> [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >         at io.netty.channel.AbstractChannelHandlerContext.
> fireChannelRead(AbstractChannelHandlerContext.java:324)
> [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >         at io.netty.channel.DefaultChannelPipeline.fireChannelRead(
> DefaultChannelPipeline.java:847) [netty-transport-4.0.27.Final.
> jar:4.0.27.Final]
> >         at io.netty.channel.nio.AbstractNioByteChannel$
> NioByteUnsafe.read(AbstractNioByteChannel.java:131)
> [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >         at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
> [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >         at io.netty.channel.nio.NioEventLoop.
> processSelectedKeysOptimized(NioEventLoop.java:468)
> [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >         at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
> [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
> [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >         at io.netty.util.concurrent.SingleThreadEventExecutor$2.
> run(SingleThreadEventExecutor.java:111) [netty-common-4.0.27.Final.
> jar:4.0.27.Final]
> >         at java.lang.Thread.run(Thread.java:748) [na:1.7.0_141]
> > 2017-09-11 15:32:36,406 [Client-1] INFO  o.a.d.j.i.DrillCursor$ResultsListener
> - [#5] Query failed:
> > org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: IllegalStateException: Bad magic number = 0a0d0d0a
> >
> >
> >
>
>
>
> > On 2017/09/12 11:53, Takeo Ogawara <ta-ogawara@kddi-research.jp> wrote:
> >
> > Thank you for replies.
> >
> >> Instead of "location": "/mapr/cluster3",   use "location": "/",
> > I’ll use this config.
> >
> >
> >> Can you provide the stack trace from the Drillbit that hit the problem?
> >
> > You can find logs in attached files.
> >
> >> Is it absolutely required to query large files like this? Would it be
> >> acceptable to split the file first by making a quick scan over it?
> > No, loading one large file isn’t strictly required.
> > In fact, this large PCAP file was created by concatenating small PCAP files with the mergecap command.
> > So there is no problem with feeding the small PCAP files into Drill.
> >
> > How can I analyze a number of PCAP files together?
> > Can I concatenate the parsed packet records of small PCAP files inside a Drill query?
> > Or should I export the parsed records into a database and then merge them?
> >
> >
> >> On 2017/09/12 5:07, Ted Dunning <ted.dunning@gmail.com> wrote:
> >>
> >> On Mon, Sep 11, 2017 at 11:23 AM, Takeo Ogawara <ta-ogawara@kddi-research.jp> wrote:
> >>
> >>> ...
> >>>
> >>> 1. Query error when cluster-name is not specified
> >>> ...
> >>>
> >>> With this setting, the following query failed.
> >>>> select * from mfs.`x.pcap` ;
> >>>> Error: DATA_READ ERROR: /x.pcap (No such file or directory)
> >>>>
> >>>> File name: /x.pcap
> >>>> Fragment 0:0
> >>>>
> >>>> [Error Id: 70b73062-c3ed-4a10-9a88-034b4e6d039a on node21:31010]
> >>> (state=,code=0)
> >>>
> >>> But these queries passed.
> >>>> select * from mfs.root.`x.pcap` ;
> >>>> select * from mfs.`x.csv`;
> >>>> select * from mfs.root.`x.csv`;
> >>>
> >>
> >> As Andries mentioned, the problem here has to do with understanding what
> >> Drill is thinking about how paths are manipulated. Nothing to do with the
> >> PCAP capabilities.
> >>
> >> Usually, what I do is put entries into the configuration which directly
> >> point to the directory above my data, but I can't add anything to Andries'
> >> comment.
> >>
> >>
> >>> 2. Large PCAP file
> >>> Query on very large PCAP file (larger than 100GB) failed with following
> >>> error message.
> >>>> Error: SYSTEM ERROR: IllegalStateException: Bad magic number = 0a0d0d0a
> >>>>
> >>>> Fragment 1:169
> >>>>
> >>>> [Error Id: 8882c359-c253-40c0-866c-417ef1ce5aa3 on node22:31010]
> >>> (state=,code=0)
> >>>
> >>> This happens even on a Linux FS, not just MapR FS.
> >>>
> >>
> >> Can you provide the stack trace from the Drillbit that hit the problem?
> >>
> >> I suspect that this has to do with splitting of the PCAP file. Normally, it
> >> is assumed that parallelism will be achieved by having lots of smaller
> >> files since it is difficult to jump into the middle of a PCAP file and get
> >> good results.
> >>
> >> Even if we disable splitting to avoid this error, you will have the
> >> complementary problem of slow queries due to single-threading. That
> >> doesn't seem very satisfactory either.
> >>
> >> A similar problem is that splitting a PCAP file pretty much requires a
> >> single-threaded read of the file in question. The read doesn't need to
> >> process very much data, but it does need to touch the whole file.
> >>
> >> Is it absolutely required to query large files like this? Would it be
> >> acceptable to split the file first by making a quick scan over it?
> >
> >
> > <sa1153582.zip>
>
> ———————————————————————
> Takeo Ogawara
> KDDI Research, Inc.
> Connected Car 1G
>
> TEL: 049-278-7495 / 070-3623-9914
>
>