This stack trace makes it clear that this is a bug in the PCAP decoder,
caused by a misunderstanding of how to force large files to be read in one
batch on a single Drillbit.
Are there some real Drill experts out there who can provide hints about how
to avoid this?
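
In the meantime, the thread below suggests a workaround: the small source
captures read fine, so instead of merging them with mergecap and then having
Drill split the result again, you can point a query at a directory holding
the originals. Drill treats a directory of like-formatted files as one table
and can parallelize across the files. A minimal sketch, assuming a
hypothetical pcaps directory under the workspace root:

  -- every .pcap file under pcaps/ is scanned as part of one table,
  -- one reader per file, so no file is ever entered mid-stream
  select * from mfs.root.`pcaps`;

That keeps the scan multi-threaded without ever starting a reader in the
middle of a capture file.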
On Tue, Sep 12, 2017 at 5:03 AM, Takeo Ogawara <ta-ogawara@kddi-research.jp> wrote:
> Sorry,
>
> I've pasted the plain text below.
>
> > 2017-09-11 15:06:52,390 [BitServer-2] WARN o.a.d.exec.rpc.control.WorkEventBus - A fragment message arrived but there was no registered listener for that message: profile {
> > state: FAILED
> > error {
> > error_id: "bbf284b6-9da4-4869-ac20-fa100eed11b9"
> > endpoint {
> > address: "node22"
> > user_port: 31010
> > control_port: 31011
> > data_port: 31012
> > version: "1.11.0"
> > }
> > error_type: SYSTEM
> > message: "SYSTEM ERROR: IllegalStateException: Bad magic number =
> 0a0d0d0a\n\nFragment 1:200\n\n[Error Id: bbf284b6-9da4-4869-ac20-fa100eed11b9
> on node22:31010]"
> > exception {
> > exception_class: "java.lang.IllegalStateException"
> > message: "Bad magic number = 0a0d0d0a"
> > stack_trace {
> > class_name: "com.google.common.base.Preconditions"
> > file_name: "Preconditions.java"
> > line_number: 173
> > method_name: "checkState"
> > is_native_method: false
> > }
> > stack_trace {
> > class_name: "org.apache.drill.exec.store.
> pcap.decoder.PacketDecoder"
> > file_name: "PacketDecoder.java"
> > line_number: 84
> > method_name: "<init>"
> > is_native_method: false
> > }
> > stack_trace {
> > class_name: "org.apache.drill.exec.store.pcap.PcapRecordReader"
> > file_name: "PcapRecordReader.java"
> > line_number: 104
> > method_name: "setup"
> > is_native_method: false
> > }
> > stack_trace {
> > class_name: "org.apache.drill.exec.physical.impl.ScanBatch"
> > file_name: "ScanBatch.java"
> > line_number: 104
> > method_name: "<init>"
> > is_native_method: false
> > }
> > stack_trace {
> > class_name: "org.apache.drill.exec.store.
> dfs.easy.EasyFormatPlugin"
> > file_name: "EasyFormatPlugin.java"
> > line_number: 166
> > method_name: "getReaderBatch"
> > is_native_method: false
> > }
> > stack_trace {
> > class_name: "org.apache.drill.exec.store.dfs.easy.
> EasyReaderBatchCreator"
> > file_name: "EasyReaderBatchCreator.java"
> > line_number: 35
> > method_name: "getBatch"
> > is_native_method: false
> > }
> > stack_trace {
> > class_name: "org.apache.drill.exec.store.dfs.easy.
> EasyReaderBatchCreator"
> > file_name: "EasyReaderBatchCreator.java"
> > line_number: 28
> > method_name: "getBatch"
> > is_native_method: false
> > }
> > stack_trace {
> > class_name: "org.apache.drill.exec.physical.impl.ImplCreator"
> > file_name: "ImplCreator.java"
> > line_number: 156
> > method_name: "getRecordBatch"
> > is_native_method: false
> > }
> > stack_trace {
> > class_name: "org.apache.drill.exec.physical.impl.ImplCreator"
> > file_name: "ImplCreator.java"
> > line_number: 179
> > method_name: "getChildren"
> > is_native_method: false
> > }
> > stack_trace {
> > class_name: "org.apache.drill.exec.physical.impl.ImplCreator"
> > file_name: "ImplCreator.java"
> > line_number: 136
> > method_name: "getRecordBatch"
> > is_native_method: false
> > }
> > stack_trace {
> > class_name: "org.apache.drill.exec.physical.impl.ImplCreator"
> > file_name: "ImplCreator.java"
> > line_number: 179
> > method_name: "getChildren"
> > is_native_method: false
> > }
> > stack_trace {
> > class_name: "org.apache.drill.exec.physical.impl.ImplCreator"
> > file_name: "ImplCreator.java"
> > line_number: 136
> > method_name: "getRecordBatch"
> > is_native_method: false
> > }
> > stack_trace {
> > class_name: "org.apache.drill.exec.physical.impl.ImplCreator"
> > file_name: "ImplCreator.java"
> > line_number: 179
> > method_name: "getChildren"
> > is_native_method: false
> > }
> > stack_trace {
> > class_name: "org.apache.drill.exec.physical.impl.ImplCreator"
> > file_name: "ImplCreator.java"
> > line_number: 109
> > method_name: "getRootExec"
> > is_native_method: false
> > }
> > stack_trace {
> > class_name: "org.apache.drill.exec.physical.impl.ImplCreator"
> > file_name: "ImplCreator.java"
> > line_number: 87
> > method_name: "getExec"
> > is_native_method: false
> > }
> > stack_trace {
> > class_name: "org.apache.drill.exec.work.
> fragment.FragmentExecutor"
> > file_name: "FragmentExecutor.java"
> > line_number: 207
> > method_name: "run"
> > is_native_method: false
> > }
> > stack_trace {
> > class_name: "org.apache.drill.common.SelfCleaningRunnable"
> > file_name: "SelfCleaningRunnable.java"
> > line_number: 38
> > method_name: "run"
> > is_native_method: false
> > }
> > stack_trace {
> > class_name: "..."
> > line_number: 0
> > method_name: "..."
> > is_native_method: false
> > }
> > }
> > }
> > minor_fragment_id: 200
> > operator_profile {
> > input_profile {
> > records: 0
> > batches: 0
> > schemas: 0
> > }
> > operator_id: 0
> > operator_type: 37
> > setup_nanos: 0
> > process_nanos: 29498572
> > peak_local_memory_allocated: 0
> > wait_nanos: 0
> > }
> > start_time: 1505110011975
> > end_time: 1505110012320
> > memory_used: 0
> > max_memory_used: 1000000
> > endpoint {
> > address: "node22"
> > user_port: 31010
> > control_port: 31011
> > data_port: 31012
> > version: "1.11.0"
> > }
> > }
> > handle {
> > query_id {
> > part1: 2758973773160297386
> > part2: -412723615757922113
> > }
> > major_fragment_id: 1
> > minor_fragment_id: 200
> > }
> > .
>
>
> > [Error Id: c737dd8b-78e4-40c6-89b0-d53260770b11 on node21:31010]
> > at org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:123) [drill-java-exec-1.11.0.jar:1.11.0]
> > at org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:368) [drill-java-exec-1.11.0.jar:1.11.0]
> > at org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:90) [drill-java-exec-1.11.0.jar:1.11.0]
> > at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:274) [drill-rpc-1.11.0.jar:1.11.0]
> > at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:244) [drill-rpc-1.11.0.jar:1.11.0]
> > at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89) [netty-codec-4.0.27.Final.jar:4.0.27.Final]
> > at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> > at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> > at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:254) [netty-handler-4.0.27.Final.jar:4.0.27.Final]
> > at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> > at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> > at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103) [netty-codec-4.0.27.Final.jar:4.0.27.Final]
> > at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> > at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> > at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242) [netty-codec-4.0.27.Final.jar:4.0.27.Final]
> > at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> > at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> > at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> > at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> > at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> > at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> > at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> > at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> > at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> > at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> > at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> > at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) [netty-common-4.0.27.Final.jar:4.0.27.Final]
> > at java.lang.Thread.run(Thread.java:748) [na:1.7.0_141]
> > 2017-09-11 15:32:36,406 [Client-1] INFO o.a.d.j.i.DrillCursor$ResultsListener - [#5] Query failed:
> > org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: IllegalStateException: Bad magic number = 0a0d0d0a
> >
> >
> >
>
>
>
> > On 2017/09/12 at 11:53, Takeo Ogawara <ta-ogawara@kddi-research.jp> wrote:
> >
> > Thank you for the replies.
> >
> >> Instead of "location": "/mapr/cluster3", use "location": "/",
> > I’ll use this config.
> >
> >
> >> Can you provide the stack trace from the Drillbit that hit the problem?
> >
> > You can find the logs in the attached files.
> >
> >> Is it absolutely required to query large files like this? Would it be
> >> acceptable to split the file first by making a quick scan over it?
> > No, loading one large file isn't strictly required.
> > In fact, this large PCAP file was created by concatenating small PCAP
> > files with the mergecap command.
> > So there is no problem with feeding the small PCAP files into Drill.
> >
> > How can I analyze a number of PCAP files together?
> > Can I combine the parsed packet records of the small PCAP files inside a
> > Drill query? Or should I export the parsed records into a database and
> > then merge them?
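> > For example, something like this (just a sketch of what I mean; a.pcap
> > and b.pcap are hypothetical file names):
> >
> >   select * from mfs.root.`a.pcap`
> >   union all
> >   select * from mfs.root.`b.pcap`;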
> >
> >
> >> On 2017/09/12 at 5:07, Ted Dunning <ted.dunning@gmail.com> wrote:
> >>
> >> On Mon, Sep 11, 2017 at 11:23 AM, Takeo Ogawara <ta-ogawara@kddi-research.jp> wrote:
> >>
> >>> ...
> >>>
> >>> 1. Query error when cluster-name is not specified
> >>> ...
> >>>
> >>> With this setting, the following query failed.
> >>>> select * from mfs.`x.pcap` ;
> >>>> Error: DATA_READ ERROR: /x.pcap (No such file or directory)
> >>>>
> >>>> File name: /x.pcap
> >>>> Fragment 0:0
> >>>>
> >>>> [Error Id: 70b73062-c3ed-4a10-9a88-034b4e6d039a on node21:31010]
> >>> (state=,code=0)
> >>>
> >>> But these queries passed.
> >>>> select * from mfs.root.`x.pcap` ;
> >>>> select * from mfs.`x.csv`;
> >>>> select * from mfs.root.`x.csv`;
> >>>
> >>
> >> As Andries mentioned, the problem here has to do with understanding how
> >> Drill thinks about path manipulation. It has nothing to do with the
> >> PCAP capabilities.
> >>
> >> Usually, what I do is put entries into the configuration which point
> >> directly to the directory above my data, but I can't add anything to
> >> Andries' comment.
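> >>
> >> To make the path handling concrete (my own sketch, with hypothetical
> >> values): Drill resolves the path in a query relative to the workspace's
> >> location, so
> >>
> >>   select * from mfs.root.`x.pcap`;  -- reads <location>/x.pcap
> >>
> >> With "location": "/" that means /x.pcap, while with "location":
> >> "/mapr/cluster3" the same query would look for /mapr/cluster3/x.pcap.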
> >>
> >>
> >>> 2. Large PCAP file
> >>> A query on a very large PCAP file (larger than 100GB) failed with the
> >>> following error message.
> >>>> Error: SYSTEM ERROR: IllegalStateException: Bad magic number = 0a0d0d0a
> >>>>
> >>>> Fragment 1:169
> >>>>
> >>>> [Error Id: 8882c359-c253-40c0-866c-417ef1ce5aa3 on node22:31010]
> >>> (state=,code=0)
> >>>
> >>> This happens even on a Linux FS, not just MapR FS.
> >>>
> >>
> >> Can you provide the stack trace from the Drillbit that hit the problem?
> >>
> >> I suspect that this has to do with splitting of the PCAP file. Normally,
> >> it is assumed that parallelism will be achieved by having lots of smaller
> >> files, since it is difficult to jump into the middle of a PCAP file and
> >> get good results.
> >>
> >> Even if we disable splitting to avoid this error, you will have the
> >> complementary problem of slow queries due to single-threading. That
> >> doesn't seem very satisfactory either.
> >>
> >> A similar problem is that splitting a PCAP file pretty much requires a
> >> single-threaded read of the file in question. The read doesn't need to
> >> process very much data, but it does need to touch the whole file.
> >>
> >> Is it absolutely required to query large files like this? Would it be
> >> acceptable to split the file first by making a quick scan over it?
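> >>
> >> (If a pre-split is acceptable: the same Wireshark toolbox that provides
> >> mergecap also includes editcap. As a sketch of the idea, something like
> >> "editcap -c 100000 big.pcap out.pcap" writes a series of smaller
> >> captures of 100000 packets each, which Drill can then scan in parallel;
> >> depending on the editcap version you may also want "-F pcap" so the
> >> output stays in classic pcap format rather than pcapng.)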
> >
> >
> > <sa1153582.zip>
>
> ———————————————————————
> <KDDI Research Vision>
> Challenge for the future
> ———————————————————————
> A summer only for heroes.
> https://www.au.com/pr/cm/3taro/
> ———————————————————————
> Takeo Ogawara
> KDDI Research, Inc.
> Connected Car Group 1
>
> TEL: 049-278-7495 / 070-3623-9914
>
>
|