drill-user mailing list archives

From Takeo Ogawara <ta-ogaw...@kddi-research.jp>
Subject Re: Query Error on PCAP over MapR FS
Date Thu, 14 Sep 2017 06:20:08 GMT
Yes, that’s right.

[drill@node21 ~]$ ps -ef | grep Drillbit
drill      955     1  0 Sep13 ?        00:02:26 /usr/lib/jvm/jre-1.7.0-openjdk.x86_64/bin/java
-Xms4G -Xmx4G -XX:MaxDirectMemorySize=8G -XX:ReservedCodeCacheSize=1G -Ddrill.exec.enable-epoll=false
-XX:MaxPermSize=512M -XX:+CMSClassUnloadingEnabled -XX:+UseG1GC -Dlog.path=/home/drill/apache-drill-1.11.0/log/drillbit.log
-Dlog.query.path=/home/drill/apache-drill-1.11.0/log/drillbit_queries.json -cp /home/drill/apache-drill-1.11.0/conf:/home/drill/apache-drill-1.11.0/jars/*:/home/drill/apache-drill-1.11.0/jars/ext/*:/home/drill/apache-drill-1.11.0/jars/3rdparty/*:/home/drill/apache-drill-1.11.0/jars/classb/*
org.apache.drill.exec.server.Drillbit
drill    23618  4234  0 15:19 pts/4    00:00:00 grep Drillbit

Thank you.

> On 2017/09/14 15:16, Robert Hou <rhou@mapr.com> wrote:
> 
> You wrote:
> 
> 
>   I meant I started Drill from Linux user “drill”.
> 
> 
> Do you mean that you logged in as user "drill" and started the drillbit?  Can you run:
> 
> 
>   ps -ef | grep Drillbit
> 
> 
> Thanks.
> 
> 
> --Robert
> 
> 
> ________________________________
> From: Takeo Ogawara <ta-ogawara@kddi-research.jp>
> Sent: Wednesday, September 13, 2017 10:57 PM
> To: user@drill.apache.org
> Subject: Re: Query Error on PCAP over MapR FS
> 
> I don’t specify the user name in sqlline command.
> I meant I started Drill from Linux user “drill”.
> [drill@node21 ~]$ ./apache-drill-1.11.0/bin/sqlline -u jdbc:drill:zk=node21:5181,node22:5181,node23:5181/drill/cluster3-drillbits
> apache drill 1.11.0
> "the only truly happy people are children, the creative minority and drill users"
> 0: jdbc:drill:zk=node21:5181,node22:5181,node> use dfs;
> +-------+----------------------------------+
> |  ok   |             summary              |
> +-------+----------------------------------+
> | true  | Default schema changed to [dfs]  |
> +-------+----------------------------------+
> 1 row selected (0.811 seconds)
> 0: jdbc:drill:zk=node21:5181,node22:5181,node> select * from `x.pcap`;
> Error: DATA_READ ERROR: /x.pcap (No such file or directory)
> 
> File name: /x.pcap
> Fragment 0:0
> 
> [Error Id: d6c1191a-ff79-4c39-96d3-0ae9e0be3ae9 on node25:31010] (state=,code=0)
> 0: jdbc:drill:zk=node21:5181,node22:5181,node> show files in  `x.pcap`;
> +---------+--------------+---------+---------+--------+--------+--------------+------------------------+-------------------------+
> |  name   | isDirectory  | isFile  | length  | owner  | group  | permissions  |       accessTime       |    modificationTime     |
> +---------+--------------+---------+---------+--------+--------+--------------+------------------------+-------------------------+
> | x.pcap  | false        | true    | 6083    | root   | root   | rw-r--r--    | 2017-09-13 16:14:52.0  | 2017-09-13 16:14:52.24  |
> +---------+--------------+---------+---------+--------+--------+--------------+------------------------+-------------------------+
> 1 row selected (0.241 seconds)
> 
> The drillbit config is as follows.
> drill.exec: {
>  cluster-id: "cluster3-drillbits",
>  zk.connect: "node21:5181,node22:5181,node23:5181"
> }
> 
> The storage plugin has a config entry for PCAP.
>    "pcap": {
>      "type": "pcap"
>    },
> 
> Is it better to access MapR FS via NFS?
> I can access file:///mapr/cluster3/x.pcap in Drill sqlline.
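Since file:///mapr/cluster3/x.pcap is readable, one workaround sketch is a file-based storage plugin pointed at the NFS loopback mount, so the local-filesystem reads succeed. This assumes /mapr/cluster3 is mounted on every drillbit node; the workspace name "pcapnfs" is made up for illustration:

```json
{
  "type": "file",
  "enabled": true,
  "connection": "file:///",
  "workspaces": {
    "pcapnfs": {
      "location": "/mapr/cluster3",
      "writable": false,
      "defaultInputFormat": null
    }
  },
  "formats": {
    "pcap": { "type": "pcap" }
  }
}
```

A query would then look like `select * from <plugin>.pcapnfs.`x.pcap``. Note this routes reads through NFS rather than the maprfs client, so it trades performance for compatibility.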
> 
> Thank you.
> 
>> On 2017/09/14 14:27, Robert Hou <rhou@mapr.com> wrote:
>> 
>> You wrote:
>> 
>>  I'm running drill as user "drill".
>> 
>> 
>> How are you invoking sqlline?  Are you specifying a user "drill"?
>> 
>> 
>> You should be able to query the file with two steps:
>> 
>> 
>> 1) use mfs;
>> 
>> 
>> This invokes the plugin.
>> 
>> 
>> 2) select * from `x.pcap`;
>> 
>> 
>> Since x.pcap is in the root directory, you don't need to reference mfs again.
>> 
>> 
>> 
>> Thanks.
>> 
>> --Robert
>> 
>> ________________________________
>> From: Takeo Ogawara <ta-ogawara@kddi-research.jp>
>> Sent: Wednesday, September 13, 2017 9:17 PM
>> To: user
>> Subject: Re: Query Error on PCAP over MapR FS
>> 
>> I used a storage plugin named "mfs" with "maprfs:///".
>> I modified the plugin name from "mfs" to "dfs" and tested a query, but the result was the same (No such file).
>> 
>> "hadoop fs -ls /" can find x.pcap:
>> [drill@node21 log]$ hadoop fs -ls / | grep x.pcap
>> -rw-r--r--   3 root root         6083 2017-09-13 16:14 /x.pcap
>> 
>> Show files in drill
>> 0: jdbc:drill:drillbit=localhost> show files in dfs.`x.pcap`;
>> +---------+--------------+---------+---------+--------+--------+--------------+------------------------+-------------------------+
>> |  name   | isDirectory  | isFile  | length  | owner  | group  | permissions  |       accessTime       |    modificationTime     |
>> +---------+--------------+---------+---------+--------+--------+--------------+------------------------+-------------------------+
>> | x.pcap  | false        | true    | 6083    | root   | root   | rw-r--r--    | 2017-09-13 16:14:52.0  | 2017-09-13 16:14:52.24  |
>> +---------+--------------+---------+---------+--------+--------+--------------+------------------------+-------------------------+
>> 1 row selected (0.328 seconds)
>> 
>>  I'm running drill as user "drill".
>> Is there something wrong with file permissions?
>> 
>> 
>>> For your testing you can just use the default root volume, but with MapR-FS it is a good idea to create volumes for different data/use cases and then mount these volumes on MapR-FS.
>>> This allows for benefits like topology, quota & security management; also ease of use for enterprise features like mirroring, snapshots, etc. in the future, to name a few.
>>> https://maprdocs.mapr.com/home/AdministratorGuide/c_managing_data_with_volumes.html
>> 
>> 
>> 
>> 
>> Thank you for the information.
>> I'll separate the volume for PCAP from other services.
>> 
>> Thank you.
>> 
>>> On 2017/09/13 23:48, Andries Engelbrecht <aengelbrecht@mapr.com> wrote:
>>> 
>>> Drill is not seeing the file in the location you pointed it.
>>> 
>>> What did you name the storage plugin?
>>> The default is normally dfs for the distributed filesystem.
>>> 
>>> Also did you place the file in the root directory of the dfs?
>>> What do you get back if you run hadoop fs -ls /?
>>> 
>>> For your testing you can just use the default root volume, but with MapR-FS it is a good idea to create volumes for different data/use cases and then mount these volumes on MapR-FS.
>>> This allows for benefits like topology, quota & security management; also ease of use for enterprise features like mirroring, snapshots, etc. in the future, to name a few.
>>> https://maprdocs.mapr.com/home/AdministratorGuide/c_managing_data_with_volumes.html
>> 
>> 
>> 
>>> 
>>> 
>>> --Andries
>>> 
>>> 
>>> On 9/13/17, 12:38 AM, "Takeo Ogawara" <ta-ogawara@kddi-research.jp> wrote:
>>> 
>>>  Hi,
>>> 
>>>  I modified storage config like this.
>>> 
>>>  "type": "file",
>>>   "enabled": true,
>>>   "connection": "maprfs:///",
>>>   "config": null,
>>>   "workspaces": {
>>>     "root": {
>>>       "location": "/",
>>>       "writable": false,
>>>       "defaultInputFormat": null
>>>     }
>>>   }
>>> 
>>>  But a query like "select * from mfs.`x.pcap`" failed.
>>>  Is there any other configuration I should modify?
>>> 
>>>  This is drillbit.log; it seems java.io.FileInputStream is trying to open the MapR FS file path as a local file.
>>> 
>>>  Thank you.
>>> 
>>>  2017-09-13 16:20:06,123 [264723d8-bcba-6330-c9be-1c9c95dd2aa6:foreman] INFO  o.a.d.c.s.persistence.ScanResult - loading 9 classes for org.apache.drill.exec.store.dfs.FormatPlugin took 0ms
>>>  2017-09-13 16:20:06,124 [264723d8-bcba-6330-c9be-1c9c95dd2aa6:foreman] INFO  o.a.d.c.s.persistence.ScanResult - loading 10 classes for org.apache.drill.common.logical.FormatPluginConfig took 0ms
>>>  2017-09-13 16:20:06,124 [264723d8-bcba-6330-c9be-1c9c95dd2aa6:foreman] INFO  o.a.d.c.s.persistence.ScanResult - loading 10 classes for org.apache.drill.common.logical.FormatPluginConfig took 0ms
>>>  2017-09-13 16:20:06,125 [264723d8-bcba-6330-c9be-1c9c95dd2aa6:foreman] INFO  o.a.d.c.s.persistence.ScanResult - loading 10 classes for org.apache.drill.common.logical.FormatPluginConfig took 0ms
>>>  2017-09-13 16:20:06,145 [264723d8-bcba-6330-c9be-1c9c95dd2aa6:foreman] INFO  o.a.d.c.s.persistence.ScanResult - loading 9 classes for org.apache.drill.exec.store.dfs.FormatPlugin took 0ms
>>>  2017-09-13 16:20:06,145 [264723d8-bcba-6330-c9be-1c9c95dd2aa6:foreman] INFO  o.a.d.c.s.persistence.ScanResult - loading 10 classes for org.apache.drill.common.logical.FormatPluginConfig took 0ms
>>>  2017-09-13 16:20:06,146 [264723d8-bcba-6330-c9be-1c9c95dd2aa6:foreman] INFO  o.a.d.c.s.persistence.ScanResult - loading 10 classes for org.apache.drill.common.logical.FormatPluginConfig took 0ms
>>>  2017-09-13 16:20:06,170 [264723d8-bcba-6330-c9be-1c9c95dd2aa6:foreman] INFO  o.a.d.exec.store.dfs.FileSelection - FileSelection.getStatuses() took 0 ms, numFiles: 1
>>>  2017-09-13 16:20:06,170 [264723d8-bcba-6330-c9be-1c9c95dd2aa6:foreman] INFO  o.a.d.exec.store.dfs.FileSelection - FileSelection.getStatuses() took 0 ms, numFiles: 1
>>>  2017-09-13 16:20:06,178 [264723d8-bcba-6330-c9be-1c9c95dd2aa6:foreman] INFO  o.a.d.exec.store.dfs.FileSelection - FileSelection.getStatuses() took 0 ms, numFiles: 1
>>>  2017-09-13 16:20:06,179 [264723d8-bcba-6330-c9be-1c9c95dd2aa6:foreman] INFO  o.a.d.e.s.schedule.BlockMapBuilder - Get block maps: Executed 1 out of 1 using 1 threads. Time: 0ms total, 0.847323ms avg, 0ms max.
>>>  2017-09-13 16:20:06,179 [264723d8-bcba-6330-c9be-1c9c95dd2aa6:foreman] INFO  o.a.d.e.s.schedule.BlockMapBuilder - Get block maps: Executed 1 out of 1 using 1 threads. Earliest start: 1.522000 µs, Latest start: 1.522000 µs, Average start: 1.522000 µs.
>>>  2017-09-13 16:20:06,199 [264723d8-bcba-6330-c9be-1c9c95dd2aa6:frag:0:0] INFO  o.a.d.e.store.pcap.PcapRecordReader - User Error Occurred: /x.pcap (No such file or directory) (/x.pcap (No such file or directory))
>>>  org.apache.drill.common.exceptions.UserException: DATA_READ ERROR: /x.pcap (No such file or directory)
>>> 
>>>  File name: /x.pcap
>>> 
>>>  [Error Id: 48be766a-8706-407f-8dff-eb563271a4a3 ]
>>>      at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:550) ~[drill-common-1.11.0.jar:1.11.0]
>>>      at org.apache.drill.exec.store.pcap.PcapRecordReader.setup(PcapRecordReader.java:109) [drill-java-exec-1.11.0.jar:1.11.0]
>>>      at org.apache.drill.exec.physical.impl.ScanBatch.<init>(ScanBatch.java:104) [drill-java-exec-1.11.0.jar:1.11.0]
>>>      at org.apache.drill.exec.store.dfs.easy.EasyFormatPlugin.getReaderBatch(EasyFormatPlugin.java:166) [drill-java-exec-1.11.0.jar:1.11.0]
>>>      at org.apache.drill.exec.store.dfs.easy.EasyReaderBatchCreator.getBatch(EasyReaderBatchCreator.java:35) [drill-java-exec-1.11.0.jar:1.11.0]
>>>      at org.apache.drill.exec.store.dfs.easy.EasyReaderBatchCreator.getBatch(EasyReaderBatchCreator.java:28) [drill-java-exec-1.11.0.jar:1.11.0]
>>>      at org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:156) [drill-java-exec-1.11.0.jar:1.11.0]
>>>      at org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:179) [drill-java-exec-1.11.0.jar:1.11.0]
>>>      at org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:136) [drill-java-exec-1.11.0.jar:1.11.0]
>>>      at org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:179) [drill-java-exec-1.11.0.jar:1.11.0]
>>>      at org.apache.drill.exec.physical.impl.ImplCreator.getRootExec(ImplCreator.java:109) [drill-java-exec-1.11.0.jar:1.11.0]
>>>      at org.apache.drill.exec.physical.impl.ImplCreator.getExec(ImplCreator.java:87) [drill-java-exec-1.11.0.jar:1.11.0]
>>>      at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:207) [drill-java-exec-1.11.0.jar:1.11.0]
>>>      at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.11.0.jar:1.11.0]
>>>      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_141]
>>>      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_141]
>>>      at java.lang.Thread.run(Thread.java:748) [na:1.7.0_141]
>>>  Caused by: java.io.FileNotFoundException: /x.pcap (No such file or directory)
>>>      at java.io.FileInputStream.open(Native Method) ~[na:1.7.0_141]
>>>      at java.io.FileInputStream.<init>(FileInputStream.java:146) ~[na:1.7.0_141]
>>>      at java.io.FileInputStream.<init>(FileInputStream.java:101) ~[na:1.7.0_141]
>>>      at org.apache.drill.exec.store.pcap.PcapRecordReader.setup(PcapRecordReader.java:103) [drill-java-exec-1.11.0.jar:1.11.0]
>>>      ... 15 common frames omitted
>>>  2017-09-13 16:20:06,199 [264723d8-bcba-6330-c9be-1c9c95dd2aa6:frag:0:0] INFO  o.a.d.e.w.fragment.FragmentExecutor - 264723d8-bcba-6330-c9be-1c9c95dd2aa6:0:0: State change requested AWAITING_ALLOCATION --> FAILED
>>>  2017-09-13 16:20:06,200 [264723d8-bcba-6330-c9be-1c9c95dd2aa6:frag:0:0] INFO  o.a.d.e.w.fragment.FragmentExecutor - 264723d8-bcba-6330-c9be-1c9c95dd2aa6:0:0: State change requested FAILED --> FINISHED
>>>  2017-09-13 16:20:06,213 [BitServer-4] WARN  o.a.drill.exec.work.foreman.Foreman - Dropping request to move to COMPLETED state as query is already at FAILED state (which is terminal).
>>>  2017-09-13 16:20:06,214 [BitServer-4] WARN  o.a.d.e.w.b.ControlMessageHandler - Dropping request to cancel fragment. 264723d8-bcba-6330-c9be-1c9c95dd2aa6:0:0 does not exist.
>>> 
>>> 
>>>   97   @Override
>>>   98   public void setup(final OperatorContext context, final OutputMutator output) throws ExecutionSetupException {
>>>   99     try {
>>>  100
>>>  101       this.output = output;
>>>  102       this.buffer = new byte[100000];
>>>  103       this.in = new FileInputStream(inputPath);
>>>  104       this.decoder = new PacketDecoder(in);
>>>  105       this.validBytes = in.read(buffer);
>>>  106       this.projectedCols = getProjectedColsIfItNull();
>>>  107       setColumns(projectedColumns);
>>>  108     } catch (IOException io) {
>>>  109       throw UserException.dataReadError(io)
>>>  110           .addContext("File name:", inputPath)
>>>  111           .build(logger);
>>>  112     }
>>>  113   }
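The excerpt above shows the root cause: line 103 opens inputPath with java.io.FileInputStream, which resolves paths against the node's local filesystem only, so a path that exists in MapR FS but not locally fails exactly as in the stack trace. A minimal, self-contained sketch of that failure mode (the nonexistent path below is made up for illustration):

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.file.Files;

public class LocalOpenSketch {
    // FileInputStream knows nothing about maprfs:// or hdfs://; it only
    // consults the local filesystem, mirroring PcapRecordReader line 103.
    static boolean canOpenLocally(String path) {
        try (FileInputStream in = new FileInputStream(path)) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // A distributed-FS-only path fails, just like /x.pcap in the log.
        System.out.println(canOpenLocally("/no-such-dir-hypothetical/x.pcap"));
        // A file that does exist on the local filesystem opens fine.
        File tmp = Files.createTempFile("demo", ".pcap").toFile();
        System.out.println(canOpenLocally(tmp.getPath()));
        tmp.delete();
    }
}
```

One plausible fix inside the reader would be to open the file through the workspace's Hadoop FileSystem API (fs.open(new Path(inputPath))) rather than java.io; until the reader does that, an NFS mount of MapR FS on every drillbit node is what lets the local open succeed.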
>>> 
>>> 
>>>> On 2017/09/11 23:16, Andries Engelbrecht <aengelbrecht@mapr.com> wrote:
>>>> 
>>>> Typically when you use the MapR-FS plugin you don't need to specify the cluster root path in the dfs workspace.
>>>> 
>>>> Instead of "location": "/mapr/cluster3",   use "location": "/",
>>>> 
>>>> "connection": "maprfs:///", already points to the default MapR cluster root.
>>>> 
>>>> --Andries
>>>> 
>>>> 
>>>> 
>>>> On 9/11/17, 2:23 AM, "Takeo Ogawara" <ta-ogawara@kddi-research.jp> wrote:
>>>> 
>>>> Dear all,
>>>> 
>>>> I'm using the PCAP storage plugin over MapR FS (5.2.0) with Drill (1.11.0), compiled as follows.
>>>> $ mvn clean install -DskipTests -Pmapr
>>>> 
>>>> Some queries fail with the following errors.
>>>> Does anyone know how to solve these errors?
>>>> 
>>>> 1. Query error when cluster-name is not specified
>>>> The storage plugin "mfs" is configured as follows.
>>>> 
>>>>> "type": "file",
>>>>> "enabled": true,
>>>>> "connection": "maprfs:///",
>>>>> "config": null,
>>>>> "workspaces": {
>>>>> "root": {
>>>>>   "location": "/mapr/cluster3",
>>>>>   "writable": false,
>>>>>   "defaultInputFormat": null
>>>>> }
>>>>> }
>>>> 
>>>> 
>>>> With this setting, the following query failed.
>>>>> select * from mfs.`x.pcap` ;
>>>>> Error: DATA_READ ERROR: /x.pcap (No such file or directory)
>>>>> 
>>>>> File name: /x.pcap
>>>>> Fragment 0:0
>>>>> 
>>>>> [Error Id: 70b73062-c3ed-4a10-9a88-034b4e6d039a on node21:31010] (state=,code=0)
>>>> 
>>>> But these queries passed.
>>>>> select * from mfs.root.`x.pcap` ;
>>>>> select * from mfs.`x.csv`;
>>>>> select * from mfs.root.`x.csv`;
>>>> 
>>>> 2. Large PCAP file
>>>> A query on a very large PCAP file (larger than 100GB) failed with the following error message.
>>>>> Error: SYSTEM ERROR: IllegalStateException: Bad magic number = 0a0d0d0a
>>>>> 
>>>>> Fragment 1:169
>>>>> 
>>>>> [Error Id: 8882c359-c253-40c0-866c-417ef1ce5aa3 on node22:31010] (state=,code=0)
>>>> 
>>>> This happens even on a local Linux FS, not only on MapR FS.
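A note on that magic number: classic libpcap files begin with 0xA1B2C3D4 (or the byte-swapped 0xD4C3B2A1), while 0x0A0D0D0A is the block type that begins a pcapng file, a format Drill 1.11's pcap reader does not understand. So a "Bad magic number = 0a0d0d0a" could mean the capture is actually pcapng, or, since the error names Fragment 1:169, that a parallel reader started at a non-zero split offset and read arbitrary bytes as a header. A hedged sketch of checking the first four bytes (class and method names are made up):

```java
public class PcapMagicSketch {
    // Classify a capture file by its first 4 bytes, read big-endian.
    static String classify(int magic) {
        switch (magic) {
            case 0xA1B2C3D4:  // classic pcap, written big-endian
            case 0xD4C3B2A1:  // classic pcap, written little-endian
                return "pcap";
            case 0x0A0D0D0A:  // pcapng section header block type
                return "pcapng";
            default:          // anything else: not a capture file header
                return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println(classify(0xA1B2C3D4)); // pcap
        System.out.println(classify(0x0A0D0D0A)); // pcapng
    }
}
```

Running the equivalent of `xxd -l 4` on the first bytes of the 100GB file would distinguish the two hypotheses: 0a 0d 0d 0a at offset 0 means pcapng; a proper pcap magic at offset 0 points at the split-offset explanation instead.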
>>>> 
>>>> Thank you.
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>> 
> 
> 

———————————————————————
        <KDDI Research Vision>
 Challenge for the future: toward a prosperous future
———————————————————————
              A summer for heroes only.
      https://www.au.com/pr/cm/3taro/
———————————————————————
Takeo Ogawara (小河原 健生)
KDDI Research, Inc.
Connected Car 1G

TEL:049-278-7495 / 070-3623-9914

