drill-user mailing list archives

From Parth Chandra <par...@apache.org>
Subject Re: Exception while reading parquet data
Date Mon, 16 Oct 2017 18:53:54 GMT
Hi Projjwal,

  Unfortunately, I did not get a crash when I tried with your sample file.
Also, if turning off the buffered reader did not help, did you get a different
stack trace?

  Any more information you can provide will be useful. Is this part of a
larger query with more parquet files being read? Are you reading all the
columns? Is there some specific column that appears to trigger the issue?

  You can mail this info directly to me if you are not comfortable sharing
your data on the public list.
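
  For reference, the innermost frames of the trace (Buffer.checkBounds
reached through ByteBuffer.put) are what the JDK produces when a bulk put is
handed an offset or length that does not fit the source byte array, for
example a negative length. The sketch below reproduces only that JDK
behaviour. It assumes that a read returned -1 and that the value was passed on
as the length; this is an assumption, not something confirmed from the Drill
or Parquet source, and the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration; not taken from Drill or Parquet.
final class PutBoundsSketch {

    // Tries to copy `length` bytes from `src` into a direct buffer and
    // reports whether the JDK bounds check rejected the request.
    static boolean throwsIndexOutOfBounds(byte[] src, int length) {
        ByteBuffer dst = ByteBuffer.allocateDirect(8192);
        try {
            dst.put(src, 0, length);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        byte[] src = new byte[4096];
        // A length that fits both the array and the buffer is accepted.
        System.out.println("length 4096 rejected: " + throwsIndexOutOfBounds(src, 4096));
        // Assumed failure mode: a read result of -1 used as the length.
        System.out.println("length -1 rejected:   " + throwsIndexOutOfBounds(src, -1));
        // A length larger than the source array is rejected as well.
        System.out.println("length 4097 rejected: " + throwsIndexOutOfBounds(src, 4097));
    }
}
```

  If something like this is happening here, it would be worth knowing what
the underlying read returned just before the put.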

Thanks

Parth


On Mon, Oct 16, 2017 at 8:19 AM, PROJJWAL SAHA <proj.saha@gmail.com> wrote:

> here is the link for the parquet data.
> https://drive.google.com/file/d/0BzZhvMHOeao1S2Rud2xDS1NyS00/view?usp=sharing
>
> Setting store.parquet.reader.pagereader.bufferedread=false did not solve
> the issue.
>
> I am using Drill 1.11. The parquet data is fetched from Oracle Storage
> Cloud Service using swift driver.
>
> Here is the error on the drill command prompt -
> Error: DATA_READ ERROR: Exception occurred while reading from disk.
>
> File:
> /data1GBparquet/storereturns/part-00000-7ce26fde-f342-4aae-
> a727-71b8b7a60e63.parquet
> Column:  sr_return_time_sk
> Row Group Start:  417866
> File:
> /data1GBparquet/storereturns/part-00000-7ce26fde-f342-4aae-
> a727-71b8b7a60e63.parquet
> Column:  sr_return_time_sk
> Row Group Start:  417866
> Fragment 0:0
>
> On Sun, Oct 15, 2017 at 8:59 PM, Kunal Khatua <kkhatua@mapr.com> wrote:
>
> > You could try uploading to Google Drive (since you have a Gmail account)
> > and share the link .
> >
> > Did Parth's suggestion of
> > store.parquet.reader.pagereader.bufferedread=false
> > resolve the issue?
> >
> > Also share the details of the hardware setup... #nodes, Hadoop version,
> > etc.
> >
> >
> > -----Original Message-----
> > From: PROJJWAL SAHA [mailto:proj.saha@gmail.com]
> > Sent: Sunday, October 15, 2017 8:07 AM
> > To: user@drill.apache.org
> > Subject: Re: Exception while reading parquet data
> >
> > Is there any place where I can upload the 12MB parquet data? I am not able
> > to send the file through mail to the user group.
> >
> > On Thu, Oct 12, 2017 at 10:58 PM, Parth Chandra <parthc@apache.org>
> > wrote:
> >
> > > Seems like a bug in BufferedDirectBufInputStream.  Is it possible to
> > > share a minimal data file that triggers this?
> > >
> > > You can also try turning off the buffering reader.
> > >    store.parquet.reader.pagereader.bufferedread=false
> > >
> > > With async reader on and buffering off, you might not see any
> > > degradation in performance in most cases.
> > >
> > >
> > >
> > > On Thu, Oct 12, 2017 at 2:08 AM, PROJJWAL SAHA <proj.saha@gmail.com>
> > > wrote:
> > >
> > > > hi,
> > > >
> > > > Disabling the async parquet reader doesn't solve the problem. I am
> > > > getting a similar exception. I don't see any issue with the parquet
> > > > file, since the same file works when loaded through Alluxio.
> > > >
> > > > 2017-10-12 04:19:50,502
> > > > [2620da63-4efb-47e2-5e2c-29b48c0194c0:frag:0:0] ERROR
> > > > o.a.d.e.u.f.BufferedDirectBufInputStream - Error reading from stream
> > > > part-00000-7ce26fde-f342-4aae-a727-71b8b7a60e63.parquet. Error was :
> > > > null
> > > > 2017-10-12 04:19:50,506
> > > > [2620da63-4efb-47e2-5e2c-29b48c0194c0:frag:0:0] ERROR
> > > > o.a.d.exec.physical.impl.ScanBatch - SYSTEM ERROR:
> > > > IndexOutOfBoundsException
> > > >
> > > >
> > > > [Error Id: 3b7c4587-c1b8-4e79-bdaa-b2aa1516275b ]
> > > > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR:
> > > > IndexOutOfBoundsException
> > > >
> > > >
> > > > [Error Id: 3b7c4587-c1b8-4e79-bdaa-b2aa1516275b ]
> > > >         at org.apache.drill.common.exceptions.UserException$
> > > > Builder.build(UserException.java:550)
> > > > ~[drill-common-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.physical.impl.ScanBatch.next(
> > > > ScanBatch.java:249)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > > AbstractRecordBatch.java:119)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > > AbstractRecordBatch.java:109)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.record.AbstractSingleRecordBatch.
> > > > innerNext(AbstractSingleRecordBatch.java:51)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > > AbstractRecordBatch.java:162)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > > AbstractRecordBatch.java:119)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > > AbstractRecordBatch.java:109)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.record.AbstractSingleRecordBatch.
> > > > innerNext(AbstractSingleRecordBatch.java:51)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.physical.impl.svremover.
> > > > RemovingRecordBatch.innerNext(RemovingRecordBatch.java:93)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > > AbstractRecordBatch.java:162)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > > AbstractRecordBatch.java:119)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > > AbstractRecordBatch.java:109)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.record.AbstractSingleRecordBatch.
> > > > innerNext(AbstractSingleRecordBatch.java:51)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.physical.impl.project.
> > > > ProjectRecordBatch.innerNext(ProjectRecordBatch.java:133)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > > AbstractRecordBatch.java:162)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > > AbstractRecordBatch.java:119)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > > AbstractRecordBatch.java:109)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.physical.impl.aggregate.
> > > > HashAggBatch.buildSchema(HashAggBatch.java:111)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > > AbstractRecordBatch.java:142)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > > AbstractRecordBatch.java:119)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > > AbstractRecordBatch.java:109)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.physical.impl.xsort.
> > > > ExternalSortBatch.buildSchema(ExternalSortBatch.java:264)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > > AbstractRecordBatch.java:142)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > > AbstractRecordBatch.java:119)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > > AbstractRecordBatch.java:109)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.record.AbstractSingleRecordBatch.
> > > > innerNext(AbstractSingleRecordBatch.java:51)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.physical.impl.svremover.
> > > > RemovingRecordBatch.innerNext(RemovingRecordBatch.java:93)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > > AbstractRecordBatch.java:162)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > > AbstractRecordBatch.java:119)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > > AbstractRecordBatch.java:109)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.record.AbstractSingleRecordBatch.
> > > > innerNext(AbstractSingleRecordBatch.java:51)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.physical.impl.project.
> > > > ProjectRecordBatch.innerNext(ProjectRecordBatch.java:133)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > > AbstractRecordBatch.java:162)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > > AbstractRecordBatch.java:119)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > > AbstractRecordBatch.java:109)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.record.AbstractSingleRecordBatch.
> > > > innerNext(AbstractSingleRecordBatch.java:51)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.physical.impl.limit.
> > > > LimitRecordBatch.innerNext(LimitRecordBatch.java:115)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > > AbstractRecordBatch.java:162)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > > AbstractRecordBatch.java:119)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > > AbstractRecordBatch.java:109)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.record.AbstractSingleRecordBatch.
> > > > innerNext(AbstractSingleRecordBatch.java:51)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.physical.impl.svremover.
> > > > RemovingRecordBatch.innerNext(RemovingRecordBatch.java:93)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > > AbstractRecordBatch.java:162)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > > AbstractRecordBatch.java:119)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > > AbstractRecordBatch.java:109)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.record.AbstractSingleRecordBatch.
> > > > innerNext(AbstractSingleRecordBatch.java:51)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.physical.impl.project.
> > > > ProjectRecordBatch.innerNext(ProjectRecordBatch.java:133)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > > AbstractRecordBatch.java:162)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.physical.impl.BaseRootExec.
> > > > next(BaseRootExec.java:105)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.physical.impl.ScreenCreator$
> > > > ScreenRoot.innerNext(ScreenCreator.java:81)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.physical.impl.BaseRootExec.
> > > > next(BaseRootExec.java:95)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.work.fragment.FragmentExecutor$1.
> > > > run(FragmentExecutor.java:234)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.work.fragment.FragmentExecutor$1.
> > > > run(FragmentExecutor.java:227)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at java.security.AccessController.doPrivileged(Native
> > > > Method) [na:1.8.0_121]
> > > >         at javax.security.auth.Subject.doAs(Subject.java:422)
> > > > [na:1.8.0_121]
> > > >         at org.apache.hadoop.security.UserGroupInformation.doAs(
> > > > UserGroupInformation.java:1657)
> > > > [hadoop-common-2.7.1.jar:na]
> > > >         at org.apache.drill.exec.work.fragment.FragmentExecutor.run(
> > > > FragmentExecutor.java:227)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.common.SelfCleaningRunnable.run(
> > > > SelfCleaningRunnable.java:38)
> > > > [drill-common-1.11.0.jar:1.11.0]
> > > >         at java.util.concurrent.ThreadPoolExecutor.runWorker(
> > > > ThreadPoolExecutor.java:1142)
> > > > [na:1.8.0_121]
> > > >         at java.util.concurrent.ThreadPoolExecutor$Worker.run(
> > > > ThreadPoolExecutor.java:617)
> > > > [na:1.8.0_121]
> > > >         at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
> > > > Caused by: org.apache.drill.common.exceptions.DrillRuntimeException:
> > > > Error in parquet record reader.
> > > > Message:
> > > > Hadoop path:
> > > > /data1GBparquet/storereturns/part-00000-7ce26fde-f342-4aae-
> > > > a727-71b8b7a60e63.parquet
> > > > Total records read: 0
> > > > Row group index: 0
> > > > Records in row group: 287514
> > > > Parquet Metadata: ParquetMetaData{FileMetaData{schema: message
> > > > spark_schema {
> > > >   optional int32 sr_returned_date_sk;
> > > >   optional int32 sr_return_time_sk;
> > > >   optional int32 sr_item_sk;
> > > >   optional int32 sr_customer_sk;
> > > >   optional int32 sr_cdemo_sk;
> > > >   optional int32 sr_hdemo_sk;
> > > >   optional int32 sr_addr_sk;
> > > >   optional int32 sr_store_sk;
> > > >   optional int32 sr_reason_sk;
> > > >   optional int32 sr_ticket_number;
> > > >   optional int32 sr_return_quantity;
> > > >   optional double sr_return_amt;
> > > >   optional double sr_return_tax;
> > > >   optional double sr_return_amt_inc_tax;
> > > >   optional double sr_fee;
> > > >   optional double sr_return_ship_cost;
> > > >   optional double sr_refunded_cash;
> > > >   optional double sr_reversed_charge;
> > > >   optional double sr_store_credit;
> > > >   optional double sr_net_loss;
> > > >   optional binary sr_dummycol (UTF8); } , metadata:
> > > > {org.apache.spark.sql.parquet.row.metadata={"type":"struct",
> > > > "fields":[{"name":"sr_returned_date_sk","type":"
> > > > integer","nullable":true,"
> > > > metadata":{}},{"name":"sr_return_time_sk","type":"
> > > > integer","nullable":true,"metadata":{}},{"name":"sr_
> > > > item_sk","type":"integer","nullable":true,"metadata":{}},
> > > > {"name":"sr_customer_sk","type":"integer","nullable":
> > > > true,"metadata":{}},{"name":"sr_cdemo_sk","type":"integer",
> > > > "nullable":true,"metadata":{}},{"name":"sr_hdemo_sk","type":
> > > > "integer","nullable":true,"metadata":{}},{"name":"sr_
> > > > addr_sk","type":"integer","nullable":true,"metadata":{}},
> > > > {"name":"sr_store_sk","type":"integer","nullable":true,"
> > > > metadata":{}},{"name":"sr_reason_sk","type":"integer","
> > > > nullable":true,"metadata":{}},{"name":"sr_ticket_number","
> > > > type":"integer","nullable":true,"metadata":{}},{"name":"
> > > > sr_return_quantity","type":"integer","nullable":true,"
> > > > metadata":{}},{"name":"sr_return_amt","type":"double","
> > > > nullable":true,"metadata":{}},{"name":"sr_return_tax","type"
> > > > :"double","nullable":true,"metadata":{}},{"name":"sr_
> > > > return_amt_inc_tax","type":"double","nullable":true,"
> > > > metadata":{}},{"name":"sr_fee","type":"double","nullable":
> > > > true,"metadata":{}},{"name":"sr_return_ship_cost","type":"
> > > > double","nullable":true,"metadata":{}},{"name":"sr_
> > > > refunded_cash","type":"double","nullable":true,"metadata":{}
> > > > },{"name":"sr_reversed_charge","type":"double","nullable":
> > > > true,"metadata":{}},{"name":"sr_store_credit","type":"
> > > > double","nullable":true,"metadata":{}},{"name":"sr_net_
> > > > loss","type":"double","nullable":true,"metadata":{}},
> > > > {"name":"sr_dummycol","type":"string","nullable":true,"
> > > > metadata":{}}]}}},
> > > > blocks: [BlockMetaData{287514, 18570101 [ColumnMetaData{UNCOMPRESSED
> > > > [sr_returned_date_sk] INT32  [RLE, PLAIN_DICTIONARY, BIT_PACKED],
> > > > 4}, ColumnMetaData{UNCOMPRESSED [sr_return_time_sk] INT32  [RLE,
> > > > PLAIN_DICTIONARY, BIT_PACKED], 417866}, ColumnMetaData{UNCOMPRESSED
> > > > [sr_item_sk] INT32  [RLE, PLAIN_DICTIONARY, BIT_PACKED], 1096347},
> > > > ColumnMetaData{UNCOMPRESSED [sr_customer_sk] INT32  [RLE,
> > > > PLAIN_DICTIONARY, BIT_PACKED], 1708118}, ColumnMetaData{UNCOMPRESSED
> > > > [sr_cdemo_sk] INT32  [RLE, PLAIN, BIT_PACKED], 2674001},
> > > > ColumnMetaData{UNCOMPRESSED [sr_hdemo_sk] INT32  [RLE,
> > > > PLAIN_DICTIONARY, BIT_PACKED], 3812205}, ColumnMetaData{UNCOMPRESSED
> > > > [sr_addr_sk] INT32  [RLE, PLAIN_DICTIONARY, BIT_PACKED], 4320246},
> > > > ColumnMetaData{UNCOMPRESSED [sr_store_sk] INT32  [RLE,
> > > > PLAIN_DICTIONARY, BIT_PACKED], 5102635}, ColumnMetaData{UNCOMPRESSED
> > > > [sr_reason_sk] INT32  [RLE, PLAIN_DICTIONARY, BIT_PACKED], 5235151},
> > > > ColumnMetaData{UNCOMPRESSED [sr_ticket_number] INT32  [RLE, PLAIN,
> > > > BIT_PACKED], 5471579}, ColumnMetaData{UNCOMPRESSED
> > > > [sr_return_quantity] INT32  [RLE, PLAIN_DICTIONARY, BIT_PACKED],
> > > > 6621731}, ColumnMetaData{UNCOMPRESSED [sr_return_amt] DOUBLE  [RLE,
> > > > PLAIN_DICTIONARY, BIT_PACKED], 6893357}, ColumnMetaData{UNCOMPRESSED
> > > > [sr_return_tax] DOUBLE  [RLE, PLAIN_DICTIONARY, BIT_PACKED],
> > > > 8419465}, ColumnMetaData{UNCOMPRESSED [sr_return_amt_inc_tax] DOUBLE
> > > > [RLE, PLAIN, PLAIN_DICTIONARY, BIT_PACKED], 9201856},
> > > > ColumnMetaData{UNCOMPRESSED [sr_fee] DOUBLE  [RLE, PLAIN_DICTIONARY,
> > > > BIT_PACKED], 11366007}, ColumnMetaData{UNCOMPRESSED
> > > > [sr_return_ship_cost] DOUBLE  [RLE, PLAIN_DICTIONARY, BIT_PACKED],
> > > > 11959880}, ColumnMetaData{UNCOMPRESSED [sr_refunded_cash] DOUBLE
> > > > [RLE, PLAIN_DICTIONARY, BIT_PACKED], 13218730},
> > > > ColumnMetaData{UNCOMPRESSED [sr_reversed_charge] DOUBLE  [RLE,
> > > > PLAIN_DICTIONARY, BIT_PACKED], 14635937},
> > > > ColumnMetaData{UNCOMPRESSED [sr_store_credit] DOUBLE  [RLE,
> > > > PLAIN_DICTIONARY, BIT_PACKED], 15824898},
> > > > ColumnMetaData{UNCOMPRESSED [sr_net_loss] DOUBLE  [RLE,
> > > > PLAIN_DICTIONARY, BIT_PACKED], 17004301}, ColumnMetaData{UNCOMPRESSED
> > > > [sr_dummycol] BINARY  [RLE, PLAIN, BIT_PACKED], 18570072}]}]}
> > > >         at org.apache.drill.exec.store.parquet.columnreaders.
> > > > ParquetRecordReader.handleException(ParquetRecordReader.java:272)
> > > > ~[drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.store.parquet.columnreaders.
> > > > ParquetRecordReader.next(ParquetRecordReader.java:299)
> > > > ~[drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.physical.impl.ScanBatch.next(
> > > > ScanBatch.java:180)
> > > > [drill-java-exec-1.11.0.jar:1.11.0]
> > > >         ... 60 common frames omitted
> > > > Caused by: java.io.IOException: java.lang.IndexOutOfBoundsException
> > > >         at org.apache.drill.exec.util.filereader.
> > > > BufferedDirectBufInputStream.getNextBlock(
> > > > BufferedDirectBufInputStream.java:185)
> > > > ~[drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.util.filereader.
> > > > BufferedDirectBufInputStream.readInternal(
> > > > BufferedDirectBufInputStream.java:212)
> > > > ~[drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.util.filereader.
> > > > BufferedDirectBufInputStream.read(
> > > > BufferedDirectBufInputStream.java:277)
> > > > ~[drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.util.filereader.
> > > > DirectBufInputStream.getNext(DirectBufInputStream.java:111)
> > > > ~[drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.store.parquet.columnreaders.
> > > > PageReader.readPage(PageReader.java:216)
> > > > ~[drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.store.parquet.columnreaders.
> > > > PageReader.nextInternal(PageReader.java:283)
> > > > ~[drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.store.parquet.columnreaders.
> > > > PageReader.next(PageReader.java:307)
> > > > ~[drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.store.parquet.columnreaders.
> > > > NullableColumnReader.processPages(NullableColumnReader.java:69)
> > > > ~[drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.store.parquet.columnreaders.
> > > > BatchReader.readAllFixedFieldsSerial(BatchReader.java:63)
> > > > ~[drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.store.parquet.columnreaders.
> > > > BatchReader.readAllFixedFields(BatchReader.java:56)
> > > > ~[drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.store.parquet.columnreaders.
> > > > BatchReader$FixedWidthReader.readRecords(BatchReader.java:143)
> > > > ~[drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.store.parquet.columnreaders.
> > > > BatchReader.readBatch(BatchReader.java:42)
> > > > ~[drill-java-exec-1.11.0.jar:1.11.0]
> > > >         at org.apache.drill.exec.store.parquet.columnreaders.
> > > > ParquetRecordReader.next(ParquetRecordReader.java:297)
> > > > ~[drill-java-exec-1.11.0.jar:1.11.0]
> > > >         ... 61 common frames omitted
> > > > Caused by: java.lang.IndexOutOfBoundsException: null
> > > >         at java.nio.Buffer.checkBounds(Buffer.java:567)
> > > > ~[na:1.8.0_121]
> > > >         at java.nio.ByteBuffer.put(ByteBuffer.java:827)
> > > > ~[na:1.8.0_121]
> > > >         at java.nio.DirectByteBuffer.put(DirectByteBuffer.java:379)
> > > > ~[na:1.8.0_121]
> > > >         at org.apache.parquet.hadoop.util.CompatibilityUtil.getBuf(
> > > > CompatibilityUtil.java:110)
> > > > ~[parquet-hadoop-1.8.1-drill-r0.jar:1.8.1-drill-r0]
> > > >         at org.apache.drill.exec.util.filereader.
> > > > BufferedDirectBufInputStream.getNextBlock(
> > > > BufferedDirectBufInputStream.java:182)
> > > > ~[drill-java-exec-1.11.0.jar:1.11.0]
> > > >         ... 73 common frames omitted
> > > > 2017-10-12 04:19:50,506
> > > > [2620da63-4efb-47e2-5e2c-29b48c0194c0:frag:0:0] INFO
> > > > o.a.d.e.w.fragment.FragmentExecutor -
> > > > 2620da63-4efb-47e2-5e2c-29b48c0194c0:0:0: State change requested
> > > > RUNNING --> FAILED
> > > > 2017-10-12 04:19:50,507
> > > > [2620da63-4efb-47e2-5e2c-29b48c0194c0:frag:0:0] INFO
> > > > o.a.d.e.w.fragment.FragmentExecutor -
> > > > 2620da63-4efb-47e2-5e2c-29b48c0194c0:0:0: State change requested
> > > > FAILED --> FINISHED
> > > > 2017-10-12 04:19:50,533 [BitServer-2] WARN
> > > > o.a.drill.exec.work.foreman.Foreman - Dropping request to move to
> > > > COMPLETED state as query is already at FAILED state (which is
> > > > terminal).
> > > > 2017-10-12 04:19:50,533 [BitServer-2] WARN
> > > > o.a.d.e.w.b.ControlMessageHandler - Dropping request to cancel
> > > > fragment. 2620da63-4efb-47e2-5e2c-29b48c0194c0:0:0 does not exist.
> > > >
> > > >
> > > >
> > > > On Thu, Oct 12, 2017 at 1:49 PM, PROJJWAL SAHA <proj.saha@gmail.com>
> > > > wrote:
> > > >
> > > > > Sure, I can try disabling the async parquet reader.
> > > > > Will this, however, impact the performance of queries on parquet
> > > > > data?
> > > > >
> > > > > On Thu, Oct 12, 2017 at 9:39 AM, Kunal Khatua <kkhatua@mapr.com>
> > > > > wrote:
> > > > >
> > > > > >> If this resolves the issue, could you share some additional
> > > > > >> details, such as the metadata of the Parquet files, the OS, etc.?
> > > > > >> Details describing the setup are also very helpful in identifying
> > > > > >> what could be the cause of the error.
> > > > >>
> > > > > >> We had observed some similar DATA_READ errors in the early
> > > > > >> iterations of the Async Parquet reader, but those have been
> > > > > >> resolved. I'm presuming you're already on the latest (i.e. Apache
> > > > > >> Drill 1.11.0).
> > > > >>
> > > > >> -----Original Message-----
> > > > >> From: Arjun kr [mailto:arjun.kr@outlook.com]
> > > > >> Sent: Wednesday, October 11, 2017 6:52 PM
> > > > >> To: user@drill.apache.org
> > > > >> Subject: Re: Exception while reading parquet data
> > > > >>
> > > > >>
> > > > > >> Can you try disabling the async parquet reader to see if the
> > > > > >> problem gets resolved?
> > > > >>
> > > > >>
> > > > >> alter session set `store.parquet.reader.pagereader.async`=false;
> > > > >>
> > > > >> Thanks,
> > > > >>
> > > > >> Arjun
> > > > >>
> > > > >>
> > > > >> ________________________________
> > > > >> From: PROJJWAL SAHA <proj.saha@gmail.com>
> > > > >> Sent: Wednesday, October 11, 2017 2:20 PM
> > > > >> To: user@drill.apache.org
> > > > >> Subject: Exception while reading parquet data
> > > > >>
> > > > > >> I get the exception below when querying parquet data on Oracle
> > > > > >> Storage Cloud Service.
> > > > > >> Any pointers on what this points to?
> > > > >>
> > > > >> Regards,
> > > > >> Projjwal
> > > > >>
> > > > >>
> > > > >> ERROR o.a.d.e.u.f.BufferedDirectBufInputStream - Error reading
> > > > >> from stream
> > > > >> part-00006-25a9ae4b-fd9e-4770-b17e-9a29b270a4c2.parquet. Error
> > > > >> was : null
> > > > >> 2017-10-09 09:42:18,516 [scan-2] INFO
> > > > > >> o.a.d.e.s.p.c.AsyncPageReader - User Error Occurred: Exception
> > > > > >> occurred while reading from disk.
> > > > > >> (java.lang.IndexOutOfBoundsException)
> > > > > >> org.apache.drill.common.exceptions.UserException: DATA_READ ERROR:
> > > > > >> Exception occurred while reading from disk.
> > > > >>
> > > > >> File:
> > > > > >> /data25GB/storereturns/part-00006-25a9ae4b-fd9e-4770-b17e-9a29b270a4c2.parquet
> > > > >> Column:  sr_return_time_sk
> > > > >> Row Group Start:  479751
> > > > >>
> > > > > >> [Error Id: 10680bb8-d1d6-43a1-b5e0-ef15bd8a9406 ]
> > > > > >> at org.apache.drill.common.exceptions.UserException$Builder.
> > > > > >> build(UserException.java:550)
> > > > > >> ~[drill-common-1.11.0.jar:1.11.0]
> > > > > >> at org.apache.drill.exec.store.parquet.columnreaders.
> > > > > >> AsyncPageReader.handleAndThrowException(AsyncPageReader.java:185)
> > > > > >> [drill-java-exec-1.11.0.jar:1.11.0]
> > > > > >> at org.apache.drill.exec.store.parquet.columnreaders.
> > > > > >> AsyncPageReader.access$700(AsyncPageReader.java:82)
> > > > > >> [drill-java-exec-1.11.0.jar:1.11.0]
> > > > > >> at org.apache.drill.exec.store.parquet.columnreaders.
> > > > > >> AsyncPageReader$AsyncPageReaderTask.call(AsyncPageReader.java:461)
> > > > > >> [drill-java-exec-1.11.0.jar:1.11.0]
> > > > > >> at org.apache.drill.exec.store.parquet.columnreaders.
> > > > > >> AsyncPageReader$AsyncPageReaderTask.call(AsyncPageReader.java:381)
> > > > > >> [drill-java-exec-1.11.0.jar:1.11.0]
> > > > > >> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> > > > > >> [na:1.8.0_121]
> > > > > >> at java.util.concurrent.ThreadPoolExecutor.runWorker(
> > > > > >> ThreadPoolExecutor.java:1142)
> > > > > >> [na:1.8.0_121]
> > > > > >> at java.util.concurrent.ThreadPoolExecutor$Worker.run(
> > > > > >> ThreadPoolExecutor.java:617)
> > > > > >> [na:1.8.0_121]
> > > > > >> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
> > > > > >> Caused by: java.io.IOException: java.lang.IndexOutOfBoundsException
> > > > > >> at org.apache.drill.exec.util.filereader.
> > > > > >> BufferedDirectBufInputStream.getNextBlock(
> > > > > >> BufferedDirectBufInputStream.java:185)
> > > > > >> ~[drill-java-exec-1.11.0.jar:1.11.0]
> > > > > >> at org.apache.drill.exec.util.filereader.
> > > > > >> BufferedDirectBufInputStream.readInternal(
> > > > > >> BufferedDirectBufInputStream.java:212)
> > > > > >> ~[drill-java-exec-1.11.0.jar:1.11.0]
> > > > > >> at org.apache.drill.exec.util.filereader.
> > > > > >> BufferedDirectBufInputStream.read(
> > > > > >> BufferedDirectBufInputStream.java:277)
> > > > > >> ~[drill-java-exec-1.11.0.jar:1.11.0]
> > > > > >> at org.apache.drill.exec.util.filereader.DirectBufInputStream.
> > > > > >> getNext(DirectBufInputStream.java:111)
> > > > > >> ~[drill-java-exec-1.11.0.jar:1.11.0]
> > > > > >> at org.apache.drill.exec.store.parquet.columnreaders.
> > > > > >> AsyncPageReader$AsyncPageReaderTask.call(AsyncPageReader.java:421)
> > > > > >> [drill-java-exec-1.11.0.jar:1.11.0]
> > > > > >> ... 5 common frames omitted
> > > > > >> Caused by: java.lang.IndexOutOfBoundsException: null
> > > > > >> at java.nio.Buffer.checkBounds(Buffer.java:567) ~[na:1.8.0_121]
> > > > > >> at java.nio.ByteBuffer.put(ByteBuffer.java:827) ~[na:1.8.0_121]
> > > > > >> at java.nio.DirectByteBuffer.put(DirectByteBuffer.java:379)
> > > > > >> ~[na:1.8.0_121]
> > > > > >> at org.apache.parquet.hadoop.util.CompatibilityUtil.getBuf(
> > > > > >> CompatibilityUtil.java:110)
> > > > > >> ~[parquet-hadoop-1.8.1-drill-r0.jar:1.8.1-drill-r0]
> > > > > >> at org.apache.drill.exec.util.filereader.
> > > > > >> BufferedDirectBufInputStream.getNextBlock(
> > > > > >> BufferedDirectBufInputStream.java:182)
> > > > > >> ~[drill-java-exec-1.11.0.jar:1.11.0]
> > > > > >> ... 9 common frames omitted
> > > > >> 2017-10-09 09:42:20,533 [26248359-2fc8-d177-c3a6-
> > > 507f6857e0ea:frag:2:3]
> > > > >> INFO  o.a.d.e.w.fragment.FragmentExecutor -
> > > > >> 26248359-2fc8-d177-c3a6-507f6857e0ea:2:3: State change requested
> > > > >> AWAITING_ALLOCATION --> RUNNING
> > > > >> 2017-10-09 09:42:20,533 [26248359-2fc8-d177-c3a6-
> > > 507f6857e0ea:frag:2:3]
> > > > >> INFO  o.a.d.e.w.f.FragmentStatusReporter -
> > > > >> 26248359-2fc8-d177-c3a6-507f6857e0ea:2:3: State to report:
> > > > >> RUNNING
> > > > >> 2017-10-09 09:42:20,534 [26248359-2fc8-d177-c3a6-
> > > 507f6857e0ea:frag:2:3]
> > > > >> INFO  o.a.d.e.w.fragment.FragmentExecutor -
> > > > >> 26248359-2fc8-d177-c3a6-507f6857e0ea:2:3: State change requested
> > > > RUNNING
> > > > >> --> CANCELLATION_REQUESTED
> > > > >> 2017-10-09 09:42:20,534 [26248359-2fc8-d177-c3a6-
> > > 507f6857e0ea:frag:2:3]
> > > > >> INFO  o.a.d.e.w.f.FragmentStatusReporter -
> > > > >> 26248359-2fc8-d177-c3a6-507f6857e0ea:2:3: State to report:
> > > > >> CANCELLATION_REQUESTED
> > > > >>
> > > > >
> > > > >
> > > >
> > >
> >
>
