drill-user mailing list archives

From PROJJWAL SAHA <proj.s...@gmail.com>
Subject Re: Exception while reading parquet data
Date Mon, 16 Oct 2017 15:19:38 GMT
Here is the link to the Parquet data:
https://drive.google.com/file/d/0BzZhvMHOeao1S2Rud2xDS1NyS00/view?usp=sharing

Setting store.parquet.reader.pagereader.bufferedread=false did not solve
the issue.
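
For reference, a minimal sketch of how the two reader options mentioned in this thread could be set per session from Java. The option names and the ALTER SESSION syntax are taken from the messages quoted below; the class and method names are hypothetical, and the Connection is assumed to have been opened with the Drill JDBC driver (not shown here).

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

public class ParquetReaderOptions {
    // Option names as quoted in this thread.
    static final String ASYNC = "store.parquet.reader.pagereader.async";
    static final String BUFFERED = "store.parquet.reader.pagereader.bufferedread";

    // Builds the statement text; the backticks around the option name follow
    // the ALTER SESSION example quoted later in the thread.
    static String alterSession(String option, boolean value) {
        return "ALTER SESSION SET `" + option + "` = " + value;
    }

    // Applies both options on an existing connection (assumed to be a Drill
    // JDBC connection; how it is obtained is outside this sketch).
    static void apply(Connection conn, boolean async, boolean buffered) throws SQLException {
        try (Statement st = conn.createStatement()) {
            st.execute(alterSession(ASYNC, async));
            st.execute(alterSession(BUFFERED, buffered));
        }
    }

    public static void main(String[] args) {
        // Prints the statements for the combination suggested in the thread:
        // async reader on, buffered reader off.
        System.out.println(alterSession(ASYNC, true));
        System.out.println(alterSession(BUFFERED, false));
    }
}
```

The options are session-scoped, so they have to be set on the same connection that later runs the failing query.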

I am using Drill 1.11. The Parquet data is fetched from Oracle Storage
Cloud Service using the Swift driver.

Here is the error at the Drill command prompt:
Error: DATA_READ ERROR: Exception occurred while reading from disk.

File:
/data1GBparquet/storereturns/part-00000-7ce26fde-f342-4aae-a727-71b8b7a60e63.parquet
Column:  sr_return_time_sk
Row Group Start:  417866
File:
/data1GBparquet/storereturns/part-00000-7ce26fde-f342-4aae-a727-71b8b7a60e63.parquet
Column:  sr_return_time_sk
Row Group Start:  417866
Fragment 0:0
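
The innermost frames in the quoted traces are java.nio.Buffer.checkBounds and ByteBuffer.put. The sketch below only illustrates the general behaviour of ByteBuffer.put(byte[], int, int) with standard JDK classes: an IndexOutOfBoundsException comes from an offset/length pair that does not fit the source array (for example a negative length), whereas a destination that is too small gives a BufferOverflowException instead. Whether CompatibilityUtil.getBuf actually passes such a length here (for example a -1 returned by a stream read) is an assumption that has not been checked against its source. The class and method names are hypothetical.

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;

public class PutBoundsDemo {
    // Tries to copy `count` bytes from `src` into a fresh direct buffer of
    // the given capacity and reports which outcome occurred.
    static String tryPut(int capacity, byte[] src, int count) {
        ByteBuffer dst = ByteBuffer.allocateDirect(capacity);
        try {
            dst.put(src, 0, count);
            return "ok";
        } catch (IndexOutOfBoundsException e) {
            // offset/length do not describe a valid range of the source array
            return "IndexOutOfBoundsException";
        } catch (BufferOverflowException e) {
            // the range is valid but the destination has too little room
            return "BufferOverflowException";
        }
    }

    public static void main(String[] args) {
        byte[] staging = new byte[4];
        System.out.println(tryPut(16, staging, 4));   // valid copy
        System.out.println(tryPut(16, staging, -1));  // negative length, e.g. an unchecked EOF marker
        System.out.println(tryPut(16, staging, 8));   // length larger than the source array
        System.out.println(tryPut(2, staging, 4));    // destination too small
    }
}
```

If this is what happens here, the place to look would be the byte count the Swift-backed input stream reports to the reader rather than the Parquet file itself.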

On Sun, Oct 15, 2017 at 8:59 PM, Kunal Khatua <kkhatua@mapr.com> wrote:

> You could try uploading to Google Drive (since you have a Gmail account)
> and sharing the link.
>
> Did Parth's suggestion of
> store.parquet.reader.pagereader.bufferedread=false
> resolve the issue?
>
> Also, please share the details of the hardware setup: number of nodes,
> Hadoop version, etc.
>
>
> -----Original Message-----
> From: PROJJWAL SAHA [mailto:proj.saha@gmail.com]
> Sent: Sunday, October 15, 2017 8:07 AM
> To: user@drill.apache.org
> Subject: Re: Exception while reading parquet data
>
> Is there any place where I can upload the 12MB Parquet data? I am not able
> to send the file through mail to the user group.
>
> On Thu, Oct 12, 2017 at 10:58 PM, Parth Chandra <parthc@apache.org> wrote:
>
> > Seems like a bug in BufferedDirectBufInputStream.  Is it possible to
> > share a minimal data file that triggers this?
> >
> > You can also try turning off the buffering reader.
> >    store.parquet.reader.pagereader.bufferedread=false
> >
> > With async reader on and buffering off, you might not see any
> > degradation in performance in most cases.
> >
> >
> >
> > On Thu, Oct 12, 2017 at 2:08 AM, PROJJWAL SAHA <proj.saha@gmail.com>
> > wrote:
> >
> > > Hi,
> > >
> > > Disabling the async Parquet reader doesn't solve the problem. I am
> > > getting a similar exception. I don't see any issue with the Parquet
> > > file, since the same file works when loaded on Alluxio.
> > >
> > > 2017-10-12 04:19:50,502
> > > [2620da63-4efb-47e2-5e2c-29b48c0194c0:frag:0:0] ERROR
> > > o.a.d.e.u.f.BufferedDirectBufInputStream - Error reading from stream
> > > part-00000-7ce26fde-f342-4aae-a727-71b8b7a60e63.parquet. Error was :
> > > null
> > > 2017-10-12 04:19:50,506
> > > [2620da63-4efb-47e2-5e2c-29b48c0194c0:frag:0:0] ERROR
> > > o.a.d.exec.physical.impl.ScanBatch - SYSTEM ERROR:
> > > IndexOutOfBoundsException
> > >
> > >
> > > [Error Id: 3b7c4587-c1b8-4e79-bdaa-b2aa1516275b ]
> > > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR:
> > > IndexOutOfBoundsException
> > >
> > >
> > > [Error Id: 3b7c4587-c1b8-4e79-bdaa-b2aa1516275b ]
> > >         at org.apache.drill.common.exceptions.UserException$
> > > Builder.build(UserException.java:550)
> > > ~[drill-common-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.physical.impl.ScanBatch.next(
> > > ScanBatch.java:249)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > AbstractRecordBatch.java:119)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > AbstractRecordBatch.java:109)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.record.AbstractSingleRecordBatch.
> > > innerNext(AbstractSingleRecordBatch.java:51)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > AbstractRecordBatch.java:162)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > AbstractRecordBatch.java:119)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > AbstractRecordBatch.java:109)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.record.AbstractSingleRecordBatch.
> > > innerNext(AbstractSingleRecordBatch.java:51)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.physical.impl.svremover.
> > > RemovingRecordBatch.innerNext(RemovingRecordBatch.java:93)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > AbstractRecordBatch.java:162)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > AbstractRecordBatch.java:119)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > AbstractRecordBatch.java:109)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.record.AbstractSingleRecordBatch.
> > > innerNext(AbstractSingleRecordBatch.java:51)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.physical.impl.project.
> > > ProjectRecordBatch.innerNext(ProjectRecordBatch.java:133)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > AbstractRecordBatch.java:162)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > AbstractRecordBatch.java:119)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > AbstractRecordBatch.java:109)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.physical.impl.aggregate.
> > > HashAggBatch.buildSchema(HashAggBatch.java:111)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > AbstractRecordBatch.java:142)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > AbstractRecordBatch.java:119)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > AbstractRecordBatch.java:109)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.physical.impl.xsort.
> > > ExternalSortBatch.buildSchema(ExternalSortBatch.java:264)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > AbstractRecordBatch.java:142)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > AbstractRecordBatch.java:119)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > AbstractRecordBatch.java:109)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.record.AbstractSingleRecordBatch.
> > > innerNext(AbstractSingleRecordBatch.java:51)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.physical.impl.svremover.
> > > RemovingRecordBatch.innerNext(RemovingRecordBatch.java:93)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > AbstractRecordBatch.java:162)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > AbstractRecordBatch.java:119)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > AbstractRecordBatch.java:109)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.record.AbstractSingleRecordBatch.
> > > innerNext(AbstractSingleRecordBatch.java:51)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.physical.impl.project.
> > > ProjectRecordBatch.innerNext(ProjectRecordBatch.java:133)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > AbstractRecordBatch.java:162)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > AbstractRecordBatch.java:119)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > AbstractRecordBatch.java:109)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.record.AbstractSingleRecordBatch.
> > > innerNext(AbstractSingleRecordBatch.java:51)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.physical.impl.limit.
> > > LimitRecordBatch.innerNext(LimitRecordBatch.java:115)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > AbstractRecordBatch.java:162)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > AbstractRecordBatch.java:119)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > AbstractRecordBatch.java:109)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.record.AbstractSingleRecordBatch.
> > > innerNext(AbstractSingleRecordBatch.java:51)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.physical.impl.svremover.
> > > RemovingRecordBatch.innerNext(RemovingRecordBatch.java:93)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > AbstractRecordBatch.java:162)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > AbstractRecordBatch.java:119)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > AbstractRecordBatch.java:109)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.record.AbstractSingleRecordBatch.
> > > innerNext(AbstractSingleRecordBatch.java:51)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.physical.impl.project.
> > > ProjectRecordBatch.innerNext(ProjectRecordBatch.java:133)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.record.AbstractRecordBatch.next(
> > > AbstractRecordBatch.java:162)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.physical.impl.BaseRootExec.
> > > next(BaseRootExec.java:105)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.physical.impl.ScreenCreator$
> > > ScreenRoot.innerNext(ScreenCreator.java:81)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.physical.impl.BaseRootExec.
> > > next(BaseRootExec.java:95)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.work.fragment.FragmentExecutor$1.
> > > run(FragmentExecutor.java:234)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.work.fragment.FragmentExecutor$1.
> > > run(FragmentExecutor.java:227)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at java.security.AccessController.doPrivileged(Native
> > > Method) [na:1.8.0_121]
> > >         at javax.security.auth.Subject.doAs(Subject.java:422)
> > > [na:1.8.0_121]
> > >         at org.apache.hadoop.security.UserGroupInformation.doAs(
> > > UserGroupInformation.java:1657)
> > > [hadoop-common-2.7.1.jar:na]
> > >         at org.apache.drill.exec.work.fragment.FragmentExecutor.run(
> > > FragmentExecutor.java:227)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.common.SelfCleaningRunnable.run(
> > > SelfCleaningRunnable.java:38)
> > > [drill-common-1.11.0.jar:1.11.0]
> > >         at java.util.concurrent.ThreadPoolExecutor.runWorker(
> > > ThreadPoolExecutor.java:1142)
> > > [na:1.8.0_121]
> > >         at java.util.concurrent.ThreadPoolExecutor$Worker.run(
> > > ThreadPoolExecutor.java:617)
> > > [na:1.8.0_121]
> > >         at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
> > > Caused by: org.apache.drill.common.exceptions.DrillRuntimeException:
> > > Error in parquet record reader.
> > > Message:
> > > Hadoop path:
> > > /data1GBparquet/storereturns/part-00000-7ce26fde-f342-4aae-
> > > a727-71b8b7a60e63.parquet
> > > Total records read: 0
> > > Row group index: 0
> > > Records in row group: 287514
> > > Parquet Metadata: ParquetMetaData{FileMetaData{schema: message
> > > spark_schema {
> > >   optional int32 sr_returned_date_sk;
> > >   optional int32 sr_return_time_sk;
> > >   optional int32 sr_item_sk;
> > >   optional int32 sr_customer_sk;
> > >   optional int32 sr_cdemo_sk;
> > >   optional int32 sr_hdemo_sk;
> > >   optional int32 sr_addr_sk;
> > >   optional int32 sr_store_sk;
> > >   optional int32 sr_reason_sk;
> > >   optional int32 sr_ticket_number;
> > >   optional int32 sr_return_quantity;
> > >   optional double sr_return_amt;
> > >   optional double sr_return_tax;
> > >   optional double sr_return_amt_inc_tax;
> > >   optional double sr_fee;
> > >   optional double sr_return_ship_cost;
> > >   optional double sr_refunded_cash;
> > >   optional double sr_reversed_charge;
> > >   optional double sr_store_credit;
> > >   optional double sr_net_loss;
> > >   optional binary sr_dummycol (UTF8); } , metadata:
> > > {org.apache.spark.sql.parquet.row.metadata={"type":"struct",
> > > "fields":[{"name":"sr_returned_date_sk","type":"
> > integer","nullable":true,"
> > > metadata":{}},{"name":"sr_return_time_sk","type":"
> > > integer","nullable":true,"metadata":{}},{"name":"sr_
> > > item_sk","type":"integer","nullable":true,"metadata":{}},
> > > {"name":"sr_customer_sk","type":"integer","nullable":
> > > true,"metadata":{}},{"name":"sr_cdemo_sk","type":"integer",
> > > "nullable":true,"metadata":{}},{"name":"sr_hdemo_sk","type":
> > > "integer","nullable":true,"metadata":{}},{"name":"sr_
> > > addr_sk","type":"integer","nullable":true,"metadata":{}},
> > > {"name":"sr_store_sk","type":"integer","nullable":true,"
> > > metadata":{}},{"name":"sr_reason_sk","type":"integer","
> > > nullable":true,"metadata":{}},{"name":"sr_ticket_number","
> > > type":"integer","nullable":true,"metadata":{}},{"name":"
> > > sr_return_quantity","type":"integer","nullable":true,"
> > > metadata":{}},{"name":"sr_return_amt","type":"double","
> > > nullable":true,"metadata":{}},{"name":"sr_return_tax","type"
> > > :"double","nullable":true,"metadata":{}},{"name":"sr_
> > > return_amt_inc_tax","type":"double","nullable":true,"
> > > metadata":{}},{"name":"sr_fee","type":"double","nullable":
> > > true,"metadata":{}},{"name":"sr_return_ship_cost","type":"
> > > double","nullable":true,"metadata":{}},{"name":"sr_
> > > refunded_cash","type":"double","nullable":true,"metadata":{}
> > > },{"name":"sr_reversed_charge","type":"double","nullable":
> > > true,"metadata":{}},{"name":"sr_store_credit","type":"
> > > double","nullable":true,"metadata":{}},{"name":"sr_net_
> > > loss","type":"double","nullable":true,"metadata":{}},
> > > {"name":"sr_dummycol","type":"string","nullable":true,"
> > metadata":{}}]}}},
> > > blocks: [BlockMetaData{287514, 18570101 [ColumnMetaData{UNCOMPRESSED
> > > [sr_returned_date_sk] INT32  [RLE, PLAIN_DICTIONARY, BIT_PACKED],
> > > 4}, ColumnMetaData{UNCOMPRESSED [sr_return_time_sk] INT32  [RLE,
> > > PLAIN_DICTIONARY, BIT_PACKED], 417866}, ColumnMetaData{UNCOMPRESSED
> > > [sr_item_sk] INT32  [RLE, PLAIN_DICTIONARY, BIT_PACKED], 1096347},
> > > ColumnMetaData{UNCOMPRESSED [sr_customer_sk] INT32  [RLE,
> > > PLAIN_DICTIONARY, BIT_PACKED], 1708118}, ColumnMetaData{UNCOMPRESSED
> > > [sr_cdemo_sk] INT32  [RLE, PLAIN, BIT_PACKED], 2674001},
> > > ColumnMetaData{UNCOMPRESSED [sr_hdemo_sk] INT32  [RLE,
> > > PLAIN_DICTIONARY, BIT_PACKED], 3812205}, ColumnMetaData{UNCOMPRESSED
> > > [sr_addr_sk] INT32  [RLE, PLAIN_DICTIONARY, BIT_PACKED], 4320246},
> > > ColumnMetaData{UNCOMPRESSED [sr_store_sk] INT32  [RLE,
> > > PLAIN_DICTIONARY, BIT_PACKED], 5102635}, ColumnMetaData{UNCOMPRESSED
> > > [sr_reason_sk] INT32  [RLE, PLAIN_DICTIONARY, BIT_PACKED], 5235151},
> > > ColumnMetaData{UNCOMPRESSED [sr_ticket_number] INT32  [RLE, PLAIN,
> > > BIT_PACKED], 5471579}, ColumnMetaData{UNCOMPRESSED
> > > [sr_return_quantity] INT32  [RLE, PLAIN_DICTIONARY, BIT_PACKED],
> > > 6621731}, ColumnMetaData{UNCOMPRESSED [sr_return_amt] DOUBLE  [RLE,
> > > PLAIN_DICTIONARY, BIT_PACKED], 6893357}, ColumnMetaData{UNCOMPRESSED
> > > [sr_return_tax] DOUBLE  [RLE, PLAIN_DICTIONARY, BIT_PACKED],
> > > 8419465}, ColumnMetaData{UNCOMPRESSED [sr_return_amt_inc_tax] DOUBLE
> > > [RLE, PLAIN, PLAIN_DICTIONARY, BIT_PACKED], 9201856},
> > > ColumnMetaData{UNCOMPRESSED [sr_fee] DOUBLE  [RLE, PLAIN_DICTIONARY,
> > > BIT_PACKED], 11366007}, ColumnMetaData{UNCOMPRESSED
> > > [sr_return_ship_cost] DOUBLE  [RLE, PLAIN_DICTIONARY, BIT_PACKED],
> > > 11959880}, ColumnMetaData{UNCOMPRESSED [sr_refunded_cash] DOUBLE
> > > [RLE, PLAIN_DICTIONARY, BIT_PACKED], 13218730},
> > > ColumnMetaData{UNCOMPRESSED [sr_reversed_charge] DOUBLE  [RLE,
> > > PLAIN_DICTIONARY, BIT_PACKED], 14635937},
> > > ColumnMetaData{UNCOMPRESSED [sr_store_credit] DOUBLE  [RLE,
> > > PLAIN_DICTIONARY, BIT_PACKED], 15824898},
> > > ColumnMetaData{UNCOMPRESSED [sr_net_loss] DOUBLE  [RLE,
> > > PLAIN_DICTIONARY, BIT_PACKED], 17004301}, ColumnMetaData{UNCOMPRESSED
> > > > [sr_dummycol] BINARY  [RLE, PLAIN, BIT_PACKED], 18570072}]}]}
> > >         at org.apache.drill.exec.store.parquet.columnreaders.
> > > ParquetRecordReader.handleException(ParquetRecordReader.java:272)
> > > ~[drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.store.parquet.columnreaders.
> > > ParquetRecordReader.next(ParquetRecordReader.java:299)
> > > ~[drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.physical.impl.ScanBatch.next(
> > > ScanBatch.java:180)
> > > [drill-java-exec-1.11.0.jar:1.11.0]
> > >         ... 60 common frames omitted
> > > Caused by: java.io.IOException: java.lang.IndexOutOfBoundsException
> > >         at org.apache.drill.exec.util.filereader.
> > > BufferedDirectBufInputStream.getNextBlock(
> > > BufferedDirectBufInputStream.java:185)
> > > ~[drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.util.filereader.
> > > BufferedDirectBufInputStream.readInternal(
> > > BufferedDirectBufInputStream.java:212)
> > > ~[drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.util.filereader.
> > > BufferedDirectBufInputStream.read(BufferedDirectBufInputStream.java:
> > > 277) ~[drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.util.filereader.
> > > DirectBufInputStream.getNext(DirectBufInputStream.java:111)
> > > ~[drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.store.parquet.columnreaders.
> > > PageReader.readPage(PageReader.java:216)
> > > ~[drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.store.parquet.columnreaders.
> > > PageReader.nextInternal(PageReader.java:283)
> > > ~[drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.store.parquet.columnreaders.
> > > PageReader.next(PageReader.java:307)
> > > ~[drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.store.parquet.columnreaders.
> > > NullableColumnReader.processPages(NullableColumnReader.java:69)
> > > ~[drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.store.parquet.columnreaders.
> > > BatchReader.readAllFixedFieldsSerial(BatchReader.java:63)
> > > ~[drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.store.parquet.columnreaders.
> > > BatchReader.readAllFixedFields(BatchReader.java:56)
> > > ~[drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.store.parquet.columnreaders.
> > > BatchReader$FixedWidthReader.readRecords(BatchReader.java:143)
> > > ~[drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.store.parquet.columnreaders.
> > > BatchReader.readBatch(BatchReader.java:42)
> > > ~[drill-java-exec-1.11.0.jar:1.11.0]
> > >         at org.apache.drill.exec.store.parquet.columnreaders.
> > > ParquetRecordReader.next(ParquetRecordReader.java:297)
> > > ~[drill-java-exec-1.11.0.jar:1.11.0]
> > >         ... 61 common frames omitted
> > > Caused by: java.lang.IndexOutOfBoundsException: null
> > >         at java.nio.Buffer.checkBounds(Buffer.java:567)
> > > ~[na:1.8.0_121]
> > >         at java.nio.ByteBuffer.put(ByteBuffer.java:827)
> > > ~[na:1.8.0_121]
> > >         at java.nio.DirectByteBuffer.put(DirectByteBuffer.java:379)
> > > ~[na:1.8.0_121]
> > >         at org.apache.parquet.hadoop.util.CompatibilityUtil.getBuf(
> > > CompatibilityUtil.java:110)
> > > ~[parquet-hadoop-1.8.1-drill-r0.jar:1.8.1-drill-r0]
> > >         at org.apache.drill.exec.util.filereader.
> > > BufferedDirectBufInputStream.getNextBlock(
> > > BufferedDirectBufInputStream.java:182)
> > > ~[drill-java-exec-1.11.0.jar:1.11.0]
> > >         ... 73 common frames omitted
> > > 2017-10-12 04:19:50,506
> > > [2620da63-4efb-47e2-5e2c-29b48c0194c0:frag:0:0] INFO
> > > o.a.d.e.w.fragment.FragmentExecutor -
> > > 2620da63-4efb-47e2-5e2c-29b48c0194c0:0:0: State change requested
> > > RUNNING --> FAILED
> > > 2017-10-12 04:19:50,507
> > > [2620da63-4efb-47e2-5e2c-29b48c0194c0:frag:0:0] INFO
> > > o.a.d.e.w.fragment.FragmentExecutor -
> > > 2620da63-4efb-47e2-5e2c-29b48c0194c0:0:0: State change requested
> > > FAILED --> FINISHED
> > > 2017-10-12 04:19:50,533 [BitServer-2] WARN
> > > o.a.drill.exec.work.foreman.Foreman - Dropping request to move to
> > > COMPLETED state as query is already at FAILED state (which is
> > > terminal).
> > > 2017-10-12 04:19:50,533 [BitServer-2] WARN
> > > o.a.d.e.w.b.ControlMessageHandler - Dropping request to cancel
> > > fragment. 2620da63-4efb-47e2-5e2c-29b48c0194c0:0:0 does not exist.
> > >
> > >
> > >
> > > On Thu, Oct 12, 2017 at 1:49 PM, PROJJWAL SAHA <proj.saha@gmail.com>
> > > wrote:
> > >
> > > > Sure, I can try disabling the async Parquet reader.
> > > > Will this, however, impact the performance of queries on Parquet data?
> > > >
> > > > On Thu, Oct 12, 2017 at 9:39 AM, Kunal Khatua <kkhatua@mapr.com>
> > > > wrote:
> > > >
> > > >> If this resolves the issue, could you share some additional
> > > >> details, such as the metadata of the Parquet files, the OS, etc.?
> > > >> Details describing the setup are also very helpful in identifying
> > > >> what could be the cause of the error.
> > > >>
> > > >> We had observed some similar DATA_READ errors in the early
> > > >> iterations of the Async Parquet reader, but those have been
> > > >> resolved. I'm presuming you're already on the latest (i.e. Apache
> > > >> Drill 1.11.0).
> > > >>
> > > >> -----Original Message-----
> > > >> From: Arjun kr [mailto:arjun.kr@outlook.com]
> > > >> Sent: Wednesday, October 11, 2017 6:52 PM
> > > >> To: user@drill.apache.org
> > > >> Subject: Re: Exception while reading parquet data
> > > >>
> > > >>
> > > >> Can you try disabling the async Parquet reader to see if the
> > > >> problem gets resolved?
> > > >>
> > > >>
> > > >> alter session set `store.parquet.reader.pagereader.async`=false;
> > > >>
> > > >> Thanks,
> > > >>
> > > >> Arjun
> > > >>
> > > >>
> > > >> ________________________________
> > > >> From: PROJJWAL SAHA <proj.saha@gmail.com>
> > > >> Sent: Wednesday, October 11, 2017 2:20 PM
> > > >> To: user@drill.apache.org
> > > >> Subject: Exception while reading parquet data
> > > >>
> > > >> I get the exception below when querying Parquet data on Oracle
> > > >> Storage Cloud Service.
> > > >> Any pointers on what this points to?
> > > >>
> > > >> Regards,
> > > >> Projjwal
> > > >>
> > > >>
> > > >> ERROR o.a.d.e.u.f.BufferedDirectBufInputStream - Error reading
> > > >> from stream
> > > >> part-00006-25a9ae4b-fd9e-4770-b17e-9a29b270a4c2.parquet. Error
> > > >> was : null
> > > >> 2017-10-09 09:42:18,516 [scan-2] INFO
> > > >> o.a.d.e.s.p.c.AsyncPageReader - User Error Occurred: Exception
> > > >> occurred while reading from disk.
> > > >> (java.lang.IndexOutOfBoundsException)
> > > >> org.apache.drill.common.exceptions.UserException: DATA_READ ERROR:
> > > >> Exception occurred while reading from disk.
> > > >>
> > > >> File:
> > > >> /data25GB/storereturns/part-00006-25a9ae4b-fd9e-4770-b17e-9a29b270a4c2.parquet
> > > >> Column:  sr_return_time_sk
> > > >> Row Group Start:  479751
> > > >>
> > > >> [Error Id: 10680bb8-d1d6-43a1-b5e0-ef15bd8a9406 ]
> > > >> at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:550) ~[drill-common-1.11.0.jar:1.11.0]
> > > >> at org.apache.drill.exec.store.parquet.columnreaders.AsyncPageReader.handleAndThrowException(AsyncPageReader.java:185) [drill-java-exec-1.11.0.jar:1.11.0]
> > > >> at org.apache.drill.exec.store.parquet.columnreaders.AsyncPageReader.access$700(AsyncPageReader.java:82) [drill-java-exec-1.11.0.jar:1.11.0]
> > > >> at org.apache.drill.exec.store.parquet.columnreaders.AsyncPageReader$AsyncPageReaderTask.call(AsyncPageReader.java:461) [drill-java-exec-1.11.0.jar:1.11.0]
> > > >> at org.apache.drill.exec.store.parquet.columnreaders.AsyncPageReader$AsyncPageReaderTask.call(AsyncPageReader.java:381) [drill-java-exec-1.11.0.jar:1.11.0]
> > > >> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_121]
> > > >> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_121]
> > > >> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_121]
> > > >> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
> > > >> Caused by: java.io.IOException: java.lang.IndexOutOfBoundsException
> > > >> at org.apache.drill.exec.util.filereader.BufferedDirectBufInputStream.getNextBlock(BufferedDirectBufInputStream.java:185) ~[drill-java-exec-1.11.0.jar:1.11.0]
> > > >> at org.apache.drill.exec.util.filereader.BufferedDirectBufInputStream.readInternal(BufferedDirectBufInputStream.java:212) ~[drill-java-exec-1.11.0.jar:1.11.0]
> > > >> at org.apache.drill.exec.util.filereader.BufferedDirectBufInputStream.read(BufferedDirectBufInputStream.java:277) ~[drill-java-exec-1.11.0.jar:1.11.0]
> > > >> at org.apache.drill.exec.util.filereader.DirectBufInputStream.getNext(DirectBufInputStream.java:111) ~[drill-java-exec-1.11.0.jar:1.11.0]
> > > >> at org.apache.drill.exec.store.parquet.columnreaders.AsyncPageReader$AsyncPageReaderTask.call(AsyncPageReader.java:421) [drill-java-exec-1.11.0.jar:1.11.0]
> > > >> ... 5 common frames omitted
> > > >> Caused by: java.lang.IndexOutOfBoundsException: null
> > > >> at java.nio.Buffer.checkBounds(Buffer.java:567) ~[na:1.8.0_121]
> > > >> at java.nio.ByteBuffer.put(ByteBuffer.java:827) ~[na:1.8.0_121]
> > > >> at java.nio.DirectByteBuffer.put(DirectByteBuffer.java:379) ~[na:1.8.0_121]
> > > >> at org.apache.parquet.hadoop.util.CompatibilityUtil.getBuf(CompatibilityUtil.java:110) ~[parquet-hadoop-1.8.1-drill-r0.jar:1.8.1-drill-r0]
> > > >> at org.apache.drill.exec.util.filereader.BufferedDirectBufInputStream.getNextBlock(BufferedDirectBufInputStream.java:182) ~[drill-java-exec-1.11.0.jar:1.11.0]
> > > >> ... 9 common frames omitted
> > > >> 2017-10-09 09:42:20,533 [26248359-2fc8-d177-c3a6-507f6857e0ea:frag:2:3]
> > > >> INFO  o.a.d.e.w.fragment.FragmentExecutor -
> > > >> 26248359-2fc8-d177-c3a6-507f6857e0ea:2:3: State change requested
> > > >> AWAITING_ALLOCATION --> RUNNING
> > > >> 2017-10-09 09:42:20,533 [26248359-2fc8-d177-c3a6-507f6857e0ea:frag:2:3]
> > > >> INFO  o.a.d.e.w.f.FragmentStatusReporter -
> > > >> 26248359-2fc8-d177-c3a6-507f6857e0ea:2:3: State to report:
> > > >> RUNNING
> > > >> 2017-10-09 09:42:20,534 [26248359-2fc8-d177-c3a6-507f6857e0ea:frag:2:3]
> > > >> INFO  o.a.d.e.w.fragment.FragmentExecutor -
> > > >> 26248359-2fc8-d177-c3a6-507f6857e0ea:2:3: State change requested
> > > >> RUNNING --> CANCELLATION_REQUESTED
> > > >> 2017-10-09 09:42:20,534 [26248359-2fc8-d177-c3a6-507f6857e0ea:frag:2:3]
> > > >> INFO  o.a.d.e.w.f.FragmentStatusReporter -
> > > >> 26248359-2fc8-d177-c3a6-507f6857e0ea:2:3: State to report:
> > > >> CANCELLATION_REQUESTED
> > > >>
> > > >
> > > >
> > >
> >
>
