drill-dev mailing list archives

From "Jacques Nadeau (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (DRILL-1792) store.parquet.vector_fill_check_threshold is too high
Date Sun, 04 Jan 2015 23:08:34 GMT

     [ https://issues.apache.org/jira/browse/DRILL-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacques Nadeau resolved DRILL-1792.
    Resolution: Invalid

Due to the nature of some Parquet files, you'll need to lower this setting.  To do so,
use the ALTER SESSION or ALTER SYSTEM command.  See here for how to change the setting: https://cwiki.apache.org/confluence/display/DRILL/SQL+Commands+Summary

I don't remember what the default is offhand but you can view it by querying select * from

I think setting it to 1 is the most conservative choice.
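
For illustration, the change would look roughly like the following. This is a sketch that assumes Drill's usual ALTER ... SET syntax with the option name quoted in backticks; the value 1 is only the conservative choice mentioned above, not a tuned recommendation, so adjust it for your data.

```sql
-- Lower the threshold for the current session only (assumed value: 1).
ALTER SESSION SET `store.parquet.vector_fill_check_threshold` = 1;

-- Or lower it for every session (system-wide).
ALTER SYSTEM SET `store.parquet.vector_fill_check_threshold` = 1;
```

ALTER SESSION only affects the connection that issues it, so it is the safer way to test whether a lower value lets the query complete before changing the option system-wide.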

> store.parquet.vector_fill_check_threshold is too high
> -----------------------------------------------------
>                 Key: DRILL-1792
>                 URL: https://issues.apache.org/jira/browse/DRILL-1792
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Client - CLI
>    Affects Versions: 0.6.0
>         Environment: Linux, CentOS 6 latest, MapR 4.0.1
>            Reporter: hy5446
> I'm trying out some queries against parquet records. My query should return about 18 rows out of 2M. But:
> 0: jdbc:drill:> select * from dfs.`myfolder` as t where t.foo.bar = `foo bar`;
> /// headers here
> Query failed: Failure while running fragment., The setting for `store.parquet.vector_fill_check_threshold` is too high for your Parquet records. Please set a lower check threshold and retry your query.
> I'm not sure how to proceed - there does not seem to be a lot of documentation about this. What does that variable mean? What value should it be set to? And using what command?

This message was sent by Atlassian JIRA