drill-commits mailing list archives

From bridg...@apache.org
Subject drill-site git commit: add content to doc for DRILL-DRILL-5379
Date Wed, 09 Aug 2017 21:10:08 GMT
Repository: drill-site
Updated Branches:
  refs/heads/asf-site 767776a7e -> 21aafae8a

add content to doc for DRILL-DRILL-5379

Project: http://git-wip-us.apache.org/repos/asf/drill-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/drill-site/commit/21aafae8
Tree: http://git-wip-us.apache.org/repos/asf/drill-site/tree/21aafae8
Diff: http://git-wip-us.apache.org/repos/asf/drill-site/diff/21aafae8

Branch: refs/heads/asf-site
Commit: 21aafae8a6ef4766431b03f45c213dd59f682531
Parents: 767776a
Author: Bridget Bevens <bbevens@maprtech.com>
Authored: Wed Aug 9 14:09:54 2017 -0700
Committer: Bridget Bevens <bbevens@maprtech.com>
Committed: Wed Aug 9 14:09:54 2017 -0700

 docs/parquet-format/index.html | 17 ++++++++++-------
 feed.xml                       |  4 ++--
 2 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/docs/parquet-format/index.html b/docs/parquet-format/index.html
index d4ff57a..9f689fe 100644
--- a/docs/parquet-format/index.html
+++ b/docs/parquet-format/index.html
@@ -1124,7 +1124,7 @@
-     Mar 27, 2017
+     Aug 9, 2017
     <link href="/css/docpage.css" rel="stylesheet" type="text/css">
@@ -1184,8 +1184,7 @@
 <p>Use the ALTER command to set the <code>store.format</code> option. 
-<p><code>ALTER SESSION SET `store.format` = &#39;parquet&#39;;</code><br>
-<code>ALTER SYSTEM SET `store.format` = &#39;parquet&#39;;</code>  </p>
+<p><code>ALTER SYSTEM|SESSION SET `store.format` = &#39;parquet&#39;;</code>
 <h3 id="configuring-the-size-of-parquet-files">Configuring the Size of Parquet Files</h3>
@@ -1193,13 +1192,17 @@
 <p>The larger the block size, the more memory Drill needs for buffering data. Parquet files that contain a single block maximize the amount of data Drill stores contiguously on disk. Given a single row group per file, Drill stores the entire Parquet file onto the block, avoiding network I/O.</p>
-<p>To maximize performance, set the target size of a Parquet row group to the number of bytes less than or equal to the block size of MFS, HDFS, or the file system by using the <code>store.parquet.block-size</code>:  </p>
+<p>To maximize performance, set the target size of a Parquet row group to the number of bytes less than or equal to the block size of MFS, HDFS, or the file system using the <code>store.parquet.block-size</code> option, as shown:  </p>
-<p><code>ALTER SESSION SET `store.parquet.block-size` = 536870912;</code><br>
-<code>ALTER SYSTEM SET `store.parquet.block-size` = 536870912</code>  </p>
+<p><code>ALTER SYSTEM|SESSION SET `store.parquet.block-size` = 536870912;</code>
-<p>The default block size is 536870912 bytes.</p>
+<p>The default block size is 536870912 bytes.  </p>
+<h3 id="configuring-the-hdfs-block-size-for-parquet-files">Configuring the HDFS Block Size for Parquet Files</h3>
+<p>Drill 1.11 introduces the <code>store.parquet.writer.use_single_fs_block</code> option, which enables Drill to write a Parquet file as a single file system block without changing the default file system block size. Query performance improves when Drill reads Parquet files as a single block on the file system. When the <code>store.parquet.writer.use_single_fs_block</code> option is enabled, the <code>store.parquet.block-size</code> setting determines the block size of the Parquet files created. The default setting for the <code>store.parquet.writer.use_single_fs_block</code> option is &#39;false&#39;. Use the SET command to enable or disable the option, as shown:  </p>
+<div class="highlight"><pre><code class="language-text" data-lang="text">ALTER SYSTEM|SESSION SET store.parquet.writer.use_single_fs_block = &#39;true|false&#39;;
 <h3 id="type-mapping">Type Mapping</h3>
 <p>The high correlation between Parquet and SQL data types makes reading Parquet files effortless in Drill. Writing to Parquet files takes more work than reading. Because SQL does not support all Parquet data types, to prevent Drill from inferring a type other than one you want, use the <a href="/docs/data-type-conversion/#cast">cast function</a> Drill offers more liberal casting capabilities than SQL for Parquet conversions if the Parquet data is of a logical type. </p>

diff --git a/feed.xml b/feed.xml
index 95d15da..3740c8a 100644
--- a/feed.xml
+++ b/feed.xml
@@ -6,8 +6,8 @@
     <atom:link href="/feed.xml" rel="self" type="application/rss+xml"/>
-    <pubDate>Tue, 08 Aug 2017 14:32:01 -0700</pubDate>
-    <lastBuildDate>Tue, 08 Aug 2017 14:32:01 -0700</lastBuildDate>
+    <pubDate>Wed, 09 Aug 2017 14:08:05 -0700</pubDate>
+    <lastBuildDate>Wed, 09 Aug 2017 14:08:05 -0700</lastBuildDate>
     <generator>Jekyll v2.5.2</generator>
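
Usage note (not part of the commit): a minimal sketch of how the options documented in the parquet-format diff above might be combined in one Drill session. The option names and the 536870912-byte value come from the diff. The dfs.tmp workspace, the table name, and the source path are hypothetical placeholders. The option name is written in backticks and the boolean unquoted, on the assumption that Drill expects dotted option names to be backtick-quoted and treats this option as a boolean.

```sql
-- Write new tables in Parquet format for this session.
ALTER SESSION SET `store.format` = 'parquet';

-- Target Parquet row group size in bytes (the documented default).
ALTER SESSION SET `store.parquet.block-size` = 536870912;

-- Drill 1.11 and later: ask the writer to create each Parquet file
-- as a single file system block.
ALTER SESSION SET `store.parquet.writer.use_single_fs_block` = true;

-- Hypothetical CTAS: workspace, table name, and source path are placeholders.
CREATE TABLE dfs.tmp.`sample_parquet` AS
SELECT * FROM dfs.`/path/to/source.json`;
```

With the single-block option enabled, the doc text says store.parquet.block-size determines the block size of the files written, so the two settings are meant to be chosen together.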
