drill-commits mailing list archives

From bridg...@apache.org
Subject [drill] branch gh-pages updated: edit docs
Date Fri, 31 May 2019 03:07:01 GMT
This is an automated email from the ASF dual-hosted git repository.

bridgetb pushed a commit to branch gh-pages
in repository https://gitbox.apache.org/repos/asf/drill.git


The following commit(s) were added to refs/heads/gh-pages by this push:
     new 11acd5f  edit docs
11acd5f is described below

commit 11acd5faa756a1b1be0519fc6335116d0ec793fb
Author: Bridget Bevens <bbevens@maprtech.com>
AuthorDate: Thu May 30 20:05:51 2019 -0700

    edit docs
---
 .../011-running-drill-on-docker.md                    | 15 ++++++---------
 .../025-optimizing-parquet-reading.md                 |  4 ++--
 _docs/sql-reference/sql-commands/009-analyze-table.md | 19 ++++++++++---------
 .../sql-commands/011-refresh-table-metadata.md        |  4 ++--
 _docs/sql-reference/sql-commands/021-create-schema.md |  6 ++++--
 5 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/_docs/install/installing-drill-in-embedded-mode/011-running-drill-on-docker.md
b/_docs/install/installing-drill-in-embedded-mode/011-running-drill-on-docker.md
index f3e49b2..c1c9de0 100644
--- a/_docs/install/installing-drill-in-embedded-mode/011-running-drill-on-docker.md
+++ b/_docs/install/installing-drill-in-embedded-mode/011-running-drill-on-docker.md
@@ -1,6 +1,6 @@
 ---
 title: "Running Drill on Docker"
-date: 2019-05-24
+date: 2019-05-31
 parent: "Installing Drill in Embedded Mode"
 ---  
 
@@ -10,19 +10,16 @@ Currently, you can only run Drill in embedded mode in a Docker container. Embedd
 
 ## Prerequisite  
 
-You must have the Docker client (version 18 or later) installed on your machine.  
+You must have the Docker client (version 18 or later) [installed on your machine](https://docs.docker.com/install/).
 
 
-- [Docker for Mac](https://www.docker.com/docker-mac)  
-- [Docker for Windows](https://www.docker.com/docker-windows)  
-- [Docker for Oracle Linux](https://www.docker.com/docker-oracle-linux)  
 
 ## Running Drill in a Docker Container  
 
-You can start and run a Docker container in “detached” mode or “foreground” mode. Foreground is the default mode. Foreground mode runs the Drill process in the container and attaches the console to Drill’s standard input, output, and standard error. Detached mode runs the container in the background.
+You can start and run a Docker container in detached mode or foreground mode. [Detached mode]({{site.baseurl}}/docs/running-drill-on-docker/#running-the-drill-docker-container-in-detached-mode) runs the container in the background. Foreground is the default mode. [Foreground mode]({{site.baseurl}}/docs/running-drill-on-docker/#running-the-drill-docker-container-in-foreground-mode) runs the Drill process in the container and attaches the console to Drill’s standard input, output, and stan [...]
 
-Whether you run the Docker container in detached or foreground mode, you start Drill in a container by issuing the docker run command with some options. 
+Whether you run the Docker container in detached or foreground mode, you start Drill in a container by issuing the `docker run` command with some options, as described in the following table: 
 
-The following table describes the options:  
+ 
 
 | Option | Description |
 |--------|-------------|
@@ -55,7 +52,7 @@ At the prompt, you can enter the following simple query to verify that Drill is
 
 Open a terminal window (Command Prompt or PowerShell, but not PowerShell ISE) and then issue the following commands and options to connect to SQLLine (the Drill shell):  
 
-**Note:** When you run the Drill Docker container in Detached mode, you connect to SQLLine (the Drill shell) using drill-localhost.  
+**Note:** When you run the Drill Docker container in detached mode, you connect to SQLLine (the Drill shell) using drill-localhost.  
 
        $ docker run -i --name drill-1.16.0 -p 8047:8047 --detach -t drill/apache-drill:1.16.0 /bin/bash
        <displays container ID>
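For comparison, a minimal foreground-mode counterpart to the detached example above (a sketch only: the image tag and flags mirror the detached command in this doc; nothing here beyond those is from the commit):

```shell
# Foreground mode (the default): omit --detach so the console stays
# attached to Drill's standard input, output, and standard error.
docker run -i --name drill-1.16.0 -p 8047:8047 -t drill/apache-drill:1.16.0 /bin/bash
```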
diff --git a/_docs/performance-tuning/025-optimizing-parquet-reading.md b/_docs/performance-tuning/025-optimizing-parquet-reading.md
index 65884af..a37fc53 100644
--- a/_docs/performance-tuning/025-optimizing-parquet-reading.md
+++ b/_docs/performance-tuning/025-optimizing-parquet-reading.md
@@ -1,11 +1,11 @@
 ---
 title: "Optimizing Parquet Metadata Reading"
-date: 2017-08-10 22:24:37 UTC
+date: 2019-05-31
 parent: "Performance Tuning"
 ---
 
 Parquet metadata caching is a feature that enables Drill to read a single metadata cache file instead of retrieving metadata from multiple Parquet files during the query-planning phase. 
-Parquet metadata caching is available for Parquet data in Drill 1.2 and later. To enable Parquet metadata caching, issue the REFRESH TABLE METADATA <path to table> command. When you run this command Drill generates a metadata cache file.  
+Parquet metadata caching is available for Parquet data in Drill 1.2 and later. To enable Parquet metadata caching, issue the [REFRESH TABLE METADATA]({{site.baseurl}}/docs/refresh-table-metadata/) <path to table> command. When you run this command, Drill generates a metadata cache file.  
 
 {% include startnote.html %}Parquet metadata caching does not benefit queries on Hive tables, HBase tables, or text files.{% include endnote.html %}  
 
diff --git a/_docs/sql-reference/sql-commands/009-analyze-table.md b/_docs/sql-reference/sql-commands/009-analyze-table.md
index 544f4a9..9251d0e 100644
--- a/_docs/sql-reference/sql-commands/009-analyze-table.md
+++ b/_docs/sql-reference/sql-commands/009-analyze-table.md
@@ -1,16 +1,17 @@
 ---
 title: "ANALYZE TABLE"
-date: 2019-05-02
+date: 2019-05-31
 parent: "SQL Commands"
 ---  
 
-Drill 1.16 and later supports the ANALYZE TABLE statement. The ANALYZE TABLE statement computes statistics on Parquet data stored in tables and directories. ANALYZE TABLE writes statistics to a JSON file in the `.stats.drill` directory, for example `/user/table1/.stats.drill/0_0.json`. The optimizer in Drill uses these statistics to estimate filter, aggregation, and join cardinalities and create more efficient query plans. 
+Drill 1.16 and later supports the ANALYZE TABLE statement. The ANALYZE TABLE statement computes statistics on Parquet data stored in tables and directories. The optimizer in Drill uses statistics to estimate filter, aggregation, and join cardinalities and create an optimal query plan.

+ANALYZE TABLE writes statistics to a JSON file in the `.stats.drill` directory, for example `/user/table1/.stats.drill/0_0.json`. 
 
-You can run the ANALYZE TABLE statement to calculate statistics for tables, columns, and directories with Parquet data; however, Drill will not use the statistics for query planning unless you enable the `planner.statistics.use` option, as shown:
+Drill will not use the statistics for query planning unless you enable the `planner.statistics.use` option, as shown:
 
 	SET `planner.statistics.use` = true;
 
-Alternatively, you can enable the option in the Drill Web UI at `http://<drill-hostname-or-ip>:8047/options`.
+Alternatively, you can enable the option in the Drill Web UI at `http://<drill-hostname-or-ip-address>:8047/options`.
 
 ## Syntax
 
@@ -41,15 +42,15 @@ An integer that specifies the percentage of data on which to compute statistics.
 
 ## Related Command  
 
-If you drop a table on which you have run ANALYZE TABLE, the statistics are automatically removed with the table:  
+If you drop a table that you have already run ANALYZE TABLE against, the statistics are automatically removed with the table:  
 
 	DROP TABLE [IF EXISTS] [workspace.]name  
 
-If you want to remove statistics for a table (and keep the table), you must remove the directory in which Drill stores the statistics:  
+To remove statistics for a table you want to keep, you must remove the directory in which Drill stores the statistics:  
 
 	DROP TABLE [IF EXISTS] [workspace.]name/.stats.drill  
 
-If you have already issued the ANALYZE TABLE statement against specific columns, a table, or directory, you must run the DROP TABLE statement with `/.stats.drill` before you can successfully run the ANALYZE TABLE statement against the data source again, for example:
+If you have already issued the ANALYZE TABLE statement against specific columns, a table, or a directory, you must run the DROP TABLE statement with `/.stats.drill` before you can successfully run the ANALYZE TABLE statement against the data source again, for example:
 
 	DROP TABLE `table_stats/Tpch0.01/parquet/customer/.stats.drill`;
 
@@ -146,10 +147,10 @@ For the predicate `"WHERE a = 5"`, in the example histogram above, you can see t
  
 Next, consider the range predicate `"WHERE a > 5 AND a <= 16"`.  The range spans part of bucket [1, 7] and entire buckets [8, 9], [10, 11] and [12, 16].  The total estimate = (7-5)/7 * 16 + 16 + 16 + 16 = 53 (approximately).  The actual count is 59.
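The estimate above can be checked directly (awk is used here purely for illustration; it is not part of the Drill docs). The partial bucket [1, 7] contributes the overlapping fraction of its 16 rows, and each fully covered bucket contributes all 16:

```shell
# Range estimate for WHERE a > 5 AND a <= 16 over equi-depth buckets
# of 16 rows each:
#   partial bucket [1, 7]            -> (7 - 5) / 7 * 16 rows
#   full buckets [8,9],[10,11],[12,16] -> 16 rows each
awk 'BEGIN { printf "%.0f\n", (7 - 5) / 7 * 16 + 16 + 16 + 16 }'
# prints 53
```

which matches the approximate estimate of 53 in the text (the actual count is 59).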
 
-**Viewing Histogram Statistics for a Column**
+**Viewing Histogram Statistics for a Column**  
 Histogram statistics are generated for each column, as shown:  
 
-qhistogram":{"category":"numeric-equi-depth","numRowsPerBucket":150,"buckets":[0.0,2.0,4.0,7.0,9.0,12.0,15.199999999999978,17.0,19.0,22.0,24.0]
+	"qhistogram":{"category":"numeric-equi-depth","numRowsPerBucket":150,"buckets":[0.0,2.0,4.0,7.0,9.0,12.0,15.199999999999978,17.0,19.0,22.0,24.0]}
 
 In this example, there are 10 buckets. Each bucket contains 150 rows, which is calculated as the number of rows (1500)/number of buckets (10). The list of numbers for the “buckets” property indicates bucket boundaries, with the first bucket starting at 0.0 and ending at 2.0. The end of the first bucket is the start point for the second bucket, such that the second bucket starts at 2.0 and ends at 4.0, and so on for the remainder of the buckets. 
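The per-bucket row count and the pairing of boundary values can be sketched the same way (awk for illustration only; the boundary values are shortened from the example):

```shell
# rows per bucket = total rows / number of buckets
awk 'BEGIN { print 1500 / 10 }'
# prints 150

# consecutive "buckets" values pair into ranges: each bucket's end is
# the next bucket's start
echo "0.0 2.0 4.0 7.0 9.0" |
awk '{ for (i = 1; i < NF; i++) printf "[%s, %s] ", $i, $(i + 1); print "" }'
# prints [0.0, 2.0] [2.0, 4.0] [4.0, 7.0] [7.0, 9.0]
```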
   
diff --git a/_docs/sql-reference/sql-commands/011-refresh-table-metadata.md b/_docs/sql-reference/sql-commands/011-refresh-table-metadata.md
index ebfbb28..c2ddca0 100644
--- a/_docs/sql-reference/sql-commands/011-refresh-table-metadata.md
+++ b/_docs/sql-reference/sql-commands/011-refresh-table-metadata.md
@@ -1,6 +1,6 @@
 ---
 title: "REFRESH TABLE METADATA"
-date: 2019-05-24
+date: 2019-05-31
 parent: "SQL Commands"
 ---
 Run the REFRESH TABLE METADATA command on Parquet tables and directories to generate a metadata cache file. REFRESH TABLE METADATA collects metadata from the footers of Parquet files and writes the metadata to a metadata file (`.drill.parquet_file_metadata.v4`) and a summary file (`.drill.parquet_summary_metadata.v4`). The planner uses the metadata cache file to prune extraneous data during the query planning phase. Run the REFRESH TABLE METADATA command if planning time is a significant [...]
@@ -97,7 +97,7 @@ These examples use a schema, `dfs.samples`, which points to the `/tmp` directory
 **Note:** You can access the sample `nation.parquet` file in the `sample-data` directory of your Drill installation.
 
  
-Change schemas to switch to `dfs.samples`: 
+Change to the `dfs.samples` schema: 
 
 	use dfs.samples;
 	+-------+------------------------------------------+
diff --git a/_docs/sql-reference/sql-commands/021-create-schema.md b/_docs/sql-reference/sql-commands/021-create-schema.md
index 7477a0d..260eb38 100644
--- a/_docs/sql-reference/sql-commands/021-create-schema.md
+++ b/_docs/sql-reference/sql-commands/021-create-schema.md
@@ -1,6 +1,6 @@
 ---
 title: "CREATE OR REPLACE SCHEMA"
-date: 2019-05-02
+date: 2019-05-31
 parent: "SQL Commands"
 ---
 
@@ -625,7 +625,9 @@ You can easily drop the schema for a table using the DROP SCHEMA [IF EXISTS] FOR
 ##Troubleshooting 
 
 **Schema defined as incorrect data type produces DATA_READ_ERROR**  
-Assume that you defined schema on the “name” column as integer, as shown:
+  
+Assume that you defined the schema for the “name” column as an integer, as shown:  
+  
 	create or replace schema (name int) for table dfs.tmp.`text_table`;
 	+------+-----------------------------------------+
 	|  ok  |                 summary                 |

