drill-commits mailing list archives

From bridg...@apache.org
Subject [drill] branch gh-pages updated: Updates for new maprdb format plugin options, rn update
Date Fri, 24 May 2019 22:53:20 GMT
This is an automated email from the ASF dual-hosted git repository.

bridgetb pushed a commit to branch gh-pages
in repository https://gitbox.apache.org/repos/asf/drill.git


The following commit(s) were added to refs/heads/gh-pages by this push:
     new a7f649c  Updates for new maprdb format plugin options, rn update
a7f649c is described below

commit a7f649cd2c193395b87ad3eb9c0d3bf85ce667e7
Author: Bridget Bevens <bbevens@maprtech.com>
AuthorDate: Fri May 24 15:52:23 2019 -0700

    Updates for new maprdb format plugin options, rn update
---
 .../plugins/095-mapr-db-format.md                  |  6 ++-
 .../011-running-drill-on-docker.md                 | 14 +++---
 _docs/rn/005-1.16.0-rn.md                          |  4 +-
 .../sql-commands/011-refresh-table-metadata.md     |  7 ++-
 .../sql-functions/020-data-type-conversion.md      | 54 +++++++++++++++++++---
 5 files changed, 67 insertions(+), 18 deletions(-)

diff --git a/_docs/connect-a-data-source/plugins/095-mapr-db-format.md b/_docs/connect-a-data-source/plugins/095-mapr-db-format.md
index a1a0287..a2719ca 100644
--- a/_docs/connect-a-data-source/plugins/095-mapr-db-format.md
+++ b/_docs/connect-a-data-source/plugins/095-mapr-db-format.md
@@ -1,6 +1,6 @@
 ---
 title: "MapR-DB Format"
-date: 2018-06-26 00:42:18 UTC
+date: 2019-05-24
 parent: "Connect a Data Source"
 ---
 
@@ -16,6 +16,8 @@ Instead of including the name of a file, you include the table name in the
query
 
        SELECT * FROM mfs.`/users/max/mytable`;   
 
-**Note:** Starting in Drill 1.14, the MapR Drill installation package includes a hive-maprdb-json-handler,
which enables you to create Hive external tables from MapR-DB JSON tables and then query the
tables using the Hive schema. Drill can use the native Drill reader to read the Hive external
tables. The native Drill reader enables Drill to perform faster reads of data and apply filter
pushdown optimizations. The hive-maprdb-json-handler is not included in the Apache Drill installation
package.
+Starting in Drill 1.14, the MapR Drill installation package includes a hive-maprdb-json-handler,
which enables you to create Hive external tables from MapR-DB JSON tables and then query the
tables using the Hive schema. Drill can use the native Drill reader to read the Hive external
tables. The native Drill reader enables Drill to perform faster reads of data and apply filter
pushdown optimizations. The hive-maprdb-json-handler is not included in the Apache Drill installation
package.  
+
+Starting in Drill 1.16, you can include the `readTimestampWithZoneOffset` option in the maprdb
format plugin configuration. When enabled (set to 'true'), Drill converts timestamp values
from UTC to local time zone when reading the values from MapR Database. The option is disabled
by default and does not impact the `store.hive.maprdb_json.read_timestamp_with_timezone_offset`
setting.  
 
 
diff --git a/_docs/install/installing-drill-in-embedded-mode/011-running-drill-on-docker.md
b/_docs/install/installing-drill-in-embedded-mode/011-running-drill-on-docker.md
index 471999d..f3e49b2 100644
--- a/_docs/install/installing-drill-in-embedded-mode/011-running-drill-on-docker.md
+++ b/_docs/install/installing-drill-in-embedded-mode/011-running-drill-on-docker.md
@@ -1,12 +1,12 @@
 ---
 title: "Running Drill on Docker"
-date: 2019-05-02
+date: 2019-05-24
 parent: "Installing Drill in Embedded Mode"
 ---  
 
-Starting in Drill 1.14, you can run Drill in a [Docker container](https://www.docker.com/what-container#/package_software).
Running Drill in a container is the simplest way to start using Drill; all you need is the
Docker client installed on your machine. You simply run a Docker command, and your Docker
client downloads the Drill Docker image from the apache-drill repository on [Docker Hub](https://docs.docker.com/docker-hub/)
and then brings up a container with Apache Drill  running in  [...]
+Starting in Drill 1.14, you can run Drill in a [Docker container](https://www.docker.com/what-container#/package_software).
Running Drill in a container is the simplest way to start using Drill; all you need is the
Docker client installed on your machine. You simply run a Docker command, and your Docker
client downloads the Drill Docker image from the apache-drill repository on [Docker Hub](https://docs.docker.com/docker-hub/)
and brings up a container with Apache Drill running in embedd [...]
 
-**Note:** Currently, you can only run Drill in embedded mode in a Docker container. Embedded
mode is when a single instance of Drill runs on a node or in a container. You do not have
to perform any configuration tasks when Drill runs in embedded mode.  
+Currently, you can only run Drill in embedded mode in a Docker container. Embedded mode is
when a single instance of Drill runs on a node or in a container. You do not have to perform
any configuration tasks when Drill runs in embedded mode.  
 
 ## Prerequisite  
 
@@ -30,12 +30,12 @@ The following table describes the options:
 | `-t`                           | Allocates a pseudo-tty (a shell).                    
                                                                                         
                                                                                         
                                                                               |
 | `--name`                       | Identifies the container. If you do not use this   option
to identify a name for the container, the daemon generates a container ID for you. When you
use this option to identify a container name,   you can use the name to reference the container
within a Docker network in   foreground or detached mode.  |
 | `-p`                           | The TCP port for the Drill Web UI. If needed, you can
  change this port using the `drill.exec.http.port` [start-up option]({{site.baseurl}}/docs/start-up-options/).
                                                                                         
                                                                                         
            |
-| `drill/apache-drill:<version>` | The Docker Hub repository and tag. In the following
  example, `drill/apache-drill` is   the repository and `1.15.0`   is the tag:     `drill/apache-drill:1.16.0`
    The tag correlates with the version of Drill. When a new version of Drill   is available,
you can use the new version as the tag.                           |
+| `drill/apache-drill:<version>` | The Docker Hub repository and tag. In the following
  example, `drill/apache-drill` is   the repository and `1.16.0`   is the tag:     `drill/apache-drill:1.16.0`
    The tag correlates with the version of Drill. When a new version of Drill   is available,
you can use the new version as the tag.                           |
 | `bin/bash`                     | Connects to the Drill container using a bash shell.  
                                                                                         
                                                                                         
                                                               |  
 
 ### Running the Drill Docker Container in Foreground Mode  
 
-Open a terminal window (Command Prompt or PowerShell, but not PowerShell ISE) and then issue
the following command and opitons to connect to SQLLine (the Drill shell):   
+Open a terminal window (Command Prompt or PowerShell, but not PowerShell ISE) and then issue
the following command and options to connect to SQLLine (the Drill shell):   
 
        docker run -i --name drill-1.16.0 -p 8047:8047 -t drill/apache-drill:1.16.0 /bin/bash
 
 
@@ -43,7 +43,7 @@ When you issue the docker run command, the Drill process starts in a container.
 
        Jun 29, 2018 3:28:21 AM org.glassfish.jersey.server.ApplicationHandler initialize
        INFO: Initiating Jersey application, version Jersey: 2.8 2014-04-29 01:25:26...
-       apache drill 1.15.0 
+       apache drill 1.16.0 
        "json ain't no thang"
        0: jdbc:drill:zk=local>  
 
@@ -67,7 +67,7 @@ Open a terminal window (Command Prompt or PowerShell, but not PowerShell
ISE) an
 
 After you issue the commands, the Drill process starts in a container. SQLLine prints a message,
and the prompt appears:  
 
-       apache drill 1.15.0 
+       apache drill 1.16.0 
        "json ain't no thang"
        0: jdbc:drill:drillbit=localhost>  
 
diff --git a/_docs/rn/005-1.16.0-rn.md b/_docs/rn/005-1.16.0-rn.md
index 4f04272..ac07ab7 100644
--- a/_docs/rn/005-1.16.0-rn.md
+++ b/_docs/rn/005-1.16.0-rn.md
@@ -20,7 +20,9 @@ This release of Drill provides the following new features and improvements:
 - [Format plugin for LTSV files]({{site.baseurl}}/docs/ltsv-format-plugin/) ([DRILL-7014](https://issues.apache.org/jira/browse/DRILL-7014))
 
 - Ability to query Hive views, like querying Hive tables in a hive schema, for example `SELECT
* FROM hive.`hive_view`; ([DRILL-540](https://issues.apache.org/jira/browse/DRILL-540))
 - [Upgrade to SQLLine 1.7]({{site.baseurl}}/docs/configuring-the-drill-shell/) changes the
default prompt to `apache drill (schema_name)>` or you can define a custom prompt using
the command `!set prompt <new-prompt>`. ([DRILL-6989](https://issues.apache.org/jira/browse/DRILL-6989))

-- Calcite updated to version 1.18.0 ([DRILL-6862](https://issues.apache.org/jira/browse/DRILL-6862))
   
+- Calcite updated to version 1.18.0 ([DRILL-6862](https://issues.apache.org/jira/browse/DRILL-6862))
  
+- A new maprdb format plugin option, `readTimestampWithZoneOffset`, converts timestamp values
from UTC to local time zone when values are read from MapR Database. This option is disabled
by default. ([DRILL-6969](https://issues.apache.org/jira/browse/DRILL-6969))  
+- A new Drill configuration option, `store.hive.maprdb_json.read_timestamp_with_timezone_offset`,
enables Drill to read timestamp values with a timezone offset when using the hive plugin with
the Drill native MaprDB JSON reader enabled. This option is disabled by default. ([DRILL-6969](https://issues.apache.org/jira/browse/DRILL-6969))
 
 - Several Drill Web UI improvements, including:
 	- [Storage plugin management improvements](https://drill.apache.org/docs/configuring-storage-plugins/#exporting-storage-plugin-configurations)
([DRILL-6562](https://issues.apache.org/jira/browse/DRILL-6562))  
 	- [Query progress indicators and warnings ]({{site.baseurl}}/docs/query-profiles/#query-profile-warnings)
([DRILL-6879](https://issues.apache.org/jira/browse/DRILL-6879))
diff --git a/_docs/sql-reference/sql-commands/011-refresh-table-metadata.md b/_docs/sql-reference/sql-commands/011-refresh-table-metadata.md
index 3e71ebc..ebfbb28 100644
--- a/_docs/sql-reference/sql-commands/011-refresh-table-metadata.md
+++ b/_docs/sql-reference/sql-commands/011-refresh-table-metadata.md
@@ -1,6 +1,6 @@
 ---
 title: "REFRESH TABLE METADATA"
-date: 2019-04-30
+date: 2019-05-24
 parent: "SQL Commands"
 ---
 Run the REFRESH TABLE METADATA command on Parquet tables and directories to generate a metadata
cache file. REFRESH TABLE METADATA collects metadata from the footers of Parquet files and
writes the metadata to a metadata file (`.drill.parquet_file_metadata.v4`) and a summary file
(`.drill.parquet_summary_metadata.v4`). The planner uses the metadata cache file to prune
extraneous data during the query planning phase. Run the REFRESH TABLE METADATA command if
planning time is a significant [...]
@@ -69,7 +69,10 @@ Enables filter pushdown optimization for Parquet files. Drill reads the
file met
 Sets the number of row groups that a table can have. You can increase the threshold if the
filter can prune many row groups. However, if this setting is too high, the filter evaluation
overhead increases. Base this setting on the data set. Reduce this setting if the planning
time is significant or you do not see any benefit at runtime. Default is 10000.  (Drill 1.9+)
 
 
 ## Limitations
-Currently, Drill does not support runtime rowgroup pruning. 
+
+
+- Drill does not support runtime rowgroup pruning.  
+- REFRESH TABLE METADATA does not count null values for decimal, varchar, and interval data
types.
 
 
 ## Examples  
diff --git a/_docs/sql-reference/sql-functions/020-data-type-conversion.md b/_docs/sql-reference/sql-functions/020-data-type-conversion.md
index 675ebc3..589f1bc 100644
--- a/_docs/sql-reference/sql-functions/020-data-type-conversion.md
+++ b/_docs/sql-reference/sql-functions/020-data-type-conversion.md
@@ -1,6 +1,6 @@
 ---
 title: "Data Type Conversion"
-date: 2019-02-19
+date: 2019-05-24
 parent: "SQL Functions"
 ---
 Drill supports the following functions for casting and converting data types:
@@ -10,7 +10,7 @@ Drill supports the following functions for casting and converting data types:
 * [STRING_BINARY]({{ site.baseurl }}/docs/data-type-conversion/#string_binary-function) and
[BINARY_STRING]({{ site.baseurl }}/docs/data-type-conversion/#binary_string-function)
 * [Other Data Type Conversions]({{ site.baseurl }}/docs/data-type-conversion/#other-data-type-conversions)
 
 
-**Note:** Starting in Drill 1.15, all cast and data type conversion functions return null
for an empty string ('') when the `drill.exec.functions.cast_empty_string_to_null` option
is enabled, for example:    
+Starting in Drill 1.15, all cast and data type conversion functions return null for an empty
string ('') when the `drill.exec.functions.cast_empty_string_to_null` option is enabled, for
example:    
 
 	SELECT CAST('' AS DATE), TO_TIMESTAMP('', 'yyyy-MM-dd HH:mm:ss') FROM (VALUES(2));
 	+---------+---------+
@@ -897,10 +897,50 @@ Convert a UTC date to a timestamp offset from the UTC time zone code.
     +------------------------+---------+
     | 2015-03-30 20:49:00.0  | UTC     |
     +------------------------+---------+
-    1 row selected (0.148 seconds)
+    1 row selected (0.148 seconds)  
 
-## Time Zone Limitation
-Currently Drill does not support conversion of a date, time, or timestamp from one time zone
to another. Queries of data associated with a time zone can return inconsistent results or
an error. For more information, see the ["Understanding Drill's Timestamp and Timezone"](http://www.openkb.info/2015/05/understanding-drills-timestamp-and.html#.VUzhotpVhHw)
blog. The Drill time zone is based on the operating system time zone unless you override it.
To work around the limitation, configure  [...]
+## Enabling Time Zone Offset   
+
+Starting in Drill 1.16, the `store.hive.maprdb_json.read_timestamp_with_timezone_offset`
option enables Drill to read timestamp values with a timezone offset when using the hive plugin
with the Drill native MaprDB JSON reader enabled through the  `store.hive.maprdb_json.optimize_scan_with_native_reader
option`. The `store.hive.maprdb_json.read_timestamp_with_timezone_offset` option is disabled
(set to 'false') by default. You can enable this option from the Options page in the Drill
Web  [...]
+
+**Important**  
+Internally, Drill stores timestamp values in UTC format, for example 2018-01-01T20:12:12.123Z.
When you enable the timezone offset option, select on a table returns different timestamp
values. If you filter on timestamp values when this option is enabled, you must include the
new timestamp value in the filter condition. 
+
+For example, look at the timestamp values when the `store.hive.maprdb_json.read_timestamp_with_timezone_offset`
option is disabled (set to 'false'):   
+
+
+	select * from dfs.`/tmp/timestamp`;
+	-------------------------------------------------------
+	_id	datestring	datetimestamp
+	-------------------------------------------------------
+	1	2018-01-01 12:12:12.123	2018-01-01 20:12:12.123
+	2	9999-12-31 23:59:59.999	10000-01-01 07:59:59.999
+	-------------------------------------------------------  
+
+When the option is enabled (set to 'true'), you can see the difference in the timestamp values
returned:  
+
+	select * from dfs.`/tmp/timestamp`;
+	------------------------------------------------------
+	_id	datestring	datetimestamp
+	------------------------------------------------------
+	1	2018-01-01 12:12:12.123	2018-01-01 12:12:12.123
+	2	9999-12-31 23:59:59.999	9999-12-31 23:59:59.999
+	------------------------------------------------------  
+
+When the option is enabled, queries that filter on timestamp values must include the new
timestamp value in the filter condition, as shown:  
+
+	select * from dfs.`/tmp/timestamp` where datetimestamp=timestamp '2018-01-01 12:12:12.123';
+	------------------------------------------------------
+	_id	datestring	datetimestamp
+	------------------------------------------------------
+	1	2018-01-01 12:12:12.123	2018-01-01 12:12:12.123
+	------------------------------------------------------  
+
+Notice that the WHERE clause uses the `2018-01-01 12:12:12.123` format versus the `2018-01-01
20:12:12.123` format.
+
+## Time Zone Limitation  
+
+Drill does not support conversion of a date, time, or timestamp from one time zone to another.
Queries of data associated with a time zone can return inconsistent results or an error. For
more information, see the ["Understanding Drill's Timestamp and Timezone"](http://www.openkb.info/2015/05/understanding-drills-timestamp-and.html#.VUzhotpVhHw)
blog. The Drill time zone is based on the operating system time zone unless you override it.
To work around the limitation, configure Drill to u [...]
 
 1. Take a look at the Drill time zone configuration by running the TIMEOFDAY function or
by querying the system.options table. This TIMEOFDAY function returns the local date and time
with time zone information. 
 
@@ -941,7 +981,9 @@ You can use the ‘z’ option to identify the time zone in TO_TIMESTAMP
to make
     +------------------------+-----------+
     | 2015-03-30 20:49:00.0  | UTC       |
     +------------------------+-----------+
-    1 row selected (0.097 seconds)
+    1 row selected (0.097 seconds)  
+
+
 
 <!-- DRILL-448 Support timestamp with time zone -->
 

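As an illustration only, and not part of the commit above, the UTC-to-local conversion described in the new "Enabling Time Zone Offset" section can be sketched outside Drill. The sketch assumes a fixed local offset of UTC-8, which is inferred from the sample rows in the diff (20:12:12.123 becoming 12:12:12.123). The names `LOCAL_ZONE` and `to_local` are hypothetical and are not Drill APIs.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical fixed offset standing in for the local time zone.
# UTC-8 is an assumption inferred from the sample rows in the diff.
LOCAL_ZONE = timezone(timedelta(hours=-8))


def to_local(utc_text: str) -> str:
    """Parse a UTC timestamp string and render it in LOCAL_ZONE."""
    utc_value = datetime.strptime(utc_text, "%Y-%m-%d %H:%M:%S.%f")
    utc_value = utc_value.replace(tzinfo=timezone.utc)
    local_value = utc_value.astimezone(LOCAL_ZONE)
    # Trim microseconds to milliseconds to match the three-digit
    # fraction used in the sample query output.
    return local_value.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3]


if __name__ == "__main__":
    # Value shown with the option disabled (stored UTC value).
    print(to_local("2018-01-01 20:12:12.123"))
```

Under that assumed offset, the converted value is the one a WHERE clause would have to use once the option is enabled, which is why the filter example in the diff uses `2018-01-01 12:12:12.123`.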
