hive-issues mailing list archives

From "Thejas M Nair (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-18822) INSERT VALUES - HoS + Steaming File Format
Date Thu, 01 Mar 2018 01:04:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-18822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16381342#comment-16381342
] 

Thejas M Nair commented on HIVE-18822:
--------------------------------------

This is not exactly what you are asking for, but FYI: with the [Streaming ingest feature (ACID)|https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest],
you can get more efficient inserts without running into the small files issue. However, it requires the ORC
file format, and it is not a SQL API.

> INSERT VALUES - HoS + Steaming File Format
> ------------------------------------------
>
>                 Key: HIVE-18822
>                 URL: https://issues.apache.org/jira/browse/HIVE-18822
>             Project: Hive
>          Issue Type: Improvement
>          Components: HiveServer2
>    Affects Versions: 3.0.0
>            Reporter: BELUGA BEHR
>            Priority: Minor
>
> Please optimize the INSERT VALUES function.  When HoS is being used, and a streaming
> format such as TEXT or AVRO is being used, INSERT VALUES statements should be quick.  The
> HiveServer2 should pass the values to the Executor and the Executor should simply append the
> data to an existing HDFS file instead of creating a new one.  This will reduce the number
> of small files that exist in the file system... or perhaps the HiveServer2 performs the append
> without having to first send the data to the processing engine at all.
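
To make the small-files concern in the description above concrete, here is a minimal sketch that uses the local filesystem as a stand-in for HDFS. It is not Hive's implementation: the function names, the `part-NNNNN.txt` file naming, and the comma-delimited row layout are all hypothetical, chosen only to contrast "one new file per INSERT" with "append to an existing file".

```python
import os
import tempfile


def insert_as_new_file(table_dir, rows, seq):
    """Write each INSERT to its own file (hypothetical naming); this is the
    pattern that leaves many small files behind."""
    path = os.path.join(table_dir, f"part-{seq:05d}.txt")
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(",".join(map(str, row)) + "\n")
    return path


def insert_by_append(table_dir, rows, name="part-00000.txt"):
    """Append rows to a single existing file, as the issue proposes for
    append-friendly formats such as delimited text."""
    path = os.path.join(table_dir, name)
    with open(path, "a", encoding="utf-8") as f:  # "a" creates the file on first use
        for row in rows:
            f.write(",".join(map(str, row)) + "\n")
    return path


def demo(n_inserts=5):
    """Run n single-row inserts both ways; return (files_new, files_append, rows_appended)."""
    with tempfile.TemporaryDirectory() as new_dir, \
            tempfile.TemporaryDirectory() as append_dir:
        for i in range(n_inserts):
            row = [(i, f"v{i}")]
            insert_as_new_file(new_dir, row, i)
            appended_path = insert_by_append(append_dir, row)
        with open(appended_path, encoding="utf-8") as f:
            appended_rows = f.read().splitlines()
        return len(os.listdir(new_dir)), len(os.listdir(append_dir)), appended_rows


if __name__ == "__main__":
    files_new, files_append, rows = demo(5)
    print(files_new, files_append, len(rows))
```

With the first approach the file count grows with the number of statements; with the second it stays at one while the row count grows. Whether a real HDFS append is safe here depends on the file format and on concurrent writers, which this sketch does not model.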



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
