[ https://issues.apache.org/jira/browse/HIVE-19103?focusedWorklogId=446685&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446685 ]
ASF GitHub Bot logged work on HIVE-19103:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 16/Jun/20 16:54
Start Date: 16/Jun/20 16:54
Worklog Time Spent: 10m
Work Description: github-actions[bot] commented on pull request #330:
URL: https://github.com/apache/hive/pull/330#issuecomment-644886703
This pull request has been automatically marked as stale because it has not had recent
activity. It will be closed if no further activity occurs.
Feel free to reach out on the dev@hive.apache.org list if the patch is in need of reviews.
Issue Time Tracking
-------------------
Worklog Id: (was: 446685)
Remaining Estimate: 0h
Time Spent: 10m
> Nested structure Projection Push Down in Hive with ORC
> ------------------------------------------------------
>
> Key: HIVE-19103
> URL: https://issues.apache.org/jira/browse/HIVE-19103
> Project: Hive
> Issue Type: Improvement
> Components: Hive, ORC
> Reporter: Ashish Sharma
> Assignee: Ashish Sharma
> Priority: Critical
> Labels: pull-request-available
> Attachments: HIVE-19103.2.patch, HIVE-19103.3.patch, HIVE-19103.patch
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Read only the required columns from a nested struct schema.
> Example -
> *Current state* -
> Schema - struct<a:int, b:bigint,c:struct<d:int,e:struct<f:int>,g:string>>
> Query - select c.e.f from t where c.e.f > 10;
> Current state - the entire c struct is read from the file and then filtered, because only "hive.io.file.readcolumn.ids" is consulted, so all child columns of c are selected for reading.
> Conf -
> _hive.io.file.readcolumn.ids = "2"
> hive.io.file.readNestedColumn.paths = "c.e.f"_
> Result -
> boolean[ ] include = [true,false,false,true,true,true,true,true]
> *Expected state* -
> Schema - struct<a:int, b:bigint,c:struct<d:int,e:struct<f:int>,g:string>>
> Query - select c.e.f from t where c.e.f > 10;
> Expected state - instead of reading the entire c struct from the file, read only the f column (together with its enclosing structs c and e) by consulting "hive.io.file.readNestedColumn.paths".
> Conf -
> _hive.io.file.readcolumn.ids = "2"
> hive.io.file.readNestedColumn.paths = "c.e.f"_
> Result -
> boolean[ ] include = [true,false,false,true,false,true,true,false]
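The two include arrays above can be reproduced with a small sketch. This is not the code from the attached patches; it only illustrates the idea, assuming ORC-style pre-order column ids (0 for the root struct, then a=1, b=2, c=3, d=4, e=5, f=6, g=7) and assuming that "hive.io.file.readcolumn.ids" holds 0-based indexes of top-level fields. The names `SCHEMA`, `count_columns`, `mark_subtree`, `mark_path`, and `compute_include` are hypothetical helpers introduced for this illustration.

```python
# Illustrative sketch only; not the implementation in the HIVE-19103 patches.
# A struct is a dict of field name -> type; a primitive is just its type name.
SCHEMA = {
    "a": "int",
    "b": "bigint",
    "c": {"d": "int", "e": {"f": "int"}, "g": "string"},
}


def count_columns(node):
    """Number of column ids used by this node and all of its descendants."""
    if isinstance(node, str):
        return 1
    return 1 + sum(count_columns(child) for child in node.values())


def mark_subtree(node, col_id, include):
    """Mark a node and every descendant; return the next unused column id."""
    include[col_id] = True
    next_id = col_id + 1
    if isinstance(node, dict):
        for child in node.values():
            next_id = mark_subtree(child, next_id, include)
    return next_id


def mark_path(node, col_id, parts, include):
    """Mark the structs along a dotted path, then the whole subtree at its end."""
    include[col_id] = True
    if not parts:
        mark_subtree(node, col_id, include)
        return
    if not isinstance(node, dict):
        raise ValueError(f"cannot descend into primitive at {parts[0]!r}")
    child_id = col_id + 1
    for name, child in node.items():
        if name == parts[0]:
            mark_path(child, child_id, parts[1:], include)
            return
        child_id += count_columns(child)  # skip over this sibling's ids
    raise KeyError(parts[0])


def compute_include(schema, top_level_ids, nested_paths=()):
    """Build the include array from top-level ids and optional nested paths."""
    include = [False] * count_columns(schema)
    include[0] = True  # the root struct is always read
    names = list(schema)
    for idx in top_level_ids:
        name = names[idx]
        # Assumption: if no nested path starts at this field, read all of it.
        paths = [p for p in nested_paths if p.split(".")[0] == name]
        for path in paths or [name]:
            mark_path(schema, 0, path.split("."), include)
    return include


current = compute_include(SCHEMA, [2])              # only readcolumn.ids
expected = compute_include(SCHEMA, [2], ["c.e.f"])  # with nested paths
print(current)
print(expected)
```

Ignoring the nested paths marks every child of c, which matches the current result; honoring "c.e.f" marks only the root, c, e, and f, which matches the expected result.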
--
This message was sent by Atlassian Jira
(v8.3.4#803005)