hive-issues mailing list archives

From "Harish Jaiprakash (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HIVE-21362) Add an input format and serde to read from protobuf files.
Date Fri, 01 Mar 2019 02:19:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-21362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16781212#comment-16781212 ]

Harish Jaiprakash edited comment on HIVE-21362 at 3/1/19 2:18 AM:
------------------------------------------------------------------

Thanks, [~jdere].

bq. Is there a RB for this somewhere?


I created one but forgot to link it: https://reviews.apache.org/r/70075

bq. Does ProtoMessageSerDe.createStructObjectInspector() need to handle repeated struct fields
like createObjectInspector() does?

It does. createStructObjectInspector() calls createObjectInspector() for each field, which
in turn calls createStructObjectInspector() for struct types.
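
To illustrate the mutual recursion described above, here is a minimal, self-contained sketch.
It is not the code from the patch: the Field class and the describeStruct()/describeField()
methods are hypothetical stand-ins, and the sketch builds a Hive-style type string instead of
real ObjectInspector instances. The point is only that a repeated struct field is covered
because the per-field method calls back into the struct method.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class InspectorRecursionSketch {

    /** Hypothetical field description; not a protobuf or Hive class. */
    static final class Field {
        final String name;
        final String primitiveType; // null when the field is a struct
        final List<Field> children; // empty when the field is primitive
        final boolean repeated;

        Field(String name, String primitiveType, List<Field> children, boolean repeated) {
            this.name = name;
            this.primitiveType = primitiveType;
            this.children = children;
            this.repeated = repeated;
        }
    }

    /** Plays the role of createStructObjectInspector(): walks every field of a struct. */
    static String describeStruct(List<Field> fields) {
        List<String> parts = new ArrayList<>();
        for (Field f : fields) {
            parts.add(f.name + ":" + describeField(f));
        }
        return "struct<" + String.join(",", parts) + ">";
    }

    /** Plays the role of createObjectInspector(): handles one field, recursing for structs. */
    static String describeField(Field f) {
        String type = (f.primitiveType != null)
                ? f.primitiveType
                : describeStruct(f.children); // struct types go back to the struct method
        return f.repeated ? "array<" + type + ">" : type;
    }

    public static void main(String[] args) {
        Field c = new Field("c", "int", Collections.<Field>emptyList(), false);
        Field b = new Field("b", null, Arrays.asList(c), true); // repeated struct field
        Field a = new Field("a", "string", Collections.<Field>emptyList(), false);
        System.out.println(describeStruct(Arrays.asList(a, b)));
    }
}
```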

bq. Also createObjectInspector() has a call to System.out.println(), please remove that or
convert to debug logging.

Will remove this; it was debugging code.

bq. I don't quite get the proto.maptypes property - is this just some special condition due
to the data format you are trying to read, or is this the only way to specify that a field
is of Map type? If the latter, doesn't the descriptor have a way to specify a map type?

The Proto 2 compiler does not support map types. The proto.maptypes property is a way to
configure conversion of a repeated struct(key, value) field into map<key, value>. That also
makes the data easier to process in Hive: no explode followed by a filter is required, and
several values can be extracted from the map without joins.
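
As a rough illustration of the conversion described above, the sketch below folds a list of
(key, value) entries into a map. It is an assumption-laden example, not the SerDe's code: the
Entry class and toMap() method are hypothetical, keys and values are fixed to String, and
letting a later duplicate key overwrite an earlier one is a choice made here, not something
stated for the patch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RepeatedStructToMap {

    /** Hypothetical stand-in for one decoded struct(key, value) element. */
    static final class Entry {
        final String key;
        final String value;

        Entry(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Folds the repeated entries into a map, preserving first-seen key order. */
    static Map<String, String> toMap(List<Entry> repeated) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Entry e : repeated) {
            // Assumption: a later duplicate key replaces the earlier value.
            result.put(e.key, e.value);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Entry> repeated = new ArrayList<>();
        repeated.add(new Entry("queue", "default"));
        repeated.add(new Entry("user", "hive"));

        Map<String, String> asMap = toMap(repeated);
        // Several values can be read directly by key, with no scan over the list.
        System.out.println(asMap.get("queue") + " " + asMap.get("user"));
    }
}
```

With the map form, a lookup by key replaces the explode-and-filter pattern that a repeated
struct would otherwise need.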



> Add an input format and serde to read from protobuf files.
> ----------------------------------------------------------
>
>                 Key: HIVE-21362
>                 URL: https://issues.apache.org/jira/browse/HIVE-21362
>             Project: Hive
>          Issue Type: Task
>          Components: HiveServer2
>            Reporter: Harish Jaiprakash
>            Assignee: Harish Jaiprakash
>            Priority: Critical
>         Attachments: HIVE-21362.01.patch
>
>
> Logs are being generated using the HiveProtoLoggingHook and Tez ProtoHistoryLoggingService.
> These are sequence files written using ProtobufMessageWritable.
> Implement a SerDe and input format to be able to create tables using these files.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
