hive-issues mailing list archives

From "Vihang Karajgaonkar (JIRA)" <>
Subject [jira] [Commented] (HIVE-17580) Remove dependency of get_fields_with_environment_context API to serde
Date Sun, 03 Dec 2017 09:56:00 GMT


Vihang Karajgaonkar commented on HIVE-17580:

[~sershe], [~alangates], [~owen.omalley], [~akolb] and I had an offline discussion by phone
about this, and we agreed on the following next steps:

1. Standalone-metastore should not have a compile-time dependency on SerDe and ObjectInspector.
To handle the Avro case, we will write a parser that uses the Avro API to parse the
file/url/property directly and return the FieldSchema without using the SerDe. This parser will
implement the interface StorageSchemaReader (see (3)). This means we may need to duplicate
some of the AvroSerde's logic to return List<FieldSchema> for the given schema.url.
This code cannot depend on OI, Deserializer or TypeInfo.

2. If the table uses one of the serdes listed in "hive.serdes.using.metastore.for.schema",
the metastore will return the FieldSchema from the DB (table.getSd().getCols()).
This is similar to what we have currently. In addition to the current implementation, we should
expand this list to include all the serdes defined in the Hive source code.

3. If a table/partition uses a custom serde, or a serde that is not in the config
listed in (2), standalone-metastore will use the interface StorageSchemaReader
to read the schema.
End-users are responsible for adding the jars to the metastore's classpath so that these custom
serdes work with the metastore through this interface.
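
The lookup order described in (1) to (3) can be sketched roughly as follows. This is only an
illustration under stated assumptions, not the actual patch: FieldSchema and Table here are
simplified stand-ins for the metastore classes, the single-argument readSchema signature is a
guess at what StorageSchemaReader might look like, and the "example.columns" property with its
toy "name:type" parser is hypothetical (it stands in for the Avro-based parser and does not use
the Avro API).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class SchemaResolutionSketch {

    /** Simplified stand-in for the metastore FieldSchema (name and type only). */
    static final class FieldSchema {
        final String name;
        final String type;
        FieldSchema(String name, String type) { this.name = name; this.type = type; }
        @Override public String toString() { return name + ":" + type; }
    }

    /** Hypothetical, simplified table: serde class, columns stored in the DB, properties. */
    static final class Table {
        final String serdeClass;
        final List<FieldSchema> storedCols;
        final Map<String, String> params;
        Table(String serdeClass, List<FieldSchema> storedCols, Map<String, String> params) {
            this.serdeClass = serdeClass;
            this.storedCols = storedCols;
            this.params = params;
        }
    }

    /** Assumed shape of the reader interface; the real signature may differ. */
    interface StorageSchemaReader {
        List<FieldSchema> readSchema(Table table);
    }

    /** Toy parser for "name:type,name:type"; it does not handle nested types with commas. */
    static List<FieldSchema> parseColumns(String spec) {
        List<FieldSchema> fields = new ArrayList<>();
        if (spec == null || spec.trim().isEmpty()) {
            return fields;
        }
        for (String column : spec.split(",")) {
            String[] parts = column.split(":", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException("Bad column spec: " + column);
            }
            fields.add(new FieldSchema(parts[0].trim(), parts[1].trim()));
        }
        return fields;
    }

    /**
     * Step (2): serdes in the configured list get their columns from the DB.
     * Step (3): every other serde is delegated to the pluggable reader.
     */
    static List<FieldSchema> getFields(Table table, Set<String> serdesUsingMetastore,
                                       StorageSchemaReader reader) {
        if (serdesUsingMetastore.contains(table.serdeClass)) {
            return table.storedCols;
        }
        return reader.readSchema(table);
    }

    public static void main(String[] args) {
        // Stand-in for the value of hive.serdes.using.metastore.for.schema.
        Set<String> configured = new HashSet<>(
                Arrays.asList("org.apache.hadoop.hive.ql.io.orc.OrcSerde"));
        // Reader that never touches SerDe, ObjectInspector or TypeInfo.
        StorageSchemaReader reader = t -> parseColumns(t.params.get("example.columns"));

        Table orc = new Table("org.apache.hadoop.hive.ql.io.orc.OrcSerde",
                Arrays.asList(new FieldSchema("id", "int")), Collections.emptyMap());
        Table custom = new Table("com.example.CustomSerDe", Collections.emptyList(),
                Collections.singletonMap("example.columns", "id:int,name:string"));

        System.out.println(getFields(orc, configured, reader));
        System.out.println(getFields(custom, configured, reader));
    }
}
```

Keeping the reader behind an interface is what would let the metastore load a custom
implementation from jars on its classpath at runtime, without a compile-time dependency on
hive-serde.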

> Remove dependency of get_fields_with_environment_context API to serde
> ---------------------------------------------------------------------
>                 Key: HIVE-17580
>                 URL:
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Standalone Metastore
>            Reporter: Vihang Karajgaonkar
>            Assignee: Vihang Karajgaonkar
> The {{get_fields_with_environment_context}} metastore API uses the {{Deserializer}} class to
> access the fields metadata for the cases where it is stored along with the data files (avro
> tables). The problem is that Deserializer classes are defined in the hive-serde module, and in
> order to make metastore independent of Hive we will have to remove this dependency (at least
> we should change it to a runtime dependency instead of a compile-time one).
> The other option is to investigate whether we can use SearchArgument to provide this functionality.

