hive-issues mailing list archives

Subject [jira] [Commented] (HIVE-20020) Hive contrib jar should not be in lib
Date Thu, 28 Jun 2018 13:12:00 GMT


BELUGA BEHR commented on HIVE-20020:

Just to echo what [~johndee] said regarding {{MultiDelimitSerDe}}: it is confusing
because, as it stands, the following scenarios exist:

# Create table with SerDe (/)
# Execute SELECT * FROM <table> LIMIT 10 (/)
# Execute SELECT * FROM <table> WHERE ... LIMIT 10 (x)

This is very confusing and inconsistent.  The first two operations succeed because they
do not require a MapReduce/Spark job: all of the work happens within HS2, which has the
hive-contrib JAR on its classpath.  The last one fails because it launches a MapReduce/Spark
job, and the JAR file is not shipped to the cluster along with that job.
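
For anyone trying to reproduce this, the three scenarios might look roughly like the sketch below. The table name, columns, delimiter, and JAR path are hypothetical placeholders, and whether the filtered query actually launches a job depends on the cluster's settings; the final {{ADD JAR}} is one way to register the JAR for the session so that it gets shipped with jobs.

```sql
-- Hypothetical table; only the SerDe class name comes from hive-contrib.
CREATE TABLE multi_delim_demo (id STRING, name STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe'
WITH SERDEPROPERTIES ("field.delim"="[,]");

-- Served by HS2 alone (no MapReduce/Spark job), so the SerDe is found
-- on the HS2 classpath.
SELECT * FROM multi_delim_demo LIMIT 10;

-- Per the scenario above, this launches a job; the tasks cannot load
-- the SerDe because hive-contrib was never shipped to the cluster.
SELECT * FROM multi_delim_demo WHERE id = '1' LIMIT 10;

-- Assumed workaround: register the JAR for this session (placeholder
-- path), then rerun the filtered query.
ADD JAR /path/to/hive-contrib-<version>.jar;
SELECT * FROM multi_delim_demo WHERE id = '1' LIMIT 10;
```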

> Hive contrib jar should not be in lib
> -------------------------------------
>                 Key: HIVE-20020
>                 URL:
>             Project: Hive
>          Issue Type: Improvement
>          Components: Contrib
>            Reporter: Johndee Burks
>            Priority: Trivial
> Currently, Hive is packaged with hive-contrib-<version>.jar in lib. We should not include
> it there because it is picked up by services like HS2. This creates a situation in which
> experimental features such as the [MultiDelimitSerDe|] are accessible without an
> understanding of how to properly install and use them. For example, you can create a table
> using HS2 via beeline with the aforementioned SerDe, and it will work as long as you do not
> run M/R jobs. The M/R jobs do not work because the SerDe is not in aux and so does not get
> shipped into the distcache. I propose that we not package it this way; anyone who would
> like to use an experimental feature can add it manually to their environment.
