spark-dev mailing list archives

From scwf <wangf...@huawei.com>
Subject Re: Working Formula for Hive 0.13?
Date Mon, 25 Aug 2014 02:11:40 GMT
   I have been working on a branch that updates the Hive version to hive-0.13 (using org.apache.hive): https://github.com/scwf/spark/tree/hive-0.13
I am wondering whether it's OK to make a PR now, because hive-0.13 is not compatible
with hive-0.12, and here I used org.apache.hive.
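To illustrate the incompatibility: the 0.12/0.13 differences are at the level of individual method names and signatures, so supporting both versions from one build would need reflective dispatch of roughly the following shape. This is only a sketch with stand-in names: MetadataClientV13 and getAllPartitionsOf are hypothetical placeholders, not Hive's real classes (getAllPartitionsForPruner is the 0.12-era method name mentioned later in this thread).

```java
import java.lang.reflect.Method;

// Stand-in for Hive's metadata client; imagine 0.13 renamed the 0.12 method.
class MetadataClientV13 {
    public String getAllPartitionsOf(String table) {
        return "partitions-of-" + table;
    }
}

public class VersionProbe {
    // Returns true if the client class still exposes the 0.12-era method name.
    static boolean hasLegacyApi(Class<?> clientClass) {
        try {
            clientClass.getMethod("getAllPartitionsForPruner", String.class);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    // Dispatch to whichever method name this Hive version provides.
    static String partitions(Object client, String table) throws Exception {
        String name = hasLegacyApi(client.getClass())
                ? "getAllPartitionsForPruner" : "getAllPartitionsOf";
        Method m = client.getClass().getMethod(name, String.class);
        return (String) m.invoke(client, table);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(partitions(new MetadataClientV13(), "t1"));
        // prints partitions-of-t1
    }
}
```

The probe runs once per class, so the per-call overhead is a single reflective invoke; caching the resolved Method would remove even that.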


On 2014/7/29 8:22, Michael Armbrust wrote:
> A few things:
>   - When we upgrade to Hive 0.13.0, Patrick will likely republish the
> hive-exec jar just as we did for 0.12.0
>   - Since we have to tie into some pretty low level APIs it is unsurprising
> that the code doesn't just compile out of the box against 0.13.0
>   - ScalaReflection is for determining a Schema from Scala classes, not
> reflection-based bridge code.  Either way, it's unclear to me if there is
> any reason to use reflection to support multiple versions, instead of just
> upgrading to Hive 0.13.0
>
> One question I have: what is the goal of upgrading to Hive 0.13.0?  Is
> it purely because you are having problems connecting to newer metastores?
>   Are there some features you are hoping for?  This will help me prioritize
> this effort.
>
> Michael
>
>
> On Mon, Jul 28, 2014 at 4:05 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>
>> I was looking for a class where reflection-related code should reside.
>>
>> I found this but don't think it is the proper class for bridging
>> differences between hive 0.12 and 0.13.1:
>>
>> sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala
>>
>> Cheers
>>
>>
>> On Mon, Jul 28, 2014 at 3:41 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>>
>>> After manually copying hive 0.13.1 jars to local maven repo, I got the
>>> following errors when building spark-hive_2.10 module :
>>>
>>> [ERROR]
>>> /homes/xx/spark/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala:182:
>>> type mismatch;
>>>   found   : String
>>>   required: Array[String]
>>> [ERROR]       val proc: CommandProcessor =
>>> CommandProcessorFactory.get(tokens(0), hiveconf)
>>> [ERROR]
>>>     ^
>>> [ERROR]
>>> /homes/xx/spark/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala:60:
>>> value getAllPartitionsForPruner is not a member of
>>> org.apache.hadoop.hive.ql.metadata.Hive
>>> [ERROR]         client.getAllPartitionsForPruner(table).toSeq
>>> [ERROR]                ^
>>> [ERROR]
>>> /homes/xx/spark/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala:267:
>>> overloaded method constructor TableDesc with alternatives:
>>>    (x$1: Class[_ <: org.apache.hadoop.mapred.InputFormat[_, _]],
>>>     x$2: Class[_],
>>>     x$3: java.util.Properties)org.apache.hadoop.hive.ql.plan.TableDesc
>>> <and>
>>>    ()org.apache.hadoop.hive.ql.plan.TableDesc
>>>   cannot be applied to (Class[org.apache.hadoop.hive.serde2.Deserializer],
>>>   Class[(some other)?0(in value tableDesc)(in value tableDesc)],
>>>   Class[?0(in value tableDesc)(in value tableDesc)], java.util.Properties)
>>> [ERROR]   val tableDesc = new TableDesc(
>>> [ERROR]                   ^
>>> [WARNING] Class org.antlr.runtime.tree.CommonTree not found - continuing
>>> with a stub.
>>> [WARNING] Class org.antlr.runtime.Token not found - continuing with a
>>> stub.
>>> [WARNING] Class org.antlr.runtime.tree.Tree not found - continuing with a
>>> stub.
>>> [ERROR]
>>>       while compiling:
>>> /homes/xx/spark/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala
>>>          during phase: typer
>>>       library version: version 2.10.4
>>>      compiler version: version 2.10.4
>>>
>>> The above shows incompatible changes between 0.12 and 0.13.1.
>>> E.g. the first error corresponds to the following method
>>> in CommandProcessorFactory:
>>>    public static CommandProcessor get(String[] cmd, HiveConf conf)
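That signature change (a single String command in 0.12 vs a String[] in 0.13) could in principle be papered over reflectively: try the old overload first and fall back to the new one. The sketch below is only an illustration of that idea; FactoryV13 is a stand-in class, and the real CommandProcessorFactory.get also takes a HiveConf argument, omitted here to keep the example self-contained.

```java
import java.lang.reflect.Method;

// Stand-in factory exposing only the 0.13-style overload, get(String[]).
class FactoryV13 {
    public static String get(String[] cmd) {
        return "processor:" + cmd[0];
    }
}

public class OverloadBridge {
    // Prefer the 0.12-style get(String); fall back to 0.13's get(String[]).
    static String getProcessor(Class<?> factory, String token) throws Exception {
        try {
            Method m = factory.getMethod("get", String.class);
            return (String) m.invoke(null, token);
        } catch (NoSuchMethodException e) {
            Method m = factory.getMethod("get", String[].class);
            // Cast to Object so the array is passed as one argument,
            // not expanded into varargs.
            return (String) m.invoke(null, (Object) new String[] { token });
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(getProcessor(FactoryV13.class, "set"));
        // prints processor:set
    }
}
```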
>>>
>>> Cheers
>>>
>>>
>>> On Mon, Jul 28, 2014 at 1:32 PM, Steve Nunez <snunez@hortonworks.com>
>>> wrote:
>>>
>>>> So, do we have a short-term fix until Hive 0.14 comes out? Perhaps
>>>> adding the hive-exec jar to the spark-project repo? It doesn't look
>>>> like there's a release date schedule for 0.14.
>>>>
>>>>
>>>>
>>>> On 7/28/14, 10:50, "Cheng Lian" <lian.cs.zju@gmail.com> wrote:
>>>>
>>>>> Exactly, forgot to mention the Hulu team also made changes to cope
>>>>> with those incompatibility issues, but they said that's relatively
>>>>> easy once the re-packaging work is done.
>>>>>
>>>>>
>>>>> On Tue, Jul 29, 2014 at 1:20 AM, Patrick Wendell <pwendell@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> I've heard from Cloudera that there were Hive internal changes
>>>>>> between 0.12 and 0.13 that required code re-writing. Over time it
>>>>>> might be possible for us to integrate with Hive using APIs that are
>>>>>> more stable (this is the domain of Michael/Cheng/Yin more than me!).
>>>>>> It would be interesting to see what the Hulu folks did.
>>>>>>
>>>>>> - Patrick
>>>>>>
>>>>>> On Mon, Jul 28, 2014 at 10:16 AM, Cheng Lian <lian.cs.zju@gmail.com>
>>>>>> wrote:
>>>>>>> AFAIK, according to a recent talk, the Hulu team in China has built
>>>>>>> Spark SQL against Hive 0.13 (or 0.13.1?) successfully. Basically they
>>>>>>> also re-packaged Hive 0.13 as the Spark team did. The slides of the
>>>>>>> talk haven't been released yet though.
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Jul 29, 2014 at 1:01 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>>>>>>>
>>>>>>>> Owen helped me find this:
>>>>>>>> https://issues.apache.org/jira/browse/HIVE-7423
>>>>>>>>
>>>>>>>> I guess this means that for Hive 0.14, Spark should be able to
>>>>>>>> directly pull in hive-exec-core.jar
>>>>>>>>
>>>>>>>> Cheers
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Jul 28, 2014 at 9:55 AM, Patrick Wendell <pwendell@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> It would be great if the Hive team can fix that issue. If not,
>>>>>>>>> we'll have to continue forking our own version of Hive to change
>>>>>>>>> the way it publishes artifacts.
>>>>>>>>>
>>>>>>>>> - Patrick
>>>>>>>>>
>>>>>>>>> On Mon, Jul 28, 2014 at 9:34 AM, Ted Yu <yuzhihong@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>> Talked with Owen offline. He confirmed that as of 0.13,
>>>>>>>>>> hive-exec is still an uber jar.
>>>>>>>>>>
>>>>>>>>>> Right now I am facing the following error building against
>>>>>>>>>> Hive 0.13.1:
>>>>>>>>>>
>>>>>>>>>> [ERROR] Failed to execute goal on project spark-hive_2.10: Could
>>>>>>>>>> not resolve dependencies for project
>>>>>>>>>> org.apache.spark:spark-hive_2.10:jar:1.1.0-SNAPSHOT: The following
>>>>>>>>>> artifacts could not be resolved:
>>>>>>>>>> org.spark-project.hive:hive-metastore:jar:0.13.1,
>>>>>>>>>> org.spark-project.hive:hive-exec:jar:0.13.1,
>>>>>>>>>> org.spark-project.hive:hive-serde:jar:0.13.1: Failure to find
>>>>>>>>>> org.spark-project.hive:hive-metastore:jar:0.13.1 in
>>>>>>>>>> http://repo.maven.apache.org/maven2 was cached in the local
>>>>>>>>>> repository, resolution will not be reattempted until the update
>>>>>>>>>> interval of maven-repo has elapsed or updates are forced -> [Help 1]
>>>>>>>>>>
>>>>>>>>>> Some hint would be appreciated.
>>>>>>>>>>
>>>>>>>>>> Cheers
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Jul 28, 2014 at 9:15 AM, Sean Owen <sowen@cloudera.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Yes, it is published. As of previous versions, at least,
>>>>>>>>>>> hive-exec included all of its dependencies *in its artifact*,
>>>>>>>>>>> making it unusable as-is because it contained copies of
>>>>>>>>>>> dependencies that clash with versions present in other artifacts,
>>>>>>>>>>> and can't be managed with Maven mechanisms.
>>>>>>>>>>>
>>>>>>>>>>> I am not sure why hive-exec was not published normally, with
>>>>>>>>>>> just its own classes. That's why it was copied into an artifact
>>>>>>>>>>> with just hive-exec code.
>>>>>>>>>>>
>>>>>>>>>>> You could do the same thing for hive-exec 0.13.1.
>>>>>>>>>>> Or maybe someone knows that it's published more 'normally' now.
>>>>>>>>>>> I don't think hive-metastore is related to this question?
>>>>>>>>>>>
>>>>>>>>>>> I am no expert on the Hive artifacts, just remembering what the
>>>>>>>>>>> issue was initially in case it helps you get to a similar
>>>>>>>>>>> solution.
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Jul 28, 2014 at 4:47 PM, Ted Yu <yuzhihong@gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>> hive-exec (as of 0.13.1) is published here:
>>>>>>>>>>>>
>>>>>>>>>>>> http://search.maven.org/#artifactdetails%7Corg.apache.hive%7Chive-exec%7C0.13.1%7Cjar
>>>>>>>>>>>>
>>>>>>>>>>>> Should a JIRA be opened so that dependency on hive-metastore
>>>>>>>>>>>> can be replaced by dependency on hive-exec?
>>>>>>>>>>>>
>>>>>>>>>>>> Cheers
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Jul 28, 2014 at 8:26 AM, Sean Owen <sowen@cloudera.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> The reason for org.spark-project.hive is that Spark relies on
>>>>>>>>>>>>> hive-exec, but the Hive project does not publish this artifact
>>>>>>>>>>>>> by itself, only with all its dependencies as an uber jar. Maybe
>>>>>>>>>>>>> that's been improved. If so, you need to point at the new
>>>>>>>>>>>>> hive-exec and perhaps sort out its dependencies manually in
>>>>>>>>>>>>> your build.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Jul 28, 2014 at 4:01 PM, Ted Yu <yuzhihong@gmail.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> I found 0.13.1 artifacts in maven:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://search.maven.org/#artifactdetails%7Corg.apache.hive%7Chive-metastore%7C0.13.1%7Cjar
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> However, Spark uses groupId of org.spark-project.hive, not
>>>>>>>>>>>>>> org.apache.hive
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Can someone tell me how it is supposed to work?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cheers
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Jul 28, 2014 at 7:44 AM, Steve Nunez <snunez@hortonworks.com>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I saw a note earlier, perhaps on the user list, that at least
>>>>>>>>>>>>>>> one person is using Hive 0.13. Anyone got a working build
>>>>>>>>>>>>>>> configuration for this version of Hive?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>> - Steve
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> CONFIDENTIALITY NOTICE
>>>>>>>>>>>>>>> NOTICE: This message is intended for the use of the individual
>>>>>>>>>>>>>>> or entity to which it is addressed and may contain information
>>>>>>>>>>>>>>> that is confidential, privileged and exempt from disclosure
>>>>>>>>>>>>>>> under applicable law. If the reader of this message is not the
>>>>>>>>>>>>>>> intended recipient, you are hereby notified that any printing,
>>>>>>>>>>>>>>> copying, dissemination, distribution, disclosure or forwarding
>>>>>>>>>>>>>>> of this communication is strictly prohibited. If you have
>>>>>>>>>>>>>>> received this communication in error, please contact the
>>>>>>>>>>>>>>> sender immediately and delete it from your system. Thank You.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>




