spark-user mailing list archives

From Pala M Muthaia <mchett...@rocketfuelinc.com>
Subject Re: Issues with constants in Spark HiveQL queries
Date Wed, 28 Jan 2015 11:29:21 GMT
By typo I meant that the column name had a spelling error:
conversion_aciton_id.
It should have been conversion_action_id.

No, we tried it a few times, and there were no + signs or anything like
that in the query. We also tried it with columns of different types -
string, double, etc. - and saw the same error.
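
For reference, our checks looked roughly like this (string_col and
double_col below are hypothetical stand-ins for the real column names):

scala> sqlContext.sql("SELECT user_id FROM actions where string_col = '20141210'")
scala> sqlContext.sql("SELECT user_id FROM actions where double_col = 2014.12")

Each variant failed with the same "No parse rules for ASTNode type" error
as the integer case further down the thread.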




On Tue, Jan 20, 2015 at 8:59 PM, yana <yana.kadiyska@gmail.com> wrote:

> I run Spark 1.2 and do not have this issue. I don't believe the Hive
> version would matter (I run Spark 1.2 with the Hive 12 profile), but that
> would be a good test. The last version I tried for you was a CDH4.2
> Spark 1.2 prebuilt, without pointing to an external Hive install (in
> fact, I tried it on a machine with no other Hadoop/Hive jars). So:
> download, unzip, and run spark-shell. Personally, I don't believe it's a
> bug. When you say typo, do you mean there was indeed a PLUS token in your
> string? If you remove that token, what stacktrace do you get?
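>
> One more sanity check, if you want (a rough sketch to run in spark-shell):
> print which jar the HiveParser class is actually loaded from. A stray
> hive-exec of a different version on the classpath could shift the token
> numbers in exactly this way.
>
> scala> val parserJar = classOf[org.apache.hadoop.hive.ql.parse.HiveParser].getProtectionDomain.getCodeSource.getLocation
> scala> println(parserJar)  // expect a single hive-exec (or Spark assembly) jar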
>
>
>
>
> -------- Original message --------
> From: Pala M Muthaia
> Date:01/19/2015 8:26 PM (GMT-05:00)
> To: Yana Kadiyska
> Cc: "Cheng, Hao" ,user@spark.apache.org
> Subject: Re: Issues with constants in Spark HiveQL queries
>
> Yes, we tried the master branch (sometime last week) and there was no
> issue, but the repro above is on branch-1.2 with Hive 0.13. Isn't that
> the final release branch for Spark 1.2?
>
> If so, does a patch need to be created or back-ported from master?
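>
> (One way to check, sketched under the assumption that you have a clone of
> apache/spark handy: list the commits touching the HiveQL parser that are
> on master but not yet on branch-1.2; the file path is the one from the
> stack trace below.)
>
> git log --oneline branch-1.2..master -- sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala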
>
> (Yes, the obvious typo in the column name was introduced in this email
> only, so it is irrelevant to the error.)
>
> On Wed, Jan 14, 2015 at 5:52 PM, Yana Kadiyska <yana.kadiyska@gmail.com>
> wrote:
>
>> Yeah, that makes sense. Pala, are you on a prebuilt version of Spark? I
>> just tried the CDH4 prebuilt... Here is what I get for the = token:
>>
>> [image: Inline image 1]
>>
>> The literal's type shows as 290, not 291, and 290 is numeric. According
>> to the generated parser source at
>> http://grepcode.com/file/repo1.maven.org/maven2/org.apache.hive/hive-exec/0.13.1/org/apache/hadoop/hive/ql/parse/HiveParser.java#HiveParser
>> token 291 is PLUS, which is really weird...
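>>
>> (If you want to inspect the raw token numbers yourself, here is a rough
>> sketch; it assumes the hive-exec jar is on the spark-shell classpath:)
>>
>> scala> import org.apache.hadoop.hive.ql.parse.{HiveParser, ParseDriver}
>> scala> val ast = new ParseDriver().parse("SELECT user_id FROM actions where conversion_aciton_id=20141210")
>> scala> println(ast.dump())          // dumps the AST; call getType on a node for its token id
>> scala> println(HiveParser.Number)   // the numeric-literal token id in this Hive version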
>>
>>
>> On Wed, Jan 14, 2015 at 7:47 PM, Cheng, Hao <hao.cheng@intel.com> wrote:
>>
>>> The log shows it failed in parsing, so the typo shouldn't be the root
>>> cause. But I couldn't reproduce that with the master branch.
>>>
>>>
>>>
>>> I did the test as follows:
>>>
>>>
>>>
>>> sbt/sbt -Phadoop-2.3.0 -Phadoop-2.3 -Phive -Phive-0.13.1 hive/console
>>>
>>> scala> sql("SELECT user_id FROM actions where conversion_aciton_id=20141210")
>>>
>>> sbt/sbt -Phadoop-2.3.0 -Phadoop-2.3 -Phive -Phive-0.12.0 hive/console
>>>
>>> scala> sql("SELECT user_id FROM actions where conversion_aciton_id=20141210")
>>>
>>>
>>>
>>>
>>>
>>> From: Yana Kadiyska [mailto:yana.kadiyska@gmail.com]
>>> Sent: Wednesday, January 14, 2015 11:12 PM
>>> To: Pala M Muthaia
>>> Cc: user@spark.apache.org
>>> Subject: Re: Issues with constants in Spark HiveQL queries
>>>
>>>
>>>
>>> Just a guess, but what is the type of conversion_aciton_id? I do
>>> queries over an epoch column all the time with no issues (the epoch's
>>> type is bigint). You can see the source here:
>>> https://github.com/apache/spark/blob/v1.2.0/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala
>>> I'm not sure what ASTNode type 291 is, but it sounds like it's not
>>> considered numeric? If the column is a string, it should be
>>> conversion_aciton_id='20141210' (single quotes around the string).
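>>>
>>> (For example, if the column really is a string, something like this:)
>>>
>>> scala> // quoted literal for a string column; a bare literal is only for numeric columns
>>> scala> sqlContext.sql("SELECT user_id FROM actions where conversion_aciton_id='20141210'")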
>>>
>>>
>>>
>>> On Tue, Jan 13, 2015 at 5:25 PM, Pala M Muthaia <
>>> mchettiar@rocketfuelinc.com> wrote:
>>>
>>>  Hi,
>>>
>>>
>>>
>>> We are testing Spark SQL's HiveQL support on Spark 1.2.0. We have run
>>> some simple queries successfully, but we hit the following issue
>>> whenever we attempt to use a constant in the query predicate.
>>>
>>>
>>>
>>> It seems like an issue with parsing the constant.
>>>
>>>
>>>
>>> Query: SELECT user_id FROM actions where conversion_aciton_id=20141210
>>>
>>>
>>>
>>> Error:
>>>
>>> scala.NotImplementedError: No parse rules for ASTNode type: 291, text:
>>> 20141210 :
>>>
>>> 20141210
>>>
>>>
>>>
>>> Any ideas? This seems very basic, so we may be missing something
>>> obvious, but I haven't figured out what it is.
>>>
>>>
>>>
>>> ---
>>>
>>>
>>>
>>> Full shell output below:
>>>
>>>
>>>
>>> scala> sqlContext.sql("SELECT user_id FROM actions where conversion_aciton_id=20141210")
>>> 15/01/13 16:55:54 INFO ParseDriver: Parsing command: SELECT user_id FROM actions where conversion_aciton_id=20141210
>>> 15/01/13 16:55:54 INFO ParseDriver: Parse Completed
>>> 15/01/13 16:55:54 INFO ParseDriver: Parsing command: SELECT user_id FROM actions where conversion_aciton_id=20141210
>>> 15/01/13 16:55:54 INFO ParseDriver: Parse Completed
>>> java.lang.RuntimeException:
>>> Unsupported language features in query: SELECT user_id FROM actions where conversion_aciton_id=20141210
>>> TOK_QUERY
>>>   TOK_FROM
>>>     TOK_TABREF
>>>       TOK_TABNAME
>>>         actions
>>>   TOK_INSERT
>>>     TOK_DESTINATION
>>>       TOK_DIR
>>>         TOK_TMP_FILE
>>>     TOK_SELECT
>>>       TOK_SELEXPR
>>>         TOK_TABLE_OR_COL
>>>           user_id
>>>     TOK_WHERE
>>>       =
>>>         TOK_TABLE_OR_COL
>>>           conversion_aciton_id
>>>         20141210
>>>
>>> scala.NotImplementedError: No parse rules for ASTNode type: 291, text: 20141210 :
>>> 20141210
>>> " +
>>> org.apache.spark.sql.hive.HiveQl$.nodeToExpr(HiveQl.scala:1110)
>>>
>>> at scala.sys.package$.error(package.scala:27)
>>> at org.apache.spark.sql.hive.HiveQl$.createPlan(HiveQl.scala:251)
>>> at org.apache.spark.sql.hive.ExtendedHiveQlParser$$anonfun$hiveQl$1.apply(ExtendedHiveQlParser.scala:50)
>>> at org.apache.spark.sql.hive.ExtendedHiveQlParser$$anonfun$hiveQl$1.apply(ExtendedHiveQlParser.scala:49)
>>> at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:136)
>>> at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:135)
>>> at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
>>> at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
>>> at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
>>> at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
>>> at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
>>> at scala.util.parsing.combinator.Parsers$Failure.append(Parsers.scala:202)
>>> at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
>>> at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
>>> at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
>>> at scala.util.parsing.combinator.Parsers$$anon$2$$anonfun$apply$14.apply(Parsers.scala:891)
>>> at scala.util.parsing.combinator.Parsers$$anon$2$$anonfun$apply$14.apply(Parsers.scala:891)
>>> at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
>>> at scala.util.parsing.combinator.Parsers$$anon$2.apply(Parsers.scala:890)
>>> at scala.util.parsing.combinator.PackratParsers$$anon$1.apply(PackratParsers.scala:110)
>>> at org.apache.spark.sql.catalyst.AbstractSparkSQLParser.apply(SparkSQLParser.scala:31)
>>> at org.apache.spark.sql.hive.HiveQl$$anonfun$3.apply(HiveQl.scala:133)
>>> at org.apache.spark.sql.hive.HiveQl$$anonfun$3.apply(HiveQl.scala:133)
>>> at org.apache.spark.sql.catalyst.SparkSQLParser$$anonfun$org$apache$spark$sql$catalyst$SparkSQLParser$$others$1.apply(SparkSQLParser.scala:174)
>>> at org.apache.spark.sql.catalyst.SparkSQLParser$$anonfun$org$apache$spark$sql$catalyst$SparkSQLParser$$others$1.apply(SparkSQLParser.scala:173)
>>> at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:136)
>>> at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:135)
>>> at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
>>> at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
>>> at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
>>> at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
>>> at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
>>> at scala.util.parsing.combinator.Parsers$Failure.append(Parsers.scala:202)
>>> at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
>>> at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
>>> at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
>>> at scala.util.parsing.combinator.Parsers$$anon$2$$anonfun$apply$14.apply(Parsers.scala:891)
>>> at scala.util.parsing.combinator.Parsers$$anon$2$$anonfun$apply$14.apply(Parsers.scala:891)
>>> at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
>>> at scala.util.parsing.combinator.Parsers$$anon$2.apply(Parsers.scala:890)
>>> at scala.util.parsing.combinator.PackratParsers$$anon$1.apply(PackratParsers.scala:110)
>>> at org.apache.spark.sql.catalyst.AbstractSparkSQLParser.apply(SparkSQLParser.scala:31)
>>> at org.apache.spark.sql.hive.HiveQl$.parseSql(HiveQl.scala:235)
>>> at org.apache.spark.sql.hive.HiveContext$$anonfun$sql$1.apply(HiveContext.scala:94)
>>> at org.apache.spark.sql.hive.HiveContext$$anonfun$sql$1.apply(HiveContext.scala:94)
>>> at scala.Option.getOrElse(Option.scala:120)
>>> at org.apache.spark.sql.hive.HiveContext.sql(HiveContext.scala:94)
>>> at $iwC$$iwC$$iwC$$iwC.<init>(<console>:15)
>>> at $iwC$$iwC$$iwC.<init>(<console>:20)
>>> at $iwC$$iwC.<init>(<console>:22)
>>> at $iwC.<init>(<console>:24)
>>> at <init>(<console>:26)
>>> at .<init>(<console>:30)
>>> at .<clinit>(<console>)
>>> at .<init>(<console>:7)
>>> at .<clinit>(<console>)
>>> at $print(<console>)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> at java.lang.reflect.Method.invoke(Method.java:606)
>>> at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:852)
>>> at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1125)
>>> at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:674)
>>> at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:705)
>>> at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:669)
>>> at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:828)
>>> at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:873)
>>> at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:785)
>>> at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:628)
>>> at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:636)
>>> at org.apache.spark.repl.SparkILoop.loop(SparkILoop.scala:641)
>>> at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply$mcZ$sp(SparkILoop.scala:968)
>>> at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:916)
>>> at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:916)
>>> at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
>>> at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:916)
>>> at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1011)
>>> at org.apache.spark.repl.Main$.main(Main.scala:31)
>>> at org.apache.spark.repl.Main.main(Main.scala)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> at java.lang.reflect.Method.invoke(Method.java:606)
>>> at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:358)
>>> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
>>> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>>
>>>
>>>
>>
>>
>
