spark-user mailing list archives

From Eran Medan <>
Subject Re: [spark-sql] What is the right way to represent an “Any” type in Spark SQL?
Date Sun, 29 Mar 2015 22:27:05 GMT
Thanks Michael!
Can you please point me to the docs / source location for that automatic
casting? I'm just using it to extract the data and put it in a Map[String,
Any] (long story on the reason...), so I think the casting rules won't
"know" what to cast it to... right? I guess I can have the JSON / Parquet
data store it as a string and also keep metadata on the "real" type, but
that feels a little wrong. Is that the only way to handle it? Or perhaps
there is a way to support an "Any" after all? Is it just not implemented, or
is it a Hive limitation? (I've never used Hive other than here, so sorry for
the silly question.)
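
As a rough illustration of the string-plus-type-metadata idea above, here is a
minimal sketch in plain Scala. The names TaggedValue, TaggedValues, encode and
decode are hypothetical and are not part of Spark; the assumption is only that
a case class made up entirely of String fields avoids the Any problem, since
every column then has a concrete type.

```scala
// Hypothetical sketch: store each property value as a String plus a type tag,
// then decode it back to Any when building a Map[String, Any].
// TaggedValue, encode and decode are illustrative names, not Spark APIs.
case class TaggedValue(key: String, value: String, valueType: String)

object TaggedValues {
  // Encode a supported value as its string form plus a tag naming the original type.
  def encode(key: String, v: Any): TaggedValue = v match {
    case s: String  => TaggedValue(key, s, "string")
    case i: Int     => TaggedValue(key, i.toString, "int")
    case b: Boolean => TaggedValue(key, b.toString, "boolean")
    case other =>
      throw new IllegalArgumentException(s"Unsupported value for key '$key': $other")
  }

  // Decode a stored row back to a (key, value) pair, using the tag to pick the type.
  def decode(t: TaggedValue): (String, Any) = t.valueType match {
    case "string"  => t.key -> t.value
    case "int"     => t.key -> t.value.toInt
    case "boolean" => t.key -> t.value.toBoolean
    case other =>
      throw new IllegalArgumentException(s"Unknown type tag '$other' for key '${t.key}'")
  }
}
```

With rows encoded this way, something like rows.map(TaggedValues.decode).toMap
would rebuild the Map[String, Any] on the reading side, at the cost of carrying
an extra type column alongside each value.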

P.S. I fixed the PR based on the code review, but the tests failed due to
GitHub's ongoing DDoS attack. Is there a way to restart the tests? :) (Or
should I just push a new commit with a whitespace char to trigger them?)

Thanks again, you guys are great!

On Sat, Mar 28, 2015 at 11:29 PM, Michael Armbrust <> wrote:

> In this case I'd probably just store it as a String.  Our casting rules
> (which come from Hive) are such that when you use a string as a number or
> boolean, it will be cast to the desired type.
> Thanks for the PR btw :)
> On Fri, Mar 27, 2015 at 2:31 PM, Eran Medan <>
> wrote:
>> Hi everyone,
>> I had a lot of questions today, sorry if I'm spamming the list, but I
>> thought it's better than posting all questions in one thread. Let me know
>> if I should throttle my posts ;)
>> Here is my question:
>> When I have a case class that has Any in it (e.g. I have a
>> property map whose values can be String, Int or Boolean, and since we
>> don't have union types, Any is the closest thing) and I try to register
>> such an RDD as a table in 1.2.1 (or convert it to a DataFrame in 1.3 and
>> then register it as a table), I get this weird exception:
>> Exception in thread "main" scala.MatchError: Any (of class
>> scala.reflect.internal.Types$ClassNoArgsTypeRef) at
>> org.apache.spark.sql.catalyst.ScalaReflection$class.schemaFor(ScalaReflection.scala:112)
>> From my interpretation, this simply means that Any is not a valid type
>> that Spark SQL can support in its schema.
>> I already sent a pull request <> to
>> solve the cryptic exception but my question is - *is there a way to
>> support an "Any" type in Spark SQL?*
>> disclaimer - also posted at
