A temp table is registered with the context that created it and cannot be shared across SQLContext instances. Since HiveContext is a subclass of SQLContext and inherits all of its functionality, why not use a single HiveContext globally? Is there a specific requirement in your case that calls for multiple SQLContext/HiveContext instances?
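
If the Cassandra-aware context cannot simply be replaced, one possible workaround (not something proposed in this thread, so treat it as an assumption) is to re-register the rows and schema of the existing SchemaRDD in the single HiveContext via applySchema. The sketch below assumes Spark 1.2.x with spark-hive on the classpath; a plain SQLContext stands in for CassandraAwareSQLContext, and the Profile case class, sample rows, and local master are hypothetical:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{SQLContext, SchemaRDD}
import org.apache.spark.sql.hive.HiveContext

// Hypothetical row type standing in for the Cassandra "profile" table.
case class Profile(sampling_bucket: Int, createdAt: Long)

object SharedHiveContextSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("SharedHiveContextSketch").setMaster("local[2]")
    val sc = new SparkContext(conf)

    // Stand-in for CassandraAwareSQLContext (assumption: any SQLContext subclass behaves the same here).
    val otherContext = new SQLContext(sc)
    // The single HiveContext that all Hive-flavoured queries go through.
    val hiveContext = new HiveContext(sc)

    // A SchemaRDD that belongs to the other context.
    val source: SchemaRDD = otherContext.createSchemaRDD(
      sc.parallelize(Seq(Profile(0, 1425369600000L), Profile(1, 1425373200000L))))

    // A SchemaRDD is an RDD[Row] plus a schema, so both can be handed to the
    // HiveContext, which then owns the temp table registration.
    val bridged = hiveContext.applySchema(source, source.schema)
    bridged.registerTempTable("profile")

    // The temp table is now visible to the HiveContext, and Hive UDFs resolve.
    val result = hiveContext.sql(
      "SELECT from_unixtime(floor(createdAt / 1000)) FROM profile WHERE sampling_bucket = 0")
    result.collect().foreach(println)

    sc.stop()
  }
}
```

Registering the table in the HiveContext, rather than in the context that loaded the data, is what makes it visible to queries issued through that HiveContext.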

 

From: shahab [mailto:shahab.mokari@gmail.com]
Sent: Tuesday, March 3, 2015 9:46 PM
To: Cheng, Hao
Cc: user@spark.apache.org
Subject: Re: Supporting Hive features in Spark SQL Thrift JDBC server

 

You are right, CassandraAwareSQLContext is a subclass of SQLContext.

 

But I did another experiment: I queried Cassandra using CassandraAwareSQLContext, registered the resulting "rdd" as a temp table, and then tried to query it using HiveContext. It seems that HiveContext cannot see a table registered through the SQLContext. Is this the expected behavior?

 

best,

/Shahab

 

 

On Tue, Mar 3, 2015 at 1:35 PM, Cheng, Hao <hao.cheng@intel.com> wrote:

Hive UDFs are only available in HiveContext and instances of its subclasses. Is CassandraAwareSQLContext a direct subclass of HiveContext or of SQLContext?
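
A minimal sketch of that point, assuming Spark 1.2.x (the SchemaRDD-era API) with the spark-hive module on the classpath. The Profile case class, the sample rows, and the local master are hypothetical stand-ins for the Cassandra-backed table:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

// Hypothetical row type standing in for the Cassandra "profile" table.
case class Profile(sampling_bucket: Int, createdAt: Long)

object HiveUdfSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("HiveUdfSketch").setMaster("local[2]")
    val sc = new SparkContext(conf)

    // HiveContext resolves function names against Hive's function registry,
    // which is where from_unixtime and floor come from.
    val hiveContext = new HiveContext(sc)
    import hiveContext.createSchemaRDD // implicit RDD[case class] => SchemaRDD

    val profiles = sc.parallelize(Seq(Profile(0, 1425369600000L), Profile(1, 1425373200000L)))
    profiles.registerTempTable("profile")

    // The same query shape as in the failing example, issued through the HiveContext.
    val result = hiveContext.sql(
      "SELECT from_unixtime(floor(createdAt / 1000)) FROM profile WHERE sampling_bucket = 0")
    result.collect().foreach(println)

    sc.stop()
  }
}
```

The same query sent through a plain SQLContext is expected to fail with unresolved attributes, because that context does not look up Hive's built-in functions.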

 

From: shahab [mailto:shahab.mokari@gmail.com]
Sent: Tuesday, March 3, 2015 5:10 PM
To: Cheng, Hao
Cc: user@spark.apache.org
Subject: Re: Supporting Hive features in Spark SQL Thrift JDBC server

 

val sc: SparkContext = new SparkContext(conf)
val sqlCassContext = new CassandraAwareSQLContext(sc)  // from the Calliope Cassandra Spark connector

val rdd: SchemaRDD = sqlCassContext.sql("select * from db.profile")
rdd.cache
rdd.registerTempTable("profile")
rdd.first  // force caching

val q = "select from_unixtime(floor(createdAt/1000)) from profile where sampling_bucket=0"
val rdd2 = rdd.sqlContext.sql(q)  // runs against the CassandraAwareSQLContext

println("Result: " + rdd2.first)

 

And I get the following error:

Exception in thread "main" org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Unresolved attributes: 'from_unixtime('floor(('createdAt / 1000))) AS c0#7, tree:
Project ['from_unixtime('floor(('createdAt / 1000))) AS c0#7]
 Filter (sampling_bucket#10 = 0)
  Subquery profile
   Project [company#8,bucket#9,sampling_bucket#10,profileid#11,createdat#12L,modifiedat#13L,version#14]
    CassandraRelation localhost, 9042, 9160, normaldb_sampling, profile, org.apache.spark.sql.CassandraAwareSQLContext@778b692d, None, None, false, Some(Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml)

    at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$apply$1.applyOrElse(Analyzer.scala:72)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$apply$1.applyOrElse(Analyzer.scala:70)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:165)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:183)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at scala.collection.Iterator$class.foreach(Iterator.scala:727)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
    at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
    at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
    at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
    at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
    at scala.collection.AbstractIterator.to(Iterator.scala:1157)
    at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
    at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
    at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
    at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:212)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:168)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:156)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$.apply(Analyzer.scala:70)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$.apply(Analyzer.scala:68)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:61)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:59)
    at scala.collection.IndexedSeqOptimized$class.foldl(IndexedSeqOptimized.scala:51)
    at scala.collection.IndexedSeqOptimized$class.foldLeft(IndexedSeqOptimized.scala:60)
    at scala.collection.mutable.WrappedArray.foldLeft(WrappedArray.scala:34)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:59)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:51)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.apply(RuleExecutor.scala:51)
    at org.apache.spark.sql.SQLContext$QueryExecution.analyzed$lzycompute(SQLContext.scala:402)
    at org.apache.spark.sql.SQLContext$QueryExecution.analyzed(SQLContext.scala:402)
    at org.apache.spark.sql.SQLContext$QueryExecution.optimizedPlan$lzycompute(SQLContext.scala:403)
    at org.apache.spark.sql.SQLContext$QueryExecution.optimizedPlan(SQLContext.scala:403)
    at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan$lzycompute(SQLContext.scala:407)
    at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan(SQLContext.scala:405)
    at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan$lzycompute(SQLContext.scala:411)
    at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan(SQLContext.scala:411)
    at org.apache.spark.sql.SchemaRDD.collect(SchemaRDD.scala:438)
    at org.apache.spark.sql.SchemaRDD.take(SchemaRDD.scala:440)
    at org.apache.spark.sql.SchemaRDD.take(SchemaRDD.scala:103)
    at org.apache.spark.rdd.RDD.first(RDD.scala:1091)
    at boot.SQLDemo$.main(SQLDemo.scala:65)  //my code
    at boot.SQLDemo.main(SQLDemo.scala)  //my code

 

On Tue, Mar 3, 2015 at 8:57 AM, Cheng, Hao <hao.cheng@intel.com> wrote:

Can you provide the detailed failure call stack?

 

From: shahab [mailto:shahab.mokari@gmail.com]
Sent: Tuesday, March 3, 2015 3:52 PM
To: user@spark.apache.org
Subject: Supporting Hive features in Spark SQL Thrift JDBC server

 

Hi,

 

According to the Spark SQL documentation, "... Spark SQL supports the vast majority of Hive features, such as User Defined Functions (UDF)", and one of these UDFs is the "current_date()" function, which should therefore be supported.

 

However, I get an error when I use this UDF in my SQL query. A couple of other UDFs cause similar errors.

 

Am I missing something in my JDBC server?

 

/Shahab