spark-user mailing list archives

From Ali Ghodsi <a...@cs.berkeley.edu>
Subject Re: Shark Table for >22 columns
Date Sat, 22 Mar 2014 23:07:57 GMT
Subacini, the short answer is that we don't really support that yet, but
the good news is that I can show you how to work around it.

The good news is that we now convert the Tuples to Seqs internally, so we
can leverage that. The bad news is that, before converting the tuples to
sequences, we extract the static type of each tuple field. We need those
types to set up the schema automatically when we create the table for you
during saveAsTable().
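
To illustrate why a Seq field is rejected, here is a minimal sketch in plain Scala. The helper columnTypeName is hypothetical and is not Shark's actual DataTypes code; it only mimics the idea of looking up a column type from one ClassTag per field. A ClassTag for Seq[String] carries only the erased Seq class, so a lookup of this kind has no column type to map it to.

```scala
import scala.reflect.{ClassTag, classTag}

// Hypothetical stand-in for a per-field schema lookup (NOT Shark's real code).
// It maps the runtime class recorded in a ClassTag to a column type name.
def columnTypeName(tag: ClassTag[_]): String = tag.runtimeClass match {
  case c if c == classOf[Int]    => "int"
  case c if c == classOf[String] => "string"
  case other =>
    throw new IllegalArgumentException(s"unknown data type: ${other.getName}")
}

// Tags for simple field types resolve to a column type.
val intColumn    = columnTypeName(classTag[Int])
val stringColumn = columnTypeName(classTag[String])

// A tag for Seq[String] only knows the erased Seq class, so the lookup fails.
val seqColumn = scala.util.Try(columnTypeName(classTag[Seq[String]]))
```

This is the same situation as a tuple whose fields are themselves Seqs: the per-field type is a collection, not a primitive column type.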

The way around it is to call the underlying API and supply the types of the
elements of the sequence (beware, this API might change in the future):

import scala.reflect.ClassTag
import shark.api.RDDTableFunctions

// Assume "rdd" is of type RDD[Seq[Any]], where each Seq consists of
// two elements, one Int and one String.
val tableObject = new RDDTableFunctions(rdd,
  Seq(implicitly[ClassTag[Int]], implicitly[ClassTag[String]]))
tableObject.saveAsTable("mySeqTable", Seq("my_int", "my_string"))
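
For the 60-column case, the class tags and column names can be generated instead of written out by hand. The following is only a sketch under assumptions: all 60 columns are taken to be strings, and the column names (col_1 ... col_60), the sample rows, and the table name "table_60" are made up for illustration. The Shark call itself is left commented out because it needs Shark and a SparkContext, and the API may change.

```scala
import scala.reflect.ClassTag

// Assumption: every column is a String; use a different tag per column otherwise.
val numColumns = 60

// Illustrative column names: col_1, col_2, ..., col_60.
val columnNames: Seq[String] = (1 to numColumns).map(i => s"col_$i")

// One ClassTag per column, in the same order as the column names.
val columnTags: Seq[ClassTag[_]] =
  Seq.fill(numColumns)(implicitly[ClassTag[String]])

// Each row is a Seq[Any] with exactly numColumns elements (made-up sample data).
val rows: Seq[Seq[Any]] = Seq(
  (1 to numColumns).map(i => s"value A$i"),
  (1 to numColumns).map(i => s"value B$i")
)

// With Shark on the classpath and a SparkContext `sc`, the call would then
// follow the same pattern as the two-column example (untested):
// val rdd = sc.parallelize(rows)
// new RDDTableFunctions(rdd, columnTags).saveAsTable("table_60", columnNames)
```

The number of tags, the number of names, and the length of every row need to agree, since each position describes one column.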

Hope that helps,
Best,
Ali



On Fri, Mar 21, 2014 at 4:53 PM, subacini Arunkumar <subacini@gmail.com> wrote:

> Hi,
>
> I am able to successfully create a Shark table with 3 columns and 2 rows.
>
>
>     val recList = List((" value A1", "value B1", "value C1"),
>                        ("value A2", "value B2", "value c2"))
>     val dbFields = List("Col A", "Col B", "Col C")
>     val rdd = sc.parallelize(recList)
>     RDDTable(rdd).saveAsTable("table_1", dbFields)
>
>
> I have a scenario where the table will have 60 columns. How can I achieve
> this using RDDTable?
>
> I tried creating a List[(Seq[String], Seq[String])], but it throws the
> exception below. Any help or pointers would be appreciated.
>
> Exception in thread "main" shark.api.DataTypes$UnknownDataTypeException: scala.collection.Seq
>     at shark.api.DataTypes.fromClassTag(DataTypes.java:133)
>     at shark.util.HiveUtils$$anonfun$1.apply(HiveUtils.scala:106)
>     at shark.util.HiveUtils$$anonfun$1.apply(HiveUtils.scala:105)
>     at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>     at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>     at scala.collection.immutable.List.foreach(List.scala:318)
>     at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
>     at scala.collection.AbstractTraversable.map(Traversable.scala:105)
>     at shark.util.HiveUtils$.createTableInHive(HiveUtils.scala:105)
>     at shark.api.RDDTableFunctions.saveAsTable(RDDTableFunctions.scala:63)
>
> Thanks
> Subacini
>
