spark-user mailing list archives

From Sun Rui <sunrise_...@163.com>
Subject Re: How to convert from DataFrame to Dataset[Row]?
Date Sat, 16 Jul 2016 08:19:42 GMT
For Spark 1.6.x, a DataFrame can't be converted directly to a Dataset[Row], but it can be done
indirectly by supplying an explicit Row encoder, as follows:

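import org.apache.spark.sql.Row
import org.apache.spark.sql.catalyst.encoders.{ExpressionEncoder, RowEncoder}
// Assume df is a DataFrame. The implicit encoder in scope lets df.as[Row] resolve.
implicit val encoder: ExpressionEncoder[Row] = RowEncoder(df.schema)
val ds = df.as[Row]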

However, it may be more convenient to convert the DataFrame to a Dataset of a tuple or case class
type that corresponds to the row schema.
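
A minimal sketch of the case class route on Spark 1.6.x, in case it helps. The Person class, the
column names, and the sample rows are made up for illustration and are not from your code; the
field names and types of the case class need to match the DataFrame's schema.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{DataFrame, Dataset, SQLContext}

// Hypothetical case class: field names and types mirror the DataFrame columns.
// Defined at top level so Spark can derive an encoder for it.
case class Person(name: String, age: Long)

object DataFrameToDataset {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("df-to-ds").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    // Brings toDF and the implicit encoders for case classes into scope.
    import sqlContext.implicits._

    // Sample DataFrame standing in for the real one (illustrative data only).
    val df: DataFrame = Seq(("Alice", 30L), ("Bob", 25L)).toDF("name", "age")

    // Columns are matched to the case class fields by name.
    val people: Dataset[Person] = df.as[Person]

    people.collect().foreach(println)
    sc.stop()
  }
}
```

With a typed Dataset like this, later map and filter calls work on Person objects rather than on
untyped Rows, so no RowEncoder is needed.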

> On Jul 16, 2016, at 03:21, Daniel Barclay <danielbarclay.oss@gmail.com> wrote:
> 
> In Spark 1.6.1, how can I convert a DataFrame to a Dataset[Row]?
> 
> Is there a direct conversion?  (Trying <someDataframe>.as[Row] doesn't work,
> even after importing  <my sqlContext>.implicits._ .)
> 
> Is there some way to map the Rows from the Dataframe into the Dataset[Row]?
> (DataFrame.map would just make another Dataframe, right?)
> 
> 
> Thanks,
> Daniel