spark-issues mailing list archives

From "Yin Huai (JIRA)" <>
Subject [jira] [Commented] (SPARK-6293) SQLContext.implicits should provide automatic conversion for RDD[Row]
Date Tue, 17 Mar 2015 16:25:39 GMT


Yin Huai commented on SPARK-6293:

[~josephkb] Actually, only a GenericRowWithSchema has a schema; for all other Row implementations,
the schema will be null. I am not sure it is safe to provide this implicit conversion,
since GenericRowWithSchema is not the only Row implementation a user can create.
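To illustrate the concern: a Row built with the plain Row factory (a GenericRow) carries no schema, so an implicit toDF would have nothing to infer column types from. A minimal sketch, assuming the Spark SQL 1.3 API is on the classpath:

```scala
import org.apache.spark.sql.Row

// Row(...) constructs a GenericRow, not a GenericRowWithSchema,
// so its schema is null and column types cannot be recovered:
val r: Row = Row(1, "a")
assert(r.schema == null)
```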

> SQLContext.implicits should provide automatic conversion for RDD[Row]
> ---------------------------------------------------------------------
>                 Key: SPARK-6293
>                 URL:
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 1.3.0
>            Reporter: Joseph K. Bradley
> When a DataFrame is converted to an RDD[Row], it should be easier to convert it back
to a DataFrame via toDF.  E.g.:
> {code}
> val df: DataFrame = myRDD.toDF("col1", "col2")  // This works for types like RDD[scala.Tuple2[...]]
> val splits = df.rdd.randomSplit(...)
> val split0: RDD[Row] = splits(0)
> val df0 = split0.toDF("col1", "col2") // This fails
> {code}
> The failure happens because SQLContext.implicits does not provide an automatic conversion
for Rows.  (It does handle Products, but Row does not implement Product.)
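In the meantime, the usual workaround is to capture the DataFrame's schema before dropping to RDD[Row] and rebuild with SQLContext.createDataFrame. A hedged sketch, assuming a SQLContext named sqlContext and the DataFrame df from the example above:

```scala
import org.apache.spark.sql.{DataFrame, Row}
import org.apache.spark.rdd.RDD

// Capture the schema while we still have a DataFrame:
val schema = df.schema

val splits: Array[RDD[Row]] = df.rdd.randomSplit(Array(0.8, 0.2))

// Rebuild each split explicitly instead of relying on an implicit toDF:
val df0: DataFrame = sqlContext.createDataFrame(splits(0), schema)
```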

This message was sent by Atlassian JIRA
