spark-user mailing list archives

From Michael Armbrust <mich...@databricks.com>
Subject Re: Spark 1.5.2 : change datatype in programaticallly generated schema
Date Fri, 04 Mar 2016 21:09:17 GMT
Change the type of a subset of the columns using withColumn, after you have
loaded the DataFrame.

Here is an example.
<https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/1023043053387187/2457334174245135/2840265927289860/2c57e2a43f.html>
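
For readers who cannot open the notebook, the following is a minimal sketch of the withColumn approach described above. It is not the code from the linked notebook. The castColumns helper, the sample rows, and the year/make/price column names are hypothetical, and the sketch assumes the Spark 1.5-era SQLContext API and the Column.cast method.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{DataFrame, Row, SQLContext}
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.{DataType, DoubleType, IntegerType, StringType, StructField, StructType}

object CastColumnsSketch {
  // Hypothetical helper: replace each listed column with a cast copy of itself.
  // Columns that are not listed keep their original (string) type.
  def castColumns(df: DataFrame, newTypes: Map[String, DataType]): DataFrame =
    newTypes.foldLeft(df) { case (acc, (name, dataType)) =>
      acc.withColumn(name, col(name).cast(dataType))
    }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("cast-columns-sketch").setMaster("local[1]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // Assumed stand-in for the real file: a space-separated header, all columns as strings.
    val header = "year make price"
    val schema = StructType(header.split(" ").map(name => StructField(name, StringType, nullable = true)))
    val rows = sc.parallelize(Seq(Row("2012", "Tesla", "79999.5"), Row("1997", "Ford", "3000")))
    val df = sqlContext.createDataFrame(rows, schema)

    // Only the columns named here change type; a 100+ column schema needs no other edits.
    val typed = castColumns(df, Map("year" -> IntegerType, "price" -> DoubleType))
    typed.printSchema()

    sc.stop()
  }
}
```

Keeping the overrides in a Map means only the exceptions to the all-string default have to be listed, which is the point when the header has more than 100 columns.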

On Thu, Mar 3, 2016 at 11:58 PM, Divya Gehlot <divya.htconex@gmail.com>
wrote:

> Hi,
>
> I am generating a schema programmatically, as shown below:
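> import org.apache.spark.sql.types.{StringType, StructField, StructType}
>
> val schemaFile = sc.textFile("/TestDivya/Spark/cars.csv")
> val schemaString = schemaFile.first()
> // Every column starts out as a nullable string.
> val schema =
>       StructType(
>         schemaString.split(" ").map(fieldName => StructField(fieldName,
> StringType, true)))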
>
> I want to change the datatype of the year column to int, and likewise for
> many other columns, as my schemaString is huge, with 100+ columns.
> Any suggestions?
>
>
> Thanks,
> Divya
>
