spark-user mailing list archives

From Zhiliang Zhu <>
Subject Re: test - what is the wrong while adding one column in the dataframe
Date Fri, 17 Jun 2016 05:48:37 GMT
This is just a test, since the user mailing list seemed to have a problem earlier. It is working now.

    On Friday, June 17, 2016 12:18 PM, Zhiliang Zhu <> wrote:


     On Tuesday, May 17, 2016 10:44 AM, Zhiliang Zhu <> wrote:

  Hi All,
I have a DataFrame created by a Hive SQL query. I need to add one more column derived from
an existing column, while keeping all of the previous columns in the resulting DataFrame.

final double DAYS_30 = 1000 * 60 * 60 * 24 * 30.0;
// DAYS_30 seems difficult to reference inside the SQL string?
DataFrame behavior_df = jhql.sql("SELECT cast(user_id as double) as user_id, "
        + "cast(server_timestamp as double) as server_timestamp, "
        + "url, referer, source, app_version, params FROM log.request");
// This runs, but behavior_df.printSchema() shows no change (the DataFrame returned by withColumn is discarded here).
behavior_df.withColumn("daysLater30", behavior_df.col("server_timestamp").plus(DAYS_30));

// This runs, but behavior_df.printSchema() shows only one column, daysLater30.
// The expected schema is all of the previous columns plus the new daysLater30 column.
behavior_df = behavior_df.withColumn("daysLater30", behavior_df.col("server_timestamp").plus(DAYS_30));
How should this be done?
Thank you,
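
On the comment above that DAYS_30 seems difficult to use in the SQL: one possible approach is to inline the constant into the query text, so the new column comes back together with the existing ones. The sketch below only builds the query string. The class name Days30Sql and the method buildQuery are hypothetical; the column and table names are copied from the message, and the query has not been run against Hive here. The constant is kept as a long because an all-int product (1000 * 60 * 60 * 24 * 30) would overflow.

```java
final class Days30Sql {
    // 30 days in milliseconds, computed in long arithmetic to avoid int overflow.
    static final long DAYS_30_MS = 30L * 24 * 60 * 60 * 1000;

    // Builds the query text with the constant inlined as a numeric literal.
    // Assumption: column and table names are taken from the message above.
    static String buildQuery() {
        return "SELECT cast(user_id as double) as user_id, "
             + "cast(server_timestamp as double) as server_timestamp, "
             + "cast(server_timestamp as double) + " + DAYS_30_MS + " as daysLater30, "
             + "url, referer, source, app_version, params "
             + "FROM log.request";
    }

    public static void main(String[] args) {
        // Print the constant and the generated query for inspection.
        System.out.println(DAYS_30_MS);
        System.out.println(buildQuery());
    }
}
```

The resulting string could then be passed to jhql.sql(...) in the same way as the original query.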


The issue was resolved.

