spark-user mailing list archives

From Timur Shenkao <...@timshenkao.su>
Subject Re: This works to filter transactions older than certain months
Date Mon, 28 Mar 2016 19:50:59 GMT
bq. CSV data is stored in an underlying table in Hive (actually created and
populated as an ORC table by Spark)

How is it possible?

On Mon, Mar 28, 2016 at 1:50 AM, Mich Talebzadeh <mich.talebzadeh@gmail.com>
wrote:

> Hi,
>
> A while back I was looking for a functional-programming way to filter out
> transactions older than n months.
>
> This turned out to be pretty easy.
>
> I get today's date as follows
>
> val today = sqlContext.sql("SELECT FROM_unixtime(unix_timestamp(), 'yyyy-MM-dd')").collect.apply(0).getString(0)
>
>
> CSV data is stored in an underlying table in Hive (actually created and
> populated as an ORC table by Spark)
>
> HiveContext.sql("use accounts")
> var n = HiveContext.table("nw_10124772")
>
> scala> n.printSchema
> root
>  |-- transactiondate: date (nullable = true)
>  |-- transactiontype: string (nullable = true)
>  |-- description: string (nullable = true)
>  |-- value: double (nullable = true)
>  |-- balance: double (nullable = true)
>  |-- accountname: string (nullable = true)
>  |-- accountnumber: integer (nullable = true)
>
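> //
> // Check for historical transactions more than 60 months old
> //
> import org.apache.spark.sql.functions.{add_months, col, lit}
>
> val old: Int = 60
>
> // Keep rows whose transaction date plus `old` months is still before today
> n.filter(add_months(col("transactiondate"), old) < lit(today)).
>   select(lit(today), col("transactiondate"), add_months(col("transactiondate"), old)).
>   collect.foreach(println)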
>
> [2016-03-27,2011-03-22,2016-03-22]
> [2016-03-27,2011-03-22,2016-03-22]
> [2016-03-27,2011-03-22,2016-03-22]
> [2016-03-27,2011-03-22,2016-03-22]
> [2016-03-27,2011-03-23,2016-03-23]
> [2016-03-27,2011-03-23,2016-03-23]
>
>
> Which seems to work. Any other suggestions will be appreciated.
>
> Thanks
>
>
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
>
>
> http://talebzadehmich.wordpress.com
>
>
>
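
The date comparison in the quoted filter can be checked without a cluster. What follows is only a minimal sketch in plain Scala using `java.time`, not the Spark code itself. The object `OldTransactions`, the helper `isOlderThan`, and the last two sample dates are hypothetical and chosen for illustration. It also assumes that `LocalDate.plusMonths` behaves like Spark's `add_months` for ordinary mid-month dates such as the ones in the output above.

```scala
import java.time.LocalDate

object OldTransactions {
  // True when txDate + months falls strictly before today, mirroring
  // add_months(col("transactiondate"), old) < lit(today) in the quoted filter.
  def isOlderThan(txDate: LocalDate, months: Int, today: LocalDate): Boolean =
    txDate.plusMonths(months.toLong).isBefore(today)

  def main(args: Array[String]): Unit = {
    // Fixed "today" taken from the sample output, so the run is reproducible.
    val today = LocalDate.parse("2016-03-27")

    // The first two dates appear in the quoted output; the last two are made up.
    val sample = Seq("2011-03-22", "2011-03-23", "2011-03-28", "2015-01-10")
      .map(s => LocalDate.parse(s))

    // Prints 2011-03-22 and 2011-03-23; the other two are not yet 60 months old.
    sample.filter(d => isOlderThan(d, 60, today)).foreach(println)
  }
}
```

Note that the comparison is strict: a transaction dated exactly 60 months before today is not selected, which matches the `<` in the quoted filter.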
