spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Dhaval Modi <dhavalmod...@gmail.com>
Subject Re: This simple UDF is not working!
Date Sat, 26 Mar 2016 06:07:19 GMT
Hi Mich,

You can try this:

val toDate = udf{(out:String, form: String) => {
val format = new SimpleDateFormat(s"$form");
Try(new Date(format.parse(out.toString()).getTime))match {
case Success(t) => Some(t)
case Failure(_) => None
}}};

Usage: src = src.withColumn(s"$columnName", toDate(src(s"$columnName"),
lit(s"dd/MM/yyyy")))


Regards,
Dhaval Modi
dhavalmodi24@gmail.com

On 26 March 2016 at 04:34, Mich Talebzadeh <mich.talebzadeh@gmail.com>
wrote:

> Hi Ted,
>
> I decided to take a short cut here. I created the map leaving date as it
> is (p(1)) below
>
> def CleanupCurrency (word : String) : Double = {
> return word.toString.substring(1).replace(",", "").toDouble
> }
> sqlContext.udf.register("CleanupCurrency", CleanupCurrency(_:String))
> val a = df.filter(col("Total") > "").map(p => Invoices(p(0).toString,
> p(1).toString, CleanupCurrency(p(2).toString),
> CleanupCurrency(p(3).toString), CleanupCurrency(p(4).toString)))
>
> //
> // convert this RDD to DF and create a Spark temporary table
> //
> a.toDF.registerTempTable("tmp")
>
> INSERT INTO TABLE <HIVE_TABLE>
> SELECT
>           INVOICENUMBER
>         ,
> TO_DATE(FROM_UNIXTIME(UNIX_TIMESTAMP(paymentdate,'dd/MM/yyyy'),'yyyy-MM-dd'))
> AS paymentdate
>         , NET
>         , VAT
>         , TOTAL
> FROM tmp
> """
> sql(sqltext)
>
> That works OK.
>
> If I want to find invoices with paymentdate > 6 months old I do
>
> sql("SELECT invoicenumber, paymentdate FROM test.t14 *WHERE
> months_between(FROM_unixtime(unix_timestamp(), 'yyyy-MM-dd'), paymentdate)*
> > 6 ORDER BY invoicenumber, paymentdate").collect.foreach(println)
> [360,2014-02-10]
> [361,2014-02-17]
>
> I still interested if I could do it using a UDF :)
>
>
>
>
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn * https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> <https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*
>
>
>
> http://talebzadehmich.wordpress.com
>
>
>
> On 25 March 2016 at 17:44, Ted Yu <yuzhihong@gmail.com> wrote:
>
>> Mich:
>> Please take a look at:
>> sql/core/src/test/scala/org/apache/spark/sql/DateFunctionsSuite.scala
>>
>> test("function to_date") {
>>
>> Remember to:
>> import org.apache.spark.sql.functions._
>>
>> On Fri, Mar 25, 2016 at 7:59 AM, Mich Talebzadeh <
>> mich.talebzadeh@gmail.com> wrote:
>>
>>> This works with sql
>>>
>>> sqltext = """
>>> INSERT INTO TABLE t14
>>> SELECT
>>>           INVOICENUMBER
>>>         ,
>>> TO_DATE(FROM_UNIXTIME(UNIX_TIMESTAMP(paymentdate,'dd/MM/yyyy'),'yyyy-MM-dd'))
>>> AS paymentdate
>>>         , NET
>>>         , VAT
>>>         , TOTAL
>>> FROM tmp
>>> """
>>> sql(sqltext)
>>>
>>>
>>> but not in UDF.  I want to convert it to correct date format  before
>>> writing it to table
>>>
>>> Thanks
>>>
>>>
>>>
>>> Dr Mich Talebzadeh
>>>
>>>
>>>
>>> LinkedIn * https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>> <https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*
>>>
>>>
>>>
>>> http://talebzadehmich.wordpress.com
>>>
>>>
>>>
>>> On 25 March 2016 at 14:54, Ted Yu <yuzhihong@gmail.com> wrote:
>>>
>>>> Do you mind showing body of TO_DATE() ?
>>>>
>>>> Thanks
>>>>
>>>> On Fri, Mar 25, 2016 at 7:38 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>>>>
>>>>> Looks like you forgot an import for Date.
>>>>>
>>>>> FYI
>>>>>
>>>>> On Fri, Mar 25, 2016 at 7:36 AM, Mich Talebzadeh <
>>>>> mich.talebzadeh@gmail.com> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> writing a UDF to convert  a string into Date
>>>>>>
>>>>>> def ChangeDate(word : String) : Date = {
>>>>>>      | return
>>>>>> TO_DATE(FROM_UNIXTIME(UNIX_TIMESTAMP(word),"dd/MM/yyyy"),"yyyy-MM-dd")
>>>>>>      | }
>>>>>> <console>:19: error: not found: type Date
>>>>>>
>>>>>> That code to_date.. works OK in sql but not here. It is complaining
>>>>>> about to_date?
>>>>>>
>>>>>> Any ideas will be appreciated.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>>
>>>>>> Dr Mich Talebzadeh
>>>>>>
>>>>>>
>>>>>>
>>>>>> LinkedIn * https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>>>>> <https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*
>>>>>>
>>>>>>
>>>>>>
>>>>>> http://talebzadehmich.wordpress.com
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>

Mime
View raw message