spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Gourav Sengupta <gourav.sengu...@gmail.com>
Subject Re: OrderBy Year and Month is not displaying correctly
Date Mon, 06 Jan 2020 09:35:07 GMT
or just use SQL, which is less verbose, easily readable, and takes care of
all such scenarios. But for some weird reason I have found that people
using data frame API's have a perception that using SQL is less
intelligent. But I think that using less effort to get better output can me
a measure of intelligence.

Regards,
Gourav Sengupta

On Mon, Jan 6, 2020 at 9:23 AM Enrico Minack <mail@enrico.minack.dev> wrote:

> The distinct transformation does not preserve order, you need to distinct
> first, then orderby.
>
> Enrico
>
>
> Am 06.01.20 um 00:39 schrieb Mich Talebzadeh:
>
> Hi,
>
> I am working out monthly outgoing etc from an account and I am using the
> following code
>
> import org.apache.spark.sql.expressions.Window
> val wSpec =
> Window.partitionBy(year(col("transactiondate")),month(col("transactiondate")))
> joint_accounts.
>       select(year(col("transactiondate")).as("Year")
>     , month(col("transactiondate")).as("Month")
>     , sum("moneyin").over(wSpec).cast("DECIMAL(10,2)").as("incoming Per
> Month")
>     , sum("moneyout").over(wSpec).cast("DECIMAL(10,2)").as("outgoing Per
> Month")).
>     *orderBy(year(col("transactiondate")),month(col("transactiondate"))).*
>     distinct.
>     show(1000,false)
>
> This shows as follows:
>
>
> |Year|Month|incoming Per Month|outgoing Per Month|
> +----+-----+------------------+------------------+
> |2019|9    |13958.58          |17920.31          |
> |2019|11   |4032.30           |4225.30           |
> |2020|1    |1530.00           |1426.91           |
> |2019|10   |10029.00          |10067.52          |
> |2019|12   |742.00            |814.49            |
> +----+-----+------------------+------------------+
>
>  however the orderby is not correct as I expect to see 2010 record and
> 2019 records in the order of year and month.
>
> Any suggestions?
>
> Thanks,
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn * https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> <https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*
>
>
>
> http://talebzadehmich.wordpress.com
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
>
>

Mime
View raw message