spark-user mailing list archives

From Eli Super <eli.su...@gmail.com>
Subject Re: FPGrowth Model is taking too long to generate frequent item sets
Date Mon, 06 Mar 2017 11:29:37 GMT
Hi

Try binning and/or feature engineering (smart feature selection, for
example).
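
As a rough illustration of what binning and feature selection could look
like before the transactions reach FPGrowth, here is a minimal plain-Python
sketch. The helper names (bin_transactions, keep_top_items), the item-to-bin
mapping, and the sample data are all hypothetical and are not part of Spark.
The duplicates are dropped after binning because Spark's FPGrowth expects
the items within a transaction to be unique.

```python
from collections import Counter


def bin_transactions(transactions, item_to_bin):
    """Map each item to a coarser bin and drop duplicates per transaction."""
    binned = []
    for items in transactions:
        # Items without a bin keep their own identity (an assumption).
        bins = {item_to_bin.get(item, item) for item in items}
        binned.append(sorted(bins))
    return binned


def keep_top_items(transactions, max_items):
    """Keep only the max_items items that occur in the most transactions."""
    counts = Counter(item for items in transactions for item in set(items))
    keep = {item for item, _ in counts.most_common(max_items)}
    filtered = [[i for i in items if i in keep] for items in transactions]
    # Transactions that end up empty carry no information, so drop them.
    return [items for items in filtered if items]


# Hypothetical sample data: three transactions over four raw items.
transactions = [["sku1", "sku2", "sku9"], ["sku2", "sku3"], ["sku1", "sku3"]]
item_to_bin = {"sku1": "phones", "sku2": "phones", "sku3": "cases"}

binned = bin_transactions(transactions, item_to_bin)
reduced = keep_top_items(binned, max_items=2)
```

Fewer distinct items generally means fewer candidate itemsets, which is
where most of the time tends to go at very low support thresholds.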

Good luck

On Mon, Mar 6, 2017 at 6:56 AM, Raju Bairishetti <raju@apache.org> wrote:

> Hi,
>   I am new to Spark MLlib. I am using the FPGrowth model to find related
> items.
>
> The number of transactions is 63K, and the total number of items across
> all transactions is 200K.
>
> I am running the FPGrowth model to generate frequent itemsets, and it is
> taking a huge amount of time. *I am setting the min-support value such
> that each item appears in at least ~(number of items)/(number of
> transactions) transactions.*
>
> It takes a lot of time if I say that an item only has to appear at least
> once in the database.
>
> If I set min-support to a higher value, the output is very small.
>
> Could anyone please guide me on how to reduce the execution time for
> generating frequent itemsets?
>
> ------
> Thanks,
> Raju Bairishetti,
> www.lazada.com
>
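
For the min-support setting described in the quoted message, it may help to
see the arithmetic. Spark's FPGrowth takes minSupport as a fraction of the
transactions, so an absolute count has to be divided by the number of
transactions. The sketch below uses the figures from the question (63K
transactions, 200K items); the helper names are hypothetical, and reading
"200K" as the total across all transactions is an assumption.

```python
import math


def min_support_for_count(min_count, num_transactions):
    """Fraction of transactions that corresponds to an absolute count."""
    return min_count / num_transactions


def min_count_for_support(min_support, num_transactions):
    """Smallest number of transactions that satisfies a support fraction."""
    return math.ceil(min_support * num_transactions)


num_transactions = 63_000  # from the question
total_items = 200_000      # from the question (interpretation assumed)

# (number of items) / (number of transactions) is a little over 3.
items_per_transaction = total_items / num_transactions

# "At least once" and "at least 3 times" expressed as support fractions.
support_once = min_support_for_count(1, num_transactions)
support_three = min_support_for_count(3, num_transactions)
```

Both fractions are far below 0.001, so almost every item counts as frequent
and the number of candidate itemsets can grow very quickly, which is
consistent with the long run times described above.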
