spark-user mailing list archives

From Ron Gonzalez <zlgonza...@yahoo.com.INVALID>
Subject Re: Classifier for Big Data Mining
Date Tue, 21 Jul 2015 23:20:09 GMT
I'd use Random Forest. It will usually generalize better than a single 
decision tree. There are also a number of things you can do with RF that 
allow you to train on samples of the massive data set and then just 
average over the resulting models...
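
To make the "train on samples, then average" idea concrete, here is a minimal 
sketch in plain Python. It is not Spark code and not a full Random Forest: 
the one-split "stump" learner, the synthetic data, and all names 
(make_data, train_stump, train_on_samples, predict) are assumptions made up 
for illustration. For classification, "averaging" is taken here to mean a 
majority vote over the models.

```python
import random
from collections import Counter


def make_data(n, seed=0):
    """Hypothetical synthetic data: label is 1 when the first feature > 0.5."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = (rng.random(), rng.random())
        rows.append((x, 1 if x[0] > 0.5 else 0))
    return rows


def train_stump(rows):
    """Fit a one-split decision stump on (features, label) rows."""
    best = None
    for f in range(len(rows[0][0])):
        for x, _ in rows:
            t = x[f]  # candidate threshold taken from the sample itself
            left = [y for xs, y in rows if xs[f] <= t]
            right = [y for xs, y in rows if xs[f] > t]
            if not left or not right:
                continue
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    if best is None:
        # No usable split: fall back to the majority label of the sample.
        majority = Counter(y for _, y in rows).most_common(1)[0][0]
        return lambda x: majority
    _, f, t, l_lab, r_lab = best
    return lambda x: l_lab if x[f] <= t else r_lab


def train_on_samples(rows, n_models=15, sample_size=50, seed=0):
    """Train each model on its own small random sample of the full data set."""
    rng = random.Random(seed)
    return [train_stump(rng.sample(rows, sample_size)) for _ in range(n_models)]


def predict(models, x):
    """Combine the models by majority vote."""
    votes = Counter(m(x) for m in models)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    data = make_data(2000, seed=1)
    models = train_on_samples(data)
    print(predict(models, (0.9, 0.1)), predict(models, (0.1, 0.9)))
```

Each model only ever sees 50 rows, so training cost per model does not grow 
with the size of the full data set, and the models can be trained 
independently of each other. In Spark itself you would more likely start 
from MLlib's built-in random forest than from hand-rolled code like this.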

Thanks,
Ron

On 07/21/2015 02:17 PM, Olivier Girardot wrote:
> It depends on your data and, I guess, the time/performance goals you have 
> for both training and prediction, but for a quick answer: yes :)
>
> 2015-07-21 11:22 GMT+02:00 Chintan Bhatt 
> <chintanbhatt.ce@charusat.ac.in>:
>
>     Which classifier is useful for mining massive datasets in Spark?
>     Is Decision Tree a good choice in terms of scalability?
>
>     -- 
>     CHINTAN BHATT <http://in.linkedin.com/pub/chintan-bhatt/22/b31/336/>
>     Assistant Professor,
>     U & P U Patel Department of Computer Engineering,
>     Chandubhai S. Patel Institute of Technology,
>     Charotar University of Science And Technology (CHARUSAT),
>     Changa-388421, Gujarat, INDIA.
>     http://www.charusat.ac.in
>     _Personal Website_: https://sites.google.com/a/ecchanga.ac.in/chintan/
>
>

