mahout-user mailing list archives

From Stuti Awasthi
Subject Random Forest Implementation training is too slow for 2 GB of data
Date Fri, 11 Jul 2014 10:53:16 GMT
Hi all,

I have about 2 GB of data and tried to run Random Forest (RF) with number of trees = 10 and
maxsplitsize set to 90 MB. The execution takes too long.
I also tried number of trees = 2; it takes less time, but the accuracy is below 50%.
If I use less data with a greater number of trees, the accuracy is above 90%.
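
For a sense of scale, here is a rough sketch of the arithmetic behind these settings. It assumes "2 GB" means 2048 MB and that the input is cut into splits purely by the maximum split size, ignoring block boundaries and file layout. The function name `estimate_num_splits` is made up for illustration and is not part of Mahout or Hadoop. How the trees are then distributed across the resulting map tasks depends on the builder in use, which the message does not state.

```python
import math


def estimate_num_splits(data_size_mb: float, max_split_size_mb: float) -> int:
    """Estimate how many input splits a dataset is cut into.

    Hypothetical helper: assumes splits are determined only by the
    maximum split size, which is a simplification.
    """
    if data_size_mb <= 0 or max_split_size_mb <= 0:
        raise ValueError("sizes must be positive")
    return math.ceil(data_size_mb / max_split_size_mb)


# Values taken from the message; 2 GB is assumed to be 2048 MB.
data_size_mb = 2 * 1024
max_split_size_mb = 90

num_splits = estimate_num_splits(data_size_mb, max_split_size_mb)
print(num_splits)  # 23
```

Under these assumptions the job would run about 23 map tasks, so the split size affects both the degree of parallelism and the amount of data each task sees.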

Is there any tuning that would make it run faster with an optimal number of trees for accuracy > 80%?

Please suggest.

Stuti Awasthi



