spark-user mailing list archives

From Charles Li <lee.apache.sp...@gmail.com>
Subject Need some instruction on gbdt
Date Tue, 08 Mar 2016 11:56:47 GMT
Hi guys,

I am training a GBDT model (100 trees, each with depth 7), and some
interesting things happened.
First, the storage looks like this:
[image: Inline image 1]

At the very beginning, the count stage looks like this:
[image: Inline image 2]
Then the count step becomes slower because it reads more input data:
[image: Inline image 3]

The input for the stage grows to 980 GB!

I am using Spark 1.4.1. Does anyone have any idea why the input becomes
bigger after every iteration?
