Hi all,
I have a problem with LogisticRegressionWithSGD. When I train on a data set
with one variable (the amount of an item) plus an intercept, I get weights of
(0.4021, 207.1749) for the two features, respectively. This doesn't make sense
to me, because when I run a logistic regression on the same data in SAS I get
these weights: (2.6604, 0.000245).
The range of this variable is 0 to 59102, with a mean of 1158.
The problem arises when I try to compute the probability for each user in the
data set: in many cases the probability comes out as zero or near zero,
because when Spark evaluates exp(1*(0.4021+(207.1749)*amount)) the result is a
huge number, in fact infinity for Spark.
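[For reference, the overflow is easy to reproduce outside Spark. The sketch below is plain Python, not Spark code; the weight ordering, the standard deviation, and the stable-sigmoid workaround are all assumptions on my part, using the numbers quoted above.]

```python
import math

# Weights quoted above; treating the first as intercept and the second as
# the coefficient is an assumption.
intercept, coef = 0.4021, 207.1749
amount = 59102  # upper end of the 0..59102 range mentioned above

z = intercept + coef * amount
# math.exp(z) raises OverflowError here: z is about 1.2e7, and exp of a
# double overflows for arguments above roughly 709.

def sigmoid(z):
    """Numerically stable logistic 1 / (1 + exp(-z)): never calls exp on a
    large positive argument, so it cannot overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

# Standardizing the feature before training keeps z in a safe range. The
# std below is a made-up placeholder; compute the real mean/std from the
# data (e.g. with MLlib's StandardScaler).
mean, std = 1158.0, 10000.0
amount_scaled = (amount - mean) / std
```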
How should I treat this variable? And why does this happen?
Thanks,
Franco Barrientos
Data Scientist
Málaga #115, Of. 1003, Las Condes.
Santiago, Chile.
(+562)29699649
(+569)76347893
franco.barrientos@exalitica.com
www.exalitica.com
