mahout-user mailing list archives

From Mustafa Elbehery
Subject TelephoneCall Logistic Regression Example
Date Tue, 19 May 2015 10:34:20 GMT
Hi Folks,

I have a question regarding the *TelephoneCall* class in the example package.
When we load the training data from the CSV into the training matrix, we add
a weight for each feature field in the feature vector.

In the TelephoneCall code, the weight is added as *Log(v)*, the logarithm of
the value rather than the raw value. I cannot understand why. Please find the
code snippet below:

case "age": {
  double v = Double.parseDouble(fieldValue);
  featureEncoder.addToVector(name, Math.log(v), vector);
  break;
}

However, for the balance field, the value is first clamped to -2000 if it is
below that threshold, and then shifted before the logarithm is taken, like this:

case "balance": {
  double v;
  v = Double.parseDouble(fieldValue);
  if (v < -2000) {
    v = -2000;
  }
  featureEncoder.addToVector(name, Math.log(v + 2001) - 8, vector);
  break;
}

Can anyone explain the logic? I am trying to run it on my own dataset, and I
am using this example as a reference.
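
For reference, here is a small standalone sketch of what the two expressions
evaluate to for a few inputs. It does not use Mahout at all; the class and
method names (FeatureScaling, scaleAge, scaleBalance) are hypothetical and only
mirror the arithmetic in the snippets above.

```java
public class FeatureScaling {
    // Mirrors the "age" case: natural log of the raw value.
    static double scaleAge(double age) {
        return Math.log(age);
    }

    // Mirrors the "balance" case: clamp at -2000, shift by 2001 so the
    // argument of log is never below 1, then subtract the constant 8.
    static double scaleBalance(double balance) {
        double v = Math.max(balance, -2000);
        return Math.log(v + 2001) - 8;
    }

    public static void main(String[] args) {
        double[] ages = {18, 40, 90};
        for (double a : ages) {
            System.out.printf("age %.0f -> %.4f%n", a, scaleAge(a));
        }
        double[] balances = {-5000, -2000, 0, 1500, 100000};
        for (double b : balances) {
            System.out.printf("balance %.0f -> %.4f%n", b, scaleBalance(b));
        }
    }
}
```

With the clamp, any balance at or below -2000 gives log(1) - 8 = -8, so the
shift by 2001 appears to be there to keep the argument of the logarithm
positive.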

I would also like to know why we use a hashed vector; I do not understand the
idea behind it.
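
To make the question concrete, this is my rough understanding of the general
feature hashing idea, written as a plain Java sketch. It is an assumption on my
part and not Mahout's implementation: HashedVectorSketch and hashInto are
hypothetical names, and String.hashCode() stands in for whatever hash function
the real encoders use.

```java
public class HashedVectorSketch {
    // Hash the feature name to a slot in a fixed-size array and add the
    // weight there. No dictionary from feature name to index is needed,
    // but two different names can collide in the same slot.
    static void hashInto(String name, double weight, double[] vector) {
        int index = Math.floorMod(name.hashCode(), vector.length);
        vector[index] += weight;
    }

    public static void main(String[] args) {
        double[] vector = new double[100]; // fixed size chosen up front
        hashInto("age", Math.log(35), vector);
        hashInto("balance", Math.log(1500 + 2001) - 8, vector);
        for (int i = 0; i < vector.length; i++) {
            if (vector[i] != 0.0) {
                System.out.println(i + " -> " + vector[i]);
            }
        }
    }
}
```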


Mustafa Elbehery
EIT ICT Labs Master School
skype: mustafaelbehery87
