mahout-user mailing list archives

From "Lee, Howon" <ho...@thebackplane.com>
Subject Re: churn analysis
Date Thu, 25 Jul 2013 17:49:22 GMT
On that subject, does anyone have any resources re: feature engineering for
churn analysis?


On Thu, Jul 25, 2013 at 4:12 AM, Sayed Seliman <sayed.seliman@gr-ci.com> wrote:

> Hi,
>
> Mahout is a customer requirement.
> Can I use logistic regression with Mahout?
> How do I have to prepare my data so that it can be processed with logistic
> regression?
>
> Thanks
>
>
> -----Original Message-----
> From: Fernando Fernández [mailto:fernando.fernandez.gonzalez@gmail.com]
> Sent: Thursday, 25 July 2013 07:20
> To: user@mahout.apache.org
> Subject: Re: churn analysis
>
> If you don't know where to start, I would recommend starting with something
> more conventional than an HMM, which can be tricky to fully understand and
> explain. A logistic regression model can perform very well if the predictors
> are built with care. I also wouldn't start with Mahout unless it is a
> requirement from a client (some clients are so thrilled about "big data"
> that they want to use Mahout even if it's overkill for most predictive
> analytics tasks...). You will probably not need more than 100k-200k records
> to build a pretty good model. An undersampling scheme can also be good for
> the model (not necessary, but it won't hurt) and will lead you to that
> sample size anyway.
>
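A rough sketch of the undersampling idea mentioned above, in plain Python rather than Mahout. The record layout (a list of `(features, label)` pairs with label 1 for churners) and the function name `undersample` are assumptions made for illustration only.

```python
import random

def undersample(records, ratio=1.0, seed=0):
    """Keep every churner (label 1) and a random subset of non-churners (label 0).

    ratio is the number of non-churners kept per churner.
    """
    rng = random.Random(seed)
    churners = [r for r in records if r[1] == 1]
    stayers = [r for r in records if r[1] == 0]
    # Never ask for more non-churners than exist.
    keep = min(len(stayers), int(round(ratio * len(churners))))
    sample = churners + rng.sample(stayers, keep)
    rng.shuffle(sample)
    return sample

# Hypothetical data: 1,000 customers, 5% of them churners.
records = [((i,), 1 if i % 20 == 0 else 0) for i in range(1000)]
balanced = undersample(records, ratio=1.0)
```

Note that a model trained on an undersampled set sees a higher churn rate than the real population has, so its raw predicted probabilities need recalibrating before they are read as actual churn probabilities.
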
> If you need to go for Mahout, it does include an SGD implementation of
> logistic regression.
>
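To make "SGD for logistic regression" concrete, here is a minimal sketch of the technique in plain Python. It is not Mahout's API; the names `train_sgd` and `predict`, the learning rate and the toy data are all assumptions for illustration.

```python
import math
import random

def sigmoid(z):
    # Logistic function, written to avoid overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_sgd(rows, labels, epochs=200, lr=0.1, seed=0):
    """Fit weights (bias first) by stochastic gradient descent on log loss."""
    rng = random.Random(seed)
    w = [0.0] * (len(rows[0]) + 1)
    order = list(range(len(rows)))
    for _ in range(epochs):
        rng.shuffle(order)              # visit examples in random order
        for i in order:
            x = [1.0] + list(rows[i])   # prepend the bias input
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, x)))
            err = labels[i] - p         # gradient of the log-likelihood w.r.t. z
            for j, xj in enumerate(x):
                w[j] += lr * err * xj   # one update per training example
    return w

def predict(w, row):
    x = [1.0] + list(row)
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)))

# Hypothetical toy data: a single feature, churn when the feature is large.
rows = [(0.0,), (0.2,), (0.4,), (0.6,), (0.8,), (1.0,)]
labels = [0, 0, 0, 1, 1, 1]
w = train_sgd(rows, labels)
```

Because the weights are updated one example at a time, the same loop works on data that is streamed rather than held in memory, which is the usual reason for choosing SGD on large data sets.
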
> The key point in building a good churn model, though, is how you build the
> predictor variables; once those are right, any binary classification model
> will do the trick.
>
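As one possible illustration of building predictor variables, the sketch below turns a customer's raw event log into a few recency and frequency features. The event schema (`(date, kind)` tuples), the `"complaint"` event kind, the 30-day window and the feature names are all assumptions; which features carry signal depends on the domain.

```python
from datetime import date

def build_features(events, as_of, window_days=30):
    """Summarise one customer's event history up to the as_of date.

    events is a list of (date, kind) tuples. Events after as_of are ignored
    so that the features never look into the future.
    """
    past = [(d, k) for d, k in events if d <= as_of]
    recent = [(d, k) for d, k in past if (as_of - d).days < window_days]
    last = max((d for d, _ in past), default=None)
    return {
        "days_since_last_event": (as_of - last).days if last else None,
        "events_in_window": len(recent),
        "complaints_in_window": sum(1 for _, k in recent if k == "complaint"),
        "total_events": len(past),
    }

# Hypothetical event log for one customer.
events = [
    (date(2013, 5, 2), "login"),
    (date(2013, 7, 1), "complaint"),
    (date(2013, 7, 20), "login"),
    (date(2013, 8, 3), "login"),   # after as_of, so ignored
]
features = build_features(events, as_of=date(2013, 7, 24))
```

Computing every feature strictly from data available before the prediction date matters more than the choice of classifier: features that leak information from after the churn date make a model look far better in testing than it will be in use.
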
>
> 2013/7/24 <simon.2.thompson@bt.com>
>
> > I've not used Mahout to do it, but in the past colleagues have used an
> > HMM to discover customers who are in an "about to churn" state. This was
> > used to populate a target list for winback intervention (they're about to
> > churn, so contact them and offer something - or just help - to keep
> > them). I tried the Mahout HMM earlier in the year, but got discouraged by
> > some odd behaviour which I have still not managed to delve into.
> >
> > The problem that we saw with churn analysis for our domain was that
> > most churners leave with no event on their account in the recent past.
> > Essentially there are external factors that are generating churn over
> > the whole population (competitor offers, demographics, economics)
> > which mean that the domain model is not accessible from the data. So,
> > while a much better than "random" predictor can be built, it only
> > barely justified its operating cost, and is sufficiently far from a
> > conclusive knockdown winner to allow homebrew.spreadsheet.witchcraft
> > alternatives to pop up and be given air time by people not familiar with
> > the idea that if you flip 1000 coins in the air at once some of them are
> > going to keep coming up heads for a bit. One way round this is "more
> > data, better data", which is kinda where I came in on Mahout and HMMs.
> >
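The coin-flip remark can be checked with a short simulation. This is only a sketch: the choice of five flips and the function name `lucky_coins` are assumptions, and the coins stand in for ad hoc rules that happen to look predictive over a short period.

```python
import random

def lucky_coins(n_coins=1000, n_flips=5, seed=0):
    """Count fair coins that come up heads on every one of n_flips flips."""
    rng = random.Random(seed)
    return sum(
        all(rng.random() < 0.5 for _ in range(n_flips))
        for _ in range(n_coins)
    )

# Each coin has a 1/32 chance of five heads in a row, so about 31 of 1,000
# coins are expected to look "consistently right" purely by chance.
expected = 1000 * 0.5 ** 5
observed = lucky_coins()
```

The same arithmetic applies to any large pool of candidate heuristics: some will have a good recent track record with no predictive value at all, which is why a holdout period is needed before trusting one.
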
> > So, my suggestion would be:
> >
> > - Look at your data: do your churners have events in an actionable
> > period (this depends on your domain) that could be the basis of a
> > signal? If there are enough of them in this category to power a
> > business case based on intervention and win back, you're on... if not,
> > then more data, better data is needed.
> > - Are there strong correlations between the last event and the churn?
> > If so, use a decision tree or similar to separate churn prospects from
> > stable customers - if you get a good predictor there is no need to do
> > more; if not, then...
> > - Try an HMM; it could help you find groups of sequences of actions that
> > lead to churning (repeated contacts, escalations, resorting to letter
> > writing etc.). But check that Mahout's one is sound and works for you (I
> > am not confident that I did enough work to say that my problems
> > weren't a case of "problem between screen and chair", so if you get
> > things working then superduper!)
> >
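The first check in the list above (do churners have events in an actionable period?) can be sketched as a single pass over historical churners. The data layout, the 60-day window and the function name `share_with_recent_event` are assumptions for illustration; the right window depends on the domain.

```python
from datetime import date, timedelta

def share_with_recent_event(churners, window_days=60):
    """Fraction of churners with at least one event in the window before churn.

    churners maps a customer id to (churn_date, [event dates]).
    """
    if not churners:
        return 0.0
    hits = 0
    for churn_date, event_dates in churners.values():
        start = churn_date - timedelta(days=window_days)
        # Count the customer once if any event falls inside the window.
        if any(start <= d < churn_date for d in event_dates):
            hits += 1
    return hits / len(churners)

# Hypothetical churners.
churners = {
    "a": (date(2013, 6, 30), [date(2013, 6, 1)]),    # event 29 days before churn
    "b": (date(2013, 6, 30), [date(2012, 12, 1)]),   # only an old event
    "c": (date(2013, 7, 15), []),                    # no events at all
    "d": (date(2013, 7, 15), [date(2013, 7, 10)]),   # event 5 days before churn
}
share = share_with_recent_event(churners)
```

If this share is small, most churners leave without any recent signal on their account, and event-based models have little to work with however they are built.
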
> > Hope that helps you,
> >
> > Simon
> >
> >
> >
> > ________________________________________
> > From: Sayed Seliman [sayed.seliman@gr-ci.com]
> > Sent: 24 July 2013 21:37
> > To: user@mahout.apache.org
> > Subject: churn analysis
> >
> > Hi,
> >
> >
> >
> > What are your experiences in building a churn analysis system with Mahout?
> >
> > What do you suggest implementing?
> >
> > Any success stories implementing a churn analysis system with Mahout?
> >
> >
> >
> > thanks
> >
>
>
