mahout-user mailing list archives

From Mario Levitin <>
Subject Re: Understanding LogLikelihood Similarity
Date Wed, 30 Apr 2014 21:31:14 GMT
Hi Ted,
I have read the paper. I understand the "Likelihood Ratio for Binomial
Distributions" part.
However, I cannot see the connection between this part and the contingency
table.

In order to calculate Likelihood Ratio for two Binomial Distributions you
need the values: p, p1, p2, k1, k2, n1, n2.
But the information contained in the contingency table is different from
these values. So, again, I do not understand how the information contained
in the contingency table is linked to the Likelihood Ratio for Binomial
Distributions.
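
To make the question concrete, here is a minimal sketch of one possible
mapping, assuming each row of a 2x2 contingency table is treated as one
binomial sample. The names k11, k12, k21, k22 and both function names are
hypothetical, chosen only for this illustration, and I am not claiming this
is how the paper or Mahout sets it up.

```python
import math


def binomial_log_likelihood(p, k, n):
    """Log of p^k * (1 - p)^(n - k).

    The binomial coefficient is left out because it cancels in the ratio.
    Terms with a zero count are skipped, so 0 * log(0) is treated as 0.
    """
    result = 0.0
    if k > 0:
        result += k * math.log(p)
    if n - k > 0:
        result += (n - k) * math.log(1.0 - p)
    return result


def llr_from_table(k11, k12, k21, k22):
    """Log-likelihood ratio statistic from the cells of a 2x2 table.

    Assumed mapping (hypothetical): row 1 is the first binomial sample,
    row 2 is the second.
    """
    k1, n1 = k11, k11 + k12          # successes and trials in row 1
    k2, n2 = k21, k21 + k22          # successes and trials in row 2

    # Maximum-likelihood estimates; an empty row contributes nothing.
    p1 = k1 / n1 if n1 > 0 else 0.0
    p2 = k2 / n2 if n2 > 0 else 0.0
    total = n1 + n2
    p = (k1 + k2) / total if total > 0 else 0.0   # pooled estimate

    return 2.0 * (
        binomial_log_likelihood(p1, k1, n1)
        + binomial_log_likelihood(p2, k2, n2)
        - binomial_log_likelihood(p, k1, n1)
        - binomial_log_likelihood(p, k2, n2)
    )


if __name__ == "__main__":
    # Both rows have the same success rate, so the statistic is 0.
    print(llr_from_table(10, 10, 10, 10))
    # Rows with different success rates give a positive value.
    print(llr_from_table(10, 5, 3, 20))
```

Under this assumed mapping, all seven values (p, p1, p2, k1, k2, n1, n2) come
from the four cells of the table.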

In order to find the similarity between two users I tend to think of the
boolean preferences of user1 as a sample from a binomial distribution and
the boolean preferences of user2 as another sample from a binomial
distribution. Then use the LLR to assess how likely it is that these
distributions are the same. But I don't think this is correct, since this
calculation does not use the contingency table.
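
For comparison, a minimal sketch of how a 2x2 contingency table could be
built from two users' boolean preferences, assuming each user is represented
by the set of items they prefer and that the total number of items is known.
The function name and the cell labels are hypothetical, used only for this
illustration.

```python
def contingency_table(items_user1, items_user2, num_items):
    """Count item co-occurrence for two users with boolean preferences.

    Returns (k11, k12, k21, k22) where, under the labelling assumed here:
      k11 = items both users prefer
      k12 = items only user1 prefers
      k21 = items only user2 prefers
      k22 = items neither user prefers
    """
    a = set(items_user1)
    b = set(items_user2)
    k11 = len(a & b)
    k12 = len(a - b)
    k21 = len(b - a)
    k22 = num_items - k11 - k12 - k21
    return k11, k12, k21, k22


if __name__ == "__main__":
    user1 = {"a", "b", "c"}
    user2 = {"b", "c", "d"}
    # 10 items in the catalogue overall.
    print(contingency_table(user1, user2, 10))  # (2, 1, 1, 6)
```

With this layout, the first row holds the items user1 prefers and the second
row the items user1 does not prefer, each split by whether user2 prefers
them. The two samples being compared would then be the two rows of the table,
rather than the two users' preference lists taken separately.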

I hope my question is clear.

On Mon, Apr 28, 2014 at 2:41 AM, Ted Dunning <> wrote:

> Excellent.  Look forward to hearing your reactions.
> On Mon, Apr 28, 2014 at 1:14 AM, Mario Levitin <> wrote:
> > Not yet, but I will.
> >
> > >
> > > Have you read my original paper on the topic of LLR?  It explains the
> > > connection with chi^2 measures of association.
> >
