mahout-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Reg:-Integrating Mahout with Solr
Date Sun, 02 Apr 2017 06:55:21 GMT
Hundreds of users are going to generate a really, really tiny amount of
data (relative to the normal amounts that recommenders get to see).

The problem is that even hundreds of hyperactive users who issue thousands
of queries are only going to generate a tiny amount of data per document.
You will need roughly 20 positive interactions per document to get decent
performance. If you have a thousand documents, that means you will need an
absolute minimum of 20 thousand engagements, which is already implausible.
Because the distribution will be very lopsided, you probably need 10-100x
more than that.

The final result is that your hundreds of users would likely need to issue
thousands of queries. Each.
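
The arithmetic behind that estimate can be sketched roughly as below. This is only a back-of-the-envelope illustration: the 20 interactions per document, the thousand documents, and the 10-100x skew factor come from the reasoning above, while the user count of 200 and the assumption that each query yields one positive engagement are hypothetical values picked for the example.

```python
import math

def required_engagements(num_docs, per_doc=20, skew_factor=1):
    """Total positive interactions needed across the whole corpus."""
    return num_docs * per_doc * skew_factor

def queries_per_user(total_engagements, num_users, engagements_per_query=1.0):
    """Queries each user must issue, assuming a fixed yield per query."""
    return math.ceil(total_engagements / (num_users * engagements_per_query))

NUM_DOCS = 1000   # "a thousand documents"
NUM_USERS = 200   # hypothetical stand-in for "hundreds of users"

baseline = required_engagements(NUM_DOCS)               # uniform distribution
low = required_engagements(NUM_DOCS, skew_factor=10)    # lopsided, 10x
high = required_engagements(NUM_DOCS, skew_factor=100)  # lopsided, 100x

print("baseline engagements:", baseline)
print("with skew:", low, "to", high)
print("queries per user:",
      queries_per_user(low, NUM_USERS), "to",
      queries_per_user(high, NUM_USERS))
```

With a lower yield per query (most queries do not end in a positive interaction), the per-user numbers only get larger.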

That seems like a lot.

You should get good results for a small minority of documents at smaller
data volumes.




On Sat, Apr 1, 2017 at 11:37 PM, arun abraham <arunabraham100@gmail.com>
wrote:

> Hi Ted,
>
> Each document to be indexed by Solr has fairly large content, and 100+
> users will be searching within it (once the Solr search tool goes live).
> Kindly guide me on the integration steps for Mahout with Solr (with respect
> to all the stats mentioned).
>
> Thanks and Regards,
> Arun
>
> On 2 April 2017 at 11:59, Ted Dunning <ted.dunning@gmail.com> wrote:
>
> > Arun,
> >
> > That's good news.
> >
> > The second limitation will be how much data you have for each document
> > and whether you have a good measure of how engaged users are with
> > documents.
> >
> >
> >
> > On Sat, Apr 1, 2017 at 6:48 PM, arun abraham <arunabraham100@gmail.com>
> > wrote:
> >
> > > Hi Ted,
> > >
> > > Thanks for the reply.
> > >
> > > I understand, Ted: to get good, effective results, a larger set of
> > > documents/index is required.
> > >
> > > For all the Solr-related functionality and search, I used ~100 docs
> > > (path pointing to my local system) to index and set things up. This is
> > > only for testing and implementing.
> > >
> > > Once the configuration and high-level testing are done, the
> > > configuration will be changed so that the document path points to the
> > > LAN location where we have a large collection of documents for indexing
> > > and high-level testing.
> > >
> > > It won't be a problem for me to use the LAN path for configuration and
> > > indexing. I can use the larger document base.
> > >
> > > Thanks and Regards,
> > > Arun
> > >
> > > On 2 April 2017 at 07:00, Ted Dunning <ted.dunning@gmail.com> wrote:
> > >
> > > > On Sat, Apr 1, 2017 at 6:21 PM, arun abraham <arunabraham100@gmail.com>
> > > > wrote:
> > > >
> > > > > As a first step I am trying to recommend a minimum of two documents
> > > > > (as my Solr document index is ~100 docs).
> > > > >
> > > >
> > > > This is kind of weird.
> > > >
> > > > Can you say why you have so very few documents?
> > > >
> > > > There may be something special going on that will make this work
> > > > better or worse.
> > > >
> > > > I have seen people use indicator-based recommendations for ad
> > > > targeting where they had several thousand options, but haven't seen
> > > > anything with only 100 options.
> > > >
> > >
> >
>
