mahout-user mailing list archives

From Sean Owen <sro...@gmail.com>
Subject Re: Mahout performance issues
Date Thu, 01 Dec 2011 15:12:08 GMT
You can 'tickle' the cache asynchronously if you like.
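
A rough sketch of what asynchronous warming could look like, under some assumptions: the class name CacheWarmer, the plain map used as a cache, and the LongFunction standing in for the recommender are all hypothetical and are not Mahout API. The idea is only that a background pool requests recommendations for a list of user IDs so that later, real requests find the results already computed.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.LongFunction;

// Hypothetical helper, not part of Mahout: warms a result cache in the background.
class CacheWarmer {
    // Stand-in for whatever caching layer is in use (assumption: keyed by user ID).
    private final Map<Long, List<Long>> cache = new ConcurrentHashMap<>();
    // Stand-in for the expensive recommendation call.
    private final LongFunction<List<Long>> recommender;

    CacheWarmer(LongFunction<List<Long>> recommender) {
        this.recommender = recommender;
    }

    // Returns the cached result, computing it at most once per user ID.
    List<Long> recommend(long userId) {
        return cache.computeIfAbsent(userId, id -> recommender.apply(id));
    }

    // Submits one warming task per user ID; the returned future completes when all are done.
    CompletableFuture<Void> warmAsync(List<Long> userIds, ExecutorService pool) {
        CompletableFuture<?>[] tasks = userIds.stream()
                .map(id -> CompletableFuture.runAsync(() -> recommend(id), pool))
                .toArray(CompletableFuture[]::new);
        return CompletableFuture.allOf(tasks);
    }

    int cachedEntries() {
        return cache.size();
    }

    public static void main(String[] args) {
        CacheWarmer warmer = new CacheWarmer(id -> List.of(id * 10, id * 10 + 1));
        ExecutorService pool = Executors.newFixedThreadPool(2);
        warmer.warmAsync(List.of(1L, 2L, 3L), pool).join();
        pool.shutdown();
        System.out.println("cached entries: " + warmer.cachedEntries());
    }
}
```

In a real deployment the warming call would not be joined on the request thread; the join here is only so the demo can print the result.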

I am still not clear on why you are doing so many item-item similarity
calculations. Your change ought to let you do 1, or 10, or 100 per
calculation if you like. That, we know, is fast. And a few hundred
similarities should start to give reasonable recommendations.
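
To make the tradeoff concrete, here is a minimal sketch of capping the candidate set by random sampling. It is not the code of SamplingCandidateItemsStrategy or of the pastebin change; the class CandidateSampler and the parameter maxCandidates are hypothetical names. The point is only that the number of items passed on to the similarity step, and so the work done per recommendation, is bounded by a number you choose.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of Mahout: bounds the candidate set by sampling.
class CandidateSampler {

    // Returns at most maxCandidates item IDs, chosen uniformly at random without replacement.
    static List<Long> sample(List<Long> candidates, int maxCandidates, Random rng) {
        if (maxCandidates < 0) {
            throw new IllegalArgumentException("maxCandidates must be >= 0");
        }
        // Small candidate sets are passed through unchanged (as a copy).
        if (candidates.size() <= maxCandidates) {
            return new ArrayList<>(candidates);
        }
        // Shuffle a copy so the caller's list is not modified, then keep a prefix.
        List<Long> copy = new ArrayList<>(candidates);
        Collections.shuffle(copy, rng);
        return new ArrayList<>(copy.subList(0, maxCandidates));
    }

    public static void main(String[] args) {
        List<Long> all = new ArrayList<>();
        for (long i = 0; i < 10_000; i++) {
            all.add(i);
        }
        List<Long> sampled = sample(all, 100, new Random());
        System.out.println("candidates kept: " + sampled.size());
    }
}
```

With maxCandidates set to 1, 10, or 100, the downstream similarity work shrinks accordingly, at the cost of possibly missing some good candidates.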

What is preventing you from making this tradeoff (with your change)?
Yes, it is essential for reasonable performance.

On Thu, Dec 1, 2011 at 3:06 PM, Daniel Zohar <dissoman@gmail.com> wrote:

> Hi Manuel,
> I haven't got to the point where CachingItemSimilarity kicks in. That is, I
> would have to run a lot of recommendations in order to get a real benefit
> from it. I would first like to optimize the 'cold start' so that it at least
> serves in a reasonable time. Usually a cache is used to prevent repeated
> calculations, but personally I don't think it's a replacement for optimized
> performance. Don't you agree?
>
> Also, I will try to profile the app now as you suggest and send the results
> asap.
>
> Thanks!
>
> On Thu, Dec 1, 2011 at 4:56 PM, Manuel Blechschmidt <
> Manuel.Blechschmidt@gmx.de> wrote:
>
> > Hi Daniel,
> > actually you are running the profiler inside Tomcat. You should take a
> > snapshot and then drill down to the functions where the actual
> > recommendation takes place. The current screenshots also contain some
> > profiles from Tomcat threads, which are sleeping a lot and therefore
> > account for a lot of time.
> >
> > Furthermore, the screenshots do not show how often the
> > different functions are called.
> >
> > You have to profile multiple requests in isolation. The
> > CachingItemSimilarity gets filled, so it should get faster and faster.
> >
> > On 01.12.2011, at 15:11, Daniel Zohar wrote:
> >
> > > @Manuel thanks for the tips. I have installed VisualVM, and the
> > > following are the results.
> > > I took two samples:
> > > - With the optimized SamplingCandidateItemsStrategy (
> > > http://pastebin.com/6n9C8Pw1):
> > > http://static.inky.ws/image/934/image.jpg
> > > - Without the optimized SamplingCandidateItemsStrategy:
> > > http://static.inky.ws/image/935/image.jpg
> > >
> >
> > The big hot spot is the function FastIDSet.find():
> >
> > Optimized: 13,759 s
> > Unoptimized: 246,487 s
> >
> > So you see that your optimization already made this hot spot roughly
> > 18 times faster.
> >
> > Did you play around with the CachingItemSimilarity cache sizes?
> >
> > /Manuel
> >
> > --
> > Manuel Blechschmidt
> > Dortustr. 57
> > 14467 Potsdam
> > Mobil: 0173/6322621
> > Twitter: http://twitter.com/Manuel_B
> >
> >
>
