mahout-user mailing list archives

From Sebastian Schelter <...@apache.org>
Subject Re: Fwd: Another mahout ALS question
Date Sun, 06 Mar 2011 21:02:14 GMT
Hi Danny,

thanks for the nice writeup! I'm a little disappointed by the
performance, though.

Seems you got around those memory problems from last week without my 
patch, which is good, since I unfortunately didn't have the time to 
finish that one yet.




On 05.03.2011 01:33, Danny Bickson wrote:
> Hi Sebastian,
> As promised, you can find some results from testing your ALS code on 64
> high performance Amazon EC2 machines (with up to 1,024 cores).
> http://bickson.blogspot.com/2011/03/tunning-hadoop-configuration-for-high.html
>
> I would love to get any feedback you or others may have about the setup
> of this experiment.
>
> Best,
>
> Danny Bickson
>
> On Wed, Feb 23, 2011 at 4:41 PM, Sebastian Schelter <ssc@apache.org>
> wrote:
>
>     Hi Danny,
>
>     please send all mails to user@mahout.apache.org instead of sending
>     them directly to me. There are a lot of smart people on that list
>     who might join in with advice.
>
>     I'm very excited that you are testing this code so intensively, and
>     I'm positively surprised to see it give good results. Thank you for
>     the effort you put into that!
>
>     The exception seems to occur when ALSEvaluator is run. The code uses
>     a quick and dirty approach to compute the error of the model: it
>     just loads the user and item feature matrices completely into
>     memory. As the number of features increases, memory consumption
>     becomes too large.
>
>     The code of that evaluator step needs to be changed so that each
>     (user,item) pair for which the rating is to be predicted gets joined
>     with the corresponding user and item feature vectors: they are
>     mapped to the same key and go to the same reducer, which can then
>     compute the error.
>
>     I already started implementing something like this, but I don't have
>     a lot of time these days unfortunately. I could update the patch
>     during the next week if that's ok for you.
>
>     --sebastian
>
>
>
>
>     On 23.02.2011 21:57, Danny Bickson wrote:
>
>         Another exception I am getting:
>
>         11/02/23 20:45:34 INFO common.AbstractJob: Command line arguments:
>         {--endPhase=2147483647, --itemFeatures=/tmp/als/out/M/,
>         --probes=/user/ubuntu/myout/probeSet/, --startPhase=0,
>         --tempDir=temp, --userFeatures=/tmp/als/out/U/}
>         Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>                 at org.apache.mahout.math.map.OpenIntDoubleHashMap.rehash(OpenIntDoubleHashMap.java:433)
>                 at org.apache.mahout.math.map.OpenIntDoubleHashMap.put(OpenIntDoubleHashMap.java:387)
>                 at org.apache.mahout.math.RandomAccessSparseVector.setQuick(RandomAccessSparseVector.java:134)
>                 at org.apache.mahout.math.VectorWritable.readFields(VectorWritable.java:113)
>                 at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:1751)
>                 at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1879)
>                 at org.apache.mahout.utils.eval.ALSEvaluator.readMatrix(ALSEvaluator.java:113)
>                 at org.apache.mahout.utils.eval.ALSEvaluator.run(ALSEvaluator.java:71)
>                 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>                 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>                 at org.apache.mahout.utils.eval.ALSEvaluator.main(ALSEvaluator.java:52)
>                 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>                 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>                 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>                 at java.lang.reflect.Method.invoke(Method.java:616)
>                 at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>                 at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>                 at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:174)
>                 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>                 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>                 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>                 at java.lang.reflect.Method.invoke(Method.java:616)
>                 at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>
>         THANKS!
>         ---------- Forwarded message ----------
>         From: Danny Bickson <danny.bickson@gmail.com>
>         Date: Wed, Feb 23, 2011 at 3:05 PM
>         Subject: Another mahout ALS question
>         To: ssc@apache.org
>
>
>         Hi!
>         I successfully ran 10 iterations of your ALS code with D=20 and
>         lambda=0.065, and I get a very impressive RMSE of 0.93.
>         However, when I try to increase D, I get various out of memory
>         errors, even with a small Netflix subsample of 3M values.
>
>         One of the errors I am getting is in the evaluateALS step:
>         11/02/23 19:04:11 WARN driver.MahoutDriver: No evaluateALS.props
>         found on classpath, will use command-line arguments only
>         11/02/23 19:04:12 INFO common.AbstractJob: Command line arguments:
>         {--endPhase=2147483647, --itemFeatures=/tmp/als/out/M/,
>         --probes=/user/ubuntu/myout/probeSet/, --startPhase=0,
>         --tempDir=temp, --userFeatures=/tmp/als/out/U/}
>         Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
>                 at org.apache.mahout.math.map.OpenIntDoubleHashMap.rehash(OpenIntDoubleHashMap.java:433)
>                 at org.apache.mahout.math.map.OpenIntDoubleHashMap.put(OpenIntDoubleHashMap.java:387)
>                 at org.apache.mahout.math.RandomAccessSparseVector.setQuick(RandomAccessSparseVector.java:134)
>                 at org.apache.mahout.math.VectorWritable.readFields(VectorWritable.java:113)
>                 at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:1751)
>                 at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1879)
>                 at org.apache.mahout.utils.eval.ALSEvaluator.readMatrix(ALSEvaluator.java:113)
>                 at org.apache.mahout.utils.eval.ALSEvaluator.run(ALSEvaluator.java:71)
>                 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>                 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>                 at org.apache.mahout.utils.eval.ALSEvaluator.main(ALSEvaluator.java:52)
>                 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>                 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>                 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>                 at java.lang.reflect.Method.invoke(Method.java:616)
>                 at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>                 at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>                 at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:174)
>                 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>                 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>                 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>                 at java.lang.reflect.Method.invoke(Method.java:616)
>                 at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>
>
>         There is no related exception in the Hadoop logs.
>
>         I am running with java child opts of -Xmx2048M.
>
>         Do you have any tips for me? Do you want me to post this into the
>         Mahout-542 newsgroup?
>
>         thanks,
>
>
>         DB
>
>
>
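
The evaluator change described in the quoted thread (joining each probe
(user,item) pair with its user and item feature vectors by key, so that a
reducer only ever sees one key's data) can be sketched in plain Java. This
is only an illustration under stated assumptions, not the code from
Sebastian's patch: the class JoinedRmseSketch, its methods, and the toy
data are hypothetical, the Hadoop shuffle is simulated with in-memory
maps, and the two-pass join (first on user ID, then on item ID) is one
assumed way to realise the idea. It also assumes every probe has a
matching user and item vector.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class JoinedRmseSketch {

    /** One held-out rating; after the first join it also carries the user's vector. */
    static final class Probe {
        final int user;
        final int item;
        final double rating;
        final double[] userVector; // null until the user join has run

        Probe(int user, int item, double rating, double[] userVector) {
            this.user = user;
            this.item = item;
            this.rating = rating;
            this.userVector = userVector;
        }
    }

    /** Stand-in for the shuffle: group probes by the chosen join key. */
    static Map<Integer, List<Probe>> shuffle(List<Probe> probes, boolean byUser) {
        Map<Integer, List<Probe>> groups = new HashMap<>();
        for (Probe p : probes) {
            groups.computeIfAbsent(byUser ? p.user : p.item, k -> new ArrayList<>()).add(p);
        }
        return groups;
    }

    /** Pass 1 "reducer": handles one user's vector together with that user's probes. */
    static List<Probe> attachUserVectors(List<Probe> probes, Map<Integer, double[]> userFeatures) {
        List<Probe> joined = new ArrayList<>();
        for (Map.Entry<Integer, List<Probe>> group : shuffle(probes, true).entrySet()) {
            double[] u = userFeatures.get(group.getKey());
            for (Probe p : group.getValue()) {
                joined.add(new Probe(p.user, p.item, p.rating, u));
            }
        }
        return joined;
    }

    /** Pass 2 "reducer": handles one item's vector and accumulates squared errors. */
    public static double rmse(List<Probe> probes,
                              Map<Integer, double[]> userFeatures,
                              Map<Integer, double[]> itemFeatures) {
        double sumSquaredError = 0;
        long count = 0;
        List<Probe> joined = attachUserVectors(probes, userFeatures);
        for (Map.Entry<Integer, List<Probe>> group : shuffle(joined, false).entrySet()) {
            double[] m = itemFeatures.get(group.getKey());
            for (Probe p : group.getValue()) {
                double predicted = 0;
                for (int i = 0; i < m.length; i++) {
                    predicted += p.userVector[i] * m[i];
                }
                double error = p.rating - predicted;
                sumSquaredError += error * error;
                count++;
            }
        }
        return Math.sqrt(sumSquaredError / count);
    }

    /** Toy data with D=2: prediction errors are 1, 0 and 1, so the RMSE is sqrt(2/3). */
    public static double demoRmse() {
        Map<Integer, double[]> users = new HashMap<>();
        users.put(1, new double[] {1.0, 2.0});
        users.put(2, new double[] {0.5, 1.0});
        Map<Integer, double[]> items = new HashMap<>();
        items.put(10, new double[] {2.0, 1.0});
        items.put(20, new double[] {1.0, 1.0});
        List<Probe> probes = new ArrayList<>();
        probes.add(new Probe(1, 10, 5.0, null));
        probes.add(new Probe(2, 20, 1.5, null));
        probes.add(new Probe(1, 20, 4.0, null));
        return rmse(probes, users, items);
    }

    public static void main(String[] args) {
        System.out.println("RMSE = " + demoRmse());
    }
}
```

In a real MapReduce job the two maps of feature vectors would be streamed
in as records tagged with the same key as the probes, so memory use per
reducer depends on one vector plus the probes for that key rather than on
the full user and item matrices.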

