mahout-user mailing list archives

From Andrew Musselman <andrew.mussel...@gmail.com>
Subject Re: Mahout Collaborative Filtering using a parallel matrix factorization
Date Sun, 27 Oct 2013 21:58:58 GMT
No empty fields or extra empty lines, no extra whitespace?
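
If it helps, here is a rough sketch of the kind of check I mean. It assumes each input line is "userID,itemID,value" separated by a comma or tab; that layout, the function name find_bad_lines, and the sample rows are my own assumptions for illustration, not anything taken from your job or from Mahout itself.

```python
import re

# Assumption: each line is "userID,itemID,value", separated by a comma or tab.
DELIMITER = re.compile(r"[\t,]")


def find_bad_lines(lines):
    """Return (line_number, reason) pairs for lines that look malformed."""
    problems = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if not line.strip():
            problems.append((number, "empty line"))
            continue
        if line != line.strip():
            problems.append((number, "leading or trailing whitespace"))
            continue
        fields = DELIMITER.split(line)
        if len(fields) != 3:
            problems.append((number, "expected 3 fields, got %d" % len(fields)))
        elif any(f == "" or f != f.strip() for f in fields):
            problems.append((number, "empty or padded field"))
    return problems


# Hypothetical sample rows, made up for illustration only.
sample = ["1,10,3.0\n", "2,,1.0\n", "\n", "3,11,2.0 \n", "4, 12,1.0\n", "5\t13\t4.0\n"]
problems = find_bad_lines(sample)
for number, reason in problems:
    print(number, reason)
```

On the real data you could pass an open file handle instead of the sample list, since the function only iterates over lines. Whether any of these cases is what actually triggers the NullPointerException is a guess on my part.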

> On Oct 27, 2013, at 2:39 PM, "barnold4238@gmail.com" <barnold4238@gmail.com> wrote:
> 
> Yes, I have, and the job actually succeeded. But in looking through my data, I still
> can't find anything that looks like it would cause the issue. Could the map-side joins
> and the size of my data be causing the problem? Is there a Hadoop setting I need to
> ensure is set to a particular level so the job can succeed?
> 
> Thanks!
> Brian
>> On Oct 27, 2013, at 3:53 PM, Sebastian Schelter <ssc.open@googlemail.com> wrote:
>> 
>> Hi Brian,
>> 
>> That error looks strange, could you try to run it on a toy dataset and
>> see if you get the same error?
>> 
>> --sebastian
>> 
>>> On 25.10.2013 22:29, Brian Arnold wrote:
>>> Hey Everyone,
>>> 
>>> I was hoping that someone could help me out with the
>>> ParallelALSFactorizationJob that I am trying to run.  I have been trying to
>>> run this over a 27GB dataset of customer transaction data, and the job
>>> keeps failing with a null pointer exception. I am running with Mahout 0.8
>>> and the following parameters --lambda 0.05 --implicitFeedback true
>>> --numFeatures 20 --numIterations 1 --tempDir temp/mahout_als
>>> --numThreadsPerSolver 1.
>>> ParallelALSFactorizationJob-ItemRatingVectorsMapper-Reducer completes fine,
>>> ParallelALSFactorizationJob-TransposeMapper-Reducer completes fine,
>>> ParallelALSFactorizationJob-AverageRatingMapper-Reducer completes fine, but
>>> it fails on Recompute U, iteration (2/1), (1 threads, 5 features, implicit
>>> feedback).
>>> 
>>> 
>>> Here is the stacktrace I am receiving:
>>> 
>>> java.lang.RuntimeException: java.lang.NullPointerException
>>>   at org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper.run(MultithreadedMapper.java:149)
>>>   at org.apache.mahout.cf.taste.hadoop.als.MultithreadedSharingMapper.run(MultithreadedSharingMapper.java:53)
>>>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
>>>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:363)
>>>   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>>>   at java.security.AccessController.doPrivileged(Native Method)
>>>   at javax.security.auth.Subject.doAs(Subject.java:396)
>>>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
>>>   at org.apache.hadoop.mapred.Child.main(Child.java:249)
>>> Caused by: java.lang.NullPointerException
>>>   at org.apache.mahout.math.als.ImplicitFeedbackAlternatingLeastSquaresSolver.getYtransponseCuMinusIYPlusLambdaI(ImplicitFeedbackAlternatingLeastSquaresSolver.java:95)
>>>   at org.apache.mahout.math.als.ImplicitFeedbackAlternatingLeastSquaresSolver.solve(ImplicitFeedbackAlternatingLeastSquaresSolver.java:51)
>>>   at org.apache.mahout.cf.taste.hadoop.als.SolveImplicitFeedbackMapper.map(SolveImplicitFeedbackMapper.java:54)
>>>   at org.apache.mahout.cf.taste.hadoop.als.SolveImplicitFeedbackMapper.map(SolveImplicitFeedbackMapper.java:29)
>>>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>>>   at org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper$MapRunner.run(MultithreadedMapper.java:268)
>>> 
>>> 
>>> Thanks so much!
>>> 
>>> Brian
