mahout-user mailing list archives

From Arshad Khan <khan.m.ars...@gmail.com>
Subject Re: Empty cluster exception
Date Wed, 07 Apr 2010 09:02:18 GMT
Let me do some more investigation. It could be an issue with my code.

On Wed, Apr 7, 2010 at 9:44 AM, Robin Anil <robin.anil@gmail.com> wrote:

> Could you upload the dataset (if it's small) somewhere? I will take a look at
> it.
>
> Robin
>
> On Wed, Apr 7, 2010 at 7:11 AM, Arshad Khan <khan.m.arshad@gmail.com>
> wrote:
>
> > It seems that the empty cluster exception is caused by an earlier
> > exception, a FileAlreadyExistsException. The stack trace is as follows.
> > Although I am using the HadoopUtil.overwrite method to clean up the
> > output dir, the exception happens anyway.
> >
> >
> > org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory file:/informatics/data/scratch/TMA/work/C104614-2010-04-06-21-29-57-CDE00FB3-2C58-4C89-AAC4-8E79083D9D12/clusters/clusters-0 already exists
> >    at org.apache.hadoop.mapred.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:111)
> >    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:772)
> >    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
> >    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1249)
> >    at org.apache.mahout.clustering.kmeans.KMeansDriver.runIteration(KMeansDriver.java:270)
> >    at org.apache.mahout.clustering.kmeans.KMeansDriver.runJob(KMeansDriver.java:213)
> >
> > Again, this happens randomly.
> >
> > On Thu, Apr 1, 2010 at 9:28 AM, Arshad Khan <khan.m.arshad@gmail.com>
> > wrote:
> >
> > > The data being used for clustering comes from an index built on a
> > > bunch of PubMed abstracts. The index is passed through a TFDFMapper
> > > using the tf-idf weighting scheme, and a points file is generated
> > > using the LuceneIterable class. This file is the input to the
> > > KMeansDriver program. The code that does this is essentially the same
> > > as the code in the util.vectors.lucene.Driver class.
> > >
> > > Arshad
> > >
> > >
> > > On Thu, Apr 1, 2010 at 1:55 AM, Ted Dunning <ted.dunning@gmail.com> wrote:
> > >
> > >> Empty clusters are not that uncommon with k-means if you specify too
> > >> large a value for k.
> > >>
> > >> Arshad,  can you say more about what data you are clustering?
> > >>
> > >> On Wed, Mar 31, 2010 at 6:29 AM, Grant Ingersoll <gsingers@apache.org> wrote:
> > >>
> > >> > Can you share the parameters you used to get this?  Does it happen
> > >> > every time?
> > >> >
> > >> >
> > >> > On Mar 29, 2010, at 11:53 PM, Arshad Khan wrote:
> > >> >
> > >> > > Hello All
> > >> > >
> > >> > > While using Mahout 0.3 KMeansDriver I am encountering an exception
> > >> > > indicating an empty cluster. This happens sometimes while
> > >> > > re-running the clustering on the same data set. Is there a way to
> > >> > > prevent this error? The exception trace is as follows:
> > >> > >
> > >> > > java.lang.RuntimeException: Error in configuring object
> > >> > >        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
> > >> > >        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
> > >> > >        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
> > >> > >        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:354)
> > >> > >        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> > >> > >        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:176)
> > >> > > Caused by: java.lang.reflect.InvocationTargetException
> > >> > >        at sun.reflect.GeneratedMethodAccessor39.invoke(Unknown Source)
> > >> > >        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > >> > >        at java.lang.reflect.Method.invoke(Method.java:597)
> > >> > >        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
> > >> > >        ... 5 more
> > >> > > Caused by: java.lang.RuntimeException: Error in configuring object
> > >> > >        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
> > >> > >        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
> > >> > >        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
> > >> > >        at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
> > >> > >        ... 9 more
> > >> > > Caused by: java.lang.reflect.InvocationTargetException
> > >> > >        at sun.reflect.GeneratedMethodAccessor39.invoke(Unknown Source)
> > >> > >        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > >> > >        at java.lang.reflect.Method.invoke(Method.java:597)
> > >> > >        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
> > >> > >        ... 12 more
> > >> > > Caused by: java.lang.IllegalStateException: Cluster is empty!
> > >> > >        at org.apache.mahout.clustering.kmeans.KMeansClusterMapper.configure(KMeansClusterMapper.java:73)
> > >> > >        ... 16 more
> > >> > >
> > >> > > Thanks
> > >> > > Arshad
> > >> >
> > >> > --------------------------
> > >> > Grant Ingersoll
> > >> > http://www.lucidimagination.com/
> > >> >
> > >> > Search the Lucene ecosystem using Solr/Lucene:
> > >> > http://www.lucidimagination.com/search
> > >> >
> > >> >
> > >>
> > >
> > >
> >
>
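
For reference, the FileAlreadyExistsException in the thread is raised by FileOutputFormat.checkOutputSpecs when a job's output directory is already present at submission time. Note that the path in the trace is a subdirectory (clusters/clusters-0), so whatever gets cleared before a re-run has to cover that path too. Below is a minimal sketch of a "clear, then verify" step. It is only an assumption-laden illustration: it uses plain java.nio.file on the local file system (the trace shows a file: path) rather than the Hadoop FileSystem API or Mahout's HadoopUtil, and the class and method names (OutputDirs, deleteRecursively, prepareOutput) are hypothetical.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical helper, not part of Hadoop or Mahout.
final class OutputDirs {
    private OutputDirs() {}

    /** Deletes dir and everything below it; does nothing if dir is absent. */
    static void deleteRecursively(Path dir) throws IOException {
        if (Files.notExists(dir)) {
            return;
        }
        Files.walkFileTree(dir, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                Files.delete(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path d, IOException exc)
                    throws IOException {
                if (exc != null) {
                    throw exc;
                }
                // Children are gone by now, so the directory itself can go.
                Files.delete(d);
                return FileVisitResult.CONTINUE;
            }
        });
    }

    /** Clears outputDir and fails loudly if it is still present afterwards. */
    static void prepareOutput(Path outputDir) throws IOException {
        deleteRecursively(outputDir);
        if (Files.exists(outputDir)) {
            throw new IOException("Output directory still exists: " + outputDir);
        }
    }
}
```

Calling something like OutputDirs.prepareOutput(...) on the exact directory named in the exception just before submitting the job would show whether the cleanup really removed it. If the check passes and the job still fails, something else (for example a concurrent run sharing the same work directory) may be recreating the path.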
