mahout-user mailing list archives

From Delroy Cameron
Subject Dirichlet ClusterDump Output
Date Tue, 04 May 2010 23:54:39 GMT

So I've run Dirichlet clustering using Mahout and I'm trying to view the
cluster dump. I'm using a combination of ClusterDumper, DirichletOutputState,
DirichletCluster and TestL1ModelClustering to help with the output.

So far I've successfully read each file in each state-x output folder. The
vectors appear to be serialized as <Text, DirichletCluster> pairs in each
binary dump, which is fine. The issue is that, after debugging, the model for
each DirichletCluster turns out to be null... and this makes sense, since I'm
reading from the dump file as:

SequenceFile.Reader  reader = new SequenceFile.Reader(fileSystem, inputPath,
Text key = (Text) reader.getKeyClass().newInstance();
DirichletCluster cluster = (DirichletCluster)
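
For illustration, here is a minimal stdlib-only sketch of the general Writable pattern the snippet above relies on: a freshly constructed value object has empty fields until something calls readFields on it with a correctly positioned stream. In Hadoop that is normally done by SequenceFile.Reader.next(key, value). ToyCluster, its fields, and the helper methods are hypothetical stand-ins, not Mahout's DirichletCluster or Hadoop's Writable interface.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

public class WritableSketch {

    // Hypothetical stand-in for a Writable value such as DirichletCluster.
    static class ToyCluster {
        String model;       // stays null until readFields() runs
        double totalCount;

        ToyCluster() {}     // no-arg constructor, as newInstance() would use

        ToyCluster(String model, double totalCount) {
            this.model = model;
            this.totalCount = totalCount;
        }

        // Mirrors the shape of Writable.write(DataOutput).
        void write(DataOutput out) throws IOException {
            out.writeUTF(model);
            out.writeDouble(totalCount);
        }

        // Mirrors the shape of Writable.readFields(DataInput).
        void readFields(DataInput in) throws IOException {
            model = in.readUTF();
            totalCount = in.readDouble();
        }
    }

    static byte[] serialize(ToyCluster c) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        c.write(new DataOutputStream(bytes));
        return bytes.toByteArray();
    }

    static ToyCluster deserialize(byte[] data) throws IOException {
        ToyCluster c = new ToyCluster();   // empty instance, model == null
        c.readFields(new DataInputStream(new ByteArrayInputStream(data)));
        return c;                          // fields are populated only now
    }

    public static void main(String[] args) throws IOException {
        ToyCluster empty = new ToyCluster();
        System.out.println("before read: model=" + empty.model);

        ToyCluster filled = deserialize(serialize(new ToyCluster("NormalModel", 3.0)));
        System.out.println("after read: model=" + filled.model
                + ", count=" + filled.totalCount);
    }
}
```

If the real reader code instantiates the key and value but never passes them to reader.next(key, cluster), a null model would be the expected result. Whether that is what happens here cannot be told from the truncated snippet.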

I tried to set the fields for the DirichletCluster by using the following
method, readFields(DataInput in):
DataInput istream = new DataInputStream(new FileInputStream(new

and I get a NullPointerException...

Can I have a few suggestions on how to proceed here?
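
One possible factor, sketched below under stated assumptions: a SequenceFile is a container that begins with a header (the magic bytes "SEQ" and a version byte, among other things), so a DataInputStream opened directly on the file does not start at a record. The toy format here (a 4-byte header followed by one UTF string) is hypothetical and far simpler than the real layout. It only shows why calling a record-level read at byte 0 misparses the data. It does not reproduce the exact NullPointerException reported above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;

public class RawStreamSketch {

    // Writes a toy container: a 4-byte header, then one UTF-encoded record.
    static byte[] toyContainer(String record) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.write(new byte[] {'S', 'E', 'Q', 6});  // toy magic + version byte
        out.writeUTF(record);
        return bytes.toByteArray();
    }

    // Wrong: treats byte 0 of the container as the start of a record.
    static String readRaw(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        return in.readUTF();  // interprets 'S','E' as a string length
    }

    // Right for this toy format: consume the header, then read the record.
    static String readFramed(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        in.readFully(new byte[4]);  // skip the header
        return in.readUTF();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = toyContainer("cluster-0");
        System.out.println("framed read: " + readFramed(data));
        try {
            readRaw(data);
        } catch (EOFException e) {
            System.out.println("raw read failed: header bytes were parsed as a record");
        }
    }
}
```

With real SequenceFiles the framing is handled by SequenceFile.Reader, so records are usually read through it rather than through a stream opened on the file directly.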
