mahout-user mailing list archives

From Dmitriy Lyubimov <>
Subject Re: How to SSVD output to generate Clusters
Date Wed, 31 Jul 2013 17:44:34 GMT
Many people also use the PCA workflow with SSVD and then try to cluster
the output U*Sigma, which is a dimensionally reduced representation of the
original row-wise dataset. To enable PCA and U*Sigma output, use

ssvd -pca true -us true -u false -v false -k=... -q=1 ...

-q=1 is recommended for accuracy.
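
For intuition only, here is a small NumPy sketch of what the U*Sigma output
represents and why it can be fed to a clusterer directly. It is not Mahout
code: the function names pca_u_sigma and kmeans, the toy data, and the choice
of k=2 are all assumptions made for illustration, and it uses an exact SVD
where Mahout's SSVD uses a randomized approximation.

```python
import numpy as np


def pca_u_sigma(a, k):
    """Return the first k columns of U*Sigma for the mean-centered rows of a."""
    centered = a - a.mean(axis=0)  # PCA step: subtract the column means
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :k] * s[:k]  # scale each column of U by its singular value


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per row."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iterations):
        # Squared Euclidean distance from every point to every center.
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members):  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return labels


# Toy "documents": two well-separated groups of rows in 6 dimensions.
rng = np.random.default_rng(42)
group_a = rng.normal(loc=0.0, scale=0.1, size=(5, 6))
group_b = rng.normal(loc=3.0, scale=0.1, size=(5, 6))
docs = np.vstack([group_a, group_b])

reduced = pca_u_sigma(docs, k=2)  # one reduced row per original row
labels = kmeans(reduced, k=2)
print(reduced.shape)  # (10, 2)
print(labels)
```

Each row of U*Sigma corresponds to one row of the original matrix, so the
cluster labels map straight back to the input documents.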

On Wed, Jul 31, 2013 at 5:09 AM, Stuti Awasthi <> wrote:

> Hi All,
> I want to group together documents that share the same context but belong
> to one single domain. I have tried the KMeans and LDA implementations
> provided in Mahout to perform the clustering, but the groups that are
> generated are not very good. Hence I thought of using LSA to identify the
> context related to each word and then performing the clustering.
> I am able to run Mahout's SSVD, which generated 3 files as output: Sigma,
> U and V.
> I am not sure how to feed the output of SSVD to the clustering algorithm
> so that we can generate clusters of the documents which might be talking
> about the same context.
> Any pointers on how I can achieve this?
> Regards
> Stuti Awasthi
