mahout-user mailing list archives

From Stanley Xu <>
Subject Re: how to get input in parallel FPGrowth
Date Fri, 06 May 2011 03:16:09 GMT
Mahout's PFP (Parallel FP-Growth) accepts text input, and you can specify a
splitter pattern to separate the columns. For other data sources, the easiest
approach is to export the data to a text format, separate the columns with a
tab ('\t'), and put the file into HDFS as the PFP input.
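
As a rough illustration of the export step described above, here is a minimal
Python sketch that reads (transaction id, item) rows from a database and writes
one transaction per line with tab-separated items. The `purchases` table, its
columns, the output file name, and the helper names are hypothetical, and
SQLite stands in for MySQL or another store only so the example runs on its
own; it is not taken from Mahout.

```python
import sqlite3
from collections import defaultdict


def export_transactions(conn, query, out_path):
    """Write one transaction per line with tab-separated items.

    `query` must yield (transaction_id, item) pairs.
    Returns the number of transactions written.
    """
    # Group items by transaction id, keeping the order the query returns.
    baskets = defaultdict(list)
    for txn_id, item in conn.execute(query):
        baskets[txn_id].append(str(item))

    with open(out_path, "w", encoding="utf-8") as out:
        for items in baskets.values():
            out.write("\t".join(items) + "\n")
    return len(baskets)


def demo_connection():
    """Build a tiny in-memory table (hypothetical schema) for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE purchases (txn_id INTEGER, item TEXT)")
    conn.executemany(
        "INSERT INTO purchases VALUES (?, ?)",
        [(1, "milk"), (1, "bread"), (2, "bread"), (2, "butter"), (2, "jam")],
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    n = export_transactions(
        conn,
        "SELECT txn_id, item FROM purchases ORDER BY txn_id, item",
        "transactions.tsv",
    )
    print(f"wrote {n} transactions")
```

The resulting file could then be copied into HDFS (for example with
`hadoop fs -put`) and passed to PFP with a tab as the splitter. The sketch
assumes that item values never contain a tab themselves.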

On Fri, May 6, 2011 at 9:35 AM, hustnn <> wrote:

> I saw your topic about "converting data in databases (flat files,
> XML dumps, MySQL, Cassandra, different formats on HDFS, HBase) into an
> intermediate form (say, vectors)".
> I know that parallel FPGrowth can use Hadoop to distribute computation
> across different TaskTrackers easily in a map-reduce way, but I want to know
> how parallel FPGrowth works with other databases such as MySQL, Cassandra,
> and HBase. How does it get its input, and how does it distribute the
> computation so that it runs in parallel?
> Thanks.
