mahout-user mailing list archives

From Stanley Xu <wenhao...@gmail.com>
Subject Re: how to get input in parallel FPGrowth
Date Fri, 06 May 2011 03:16:09 GMT
The PFP in Mahout accepts a text input format; you can specify the splitter
used to separate the columns. For other data sources, the easiest way is to
convert the data to a text format, separate the columns with a tab ('\t'), and
put it into HDFS as the PFP input.
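To make that concrete, below is a minimal sketch (not from the original thread) of
what such a conversion could look like in Java: reading transactions from a
hypothetical MySQL table and writing them to HDFS as tab-separated lines, one
transaction per line. The JDBC URL, table and column names, and the HDFS output
path are all placeholders, not anything Mahout prescribes.

import java.io.BufferedWriter;
import java.io.OutputStreamWriter;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/**
 * Sketch: export transactions from a MySQL table into a tab-separated
 * text file on HDFS so it can be used as input to parallel FPGrowth.
 * Table/column names, credentials, and paths are illustrative only.
 */
public class ExportTransactionsToHdfs {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path out = new Path("/user/mahout/pfp-input/transactions.txt");

    try (Connection db = DriverManager.getConnection(
             "jdbc:mysql://localhost/shop", "user", "password");
         Statement st = db.createStatement();
         // One row per transaction; items stored comma-separated here.
         ResultSet rs = st.executeQuery(
             "SELECT txn_id, items FROM transactions");
         BufferedWriter w = new BufferedWriter(
             new OutputStreamWriter(fs.create(out, true)))) {

      while (rs.next()) {
        // PFP reads one transaction per line; separate the items with '\t'
        // so the default/tab splitter pattern can pick them apart.
        String[] items = rs.getString("items").split(",");
        w.write(String.join("\t", items));
        w.newLine();
      }
    }
  }
}

The resulting directory on HDFS can then be passed to the PFP job as its input,
with the splitter configured to match the tab separator.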


On Fri, May 6, 2011 at 9:35 AM, hustnn <nzjemail@gmail.com> wrote:

> I saw a topic of yours about converting data in databases (flat files,
> XML dumps, MySQL, Cassandra, different formats on HDFS, HBase) into an
> intermediate form (say, a vector).
>
> I know parallel FPGrowth can use Hadoop to distribute the computation
> across different TaskTrackers easily with MapReduce, but I want to know
> how parallel FPGrowth works with other data stores such as MySQL,
> Cassandra, and HBase. How does it get its input, and how does it
> distribute the computation so that it runs in parallel?
>
> Thanks.
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/how-to-get-input-in-parallel-FPGrowth-tp2906536p2906536.html
> Sent from the Mahout Developer List mailing list archive at Nabble.com.
>
