mahout-user mailing list archives

From: Andrew Musselman
Subject: Re: PFPGrowth fList maximum available capacity
Date: Thu, 18 Apr 2019 23:29:26 GMT
Hi Kyoung, I took a look at that package; I've never used it but there is
an example which uses it:

$ pwd
$ ls

You may want to try that and see whether its input data looks anything like
what you're planning to use, in terms of list sizes. Otherwise, you could
try running it and see what happens.
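
For a rough sense of scale before running anything, the sketch below
estimates how much heap a list with hundreds of millions of entries might
need. It is only a back-of-envelope calculation, not Mahout code: the class
name `FListEstimate`, the per-object byte sizes, the 300 million entry count,
and the 8-byte average feature length are all assumptions for illustration,
and it assumes each fList entry is a (feature string, count) pair. A list
like this lives on the heap, so the more likely failure is running out of
heap memory rather than a stack overflow.

```java
class FListEstimate {
    // All byte sizes below are rough assumptions for a 64-bit JVM with
    // compressed references; they are not measured from Mahout.
    static final long REF_BYTES = 4;        // one slot in the list's backing array
    static final long PAIR_BYTES = 24;      // hypothetical (feature, count) holder object
    static final long LONG_BYTES = 24;      // boxed java.lang.Long holding the count
    static final long STRING_OVERHEAD = 40; // String object plus its array header

    /** Rough heap estimate, in bytes, for `entries` (feature, count) pairs. */
    static long estimateBytes(long entries, int avgFeatureBytes) {
        long perEntry = REF_BYTES + PAIR_BYTES + LONG_BYTES
                + STRING_OVERHEAD + avgFeatureBytes;
        return entries * perEntry;
    }

    /** An ArrayList is backed by one array, so it cannot exceed Integer.MAX_VALUE elements. */
    static boolean fitsInArrayList(long entries) {
        return entries <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        long entries = 300_000_000L; // "hundreds of millions", picked for illustration
        int avgFeatureBytes = 8;     // assumed average size of a feature string's characters

        double gib = estimateBytes(entries, avgFeatureBytes) / (1024.0 * 1024 * 1024);
        System.out.printf("entries=%d fitsInArrayList=%b approxHeap=%.1f GiB%n",
                entries, fitsInArrayList(entries), gib);
    }
}
```

If the estimate is well beyond the heap you can give the JVM with -Xmx, the
list is unlikely to fit no matter how it is declared.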

Please let us know if you run into issues. It would be great to move that
package off map-reduce and into the newer framework; if you wanted to take
a look at that I'm sure we could give you some pointers along the way.


On Wed, Apr 17, 2019 at 5:29 AM Kyoung Deok Kwon wrote:

> Hello, Mahout members!
> First of all, please understand that I am not good at English.
> I am going to use Mahout PFPGrowth for my project.
> As I understand it, parallel counting runs through Hadoop MapReduce in
> `startParallelCounting()`, and then grouping starts in `readFList()`.
> In `readFList()`, fList is declared as Lists.newArrayList().
> I expect fList to contain hundreds of millions of entries. Can it be used
> at that size?
> I expect a stack overflow. Is it not designed for the case where fList is
> large?
> Is my understanding correct?
> Thanks in advance.
