mahout-user mailing list archives

From Sean Owen
Subject Interesting MapReduce variant: MapFreeduce
Date Sun, 15 May 2011 17:09:31 GMT
Hi all, in my travels I've come across a small interesting startup that I
thought might be of interest to the user@ audience. It's MapFreeduce, and they're spinning an interesting twist on
MapReduce. They've constructed a simplified MapReduce API, one for which
workers are able to run as Java applets in the browser sandbox.

It's interesting for two reasons, I can tell you, after playing with it
myself. One, it asks whether a simpler version of MapReduce than what you
get in Hadoop is viable. That is -- it's not Hadoop. Can you do something
interesting without, say, direct access to HDFS? Combiners? Custom
InputFormats? And two, since it can fairly automatically turn office PCs
with a browser into safe background MR workers, it might let an
organizational skunk-works create a cluster on the cheap, out of truly
unused cycles, to do something interesting.
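
To make the "simpler MapReduce" question concrete, here is a minimal sketch of what a stripped-down model might look like: a map function, an in-memory shuffle, and a reduce function, with no HDFS, combiners, or custom InputFormats. This is not MapFreeduce's actual API, which isn't shown here. The names `run_mapreduce`, `count_prefs_map`, and `sum_reduce` are hypothetical, and the sample data (user,item,preference lines, counted per item as an early recommender-style step) is made up for illustration.

```python
from collections import defaultdict


def run_mapreduce(records, map_fn, reduce_fn):
    """Single-process map -> shuffle -> reduce pass (hypothetical helper).

    No combiners, no custom input formats, no distributed file system:
    input is any iterable of records and output is a plain dict.
    """
    groups = defaultdict(list)
    # Map phase plus shuffle: group every emitted value under its key.
    for record in records:
        for key, value in map_fn(record):
            groups[key].append(value)
    # Reduce phase: one call per key, in sorted key order.
    return {key: reduce_fn(key, values) for key, values in sorted(groups.items())}


def count_prefs_map(line):
    """Emit (item_id, 1) for each 'user,item,preference' line."""
    _user_id, item_id, _pref = line.split(",")
    yield item_id, 1


def sum_reduce(_key, values):
    """Sum the counts collected for one key."""
    return sum(values)


# Made-up preference data, for illustration only.
prefs = ["1,101,5.0", "1,102,3.0", "2,101,2.0", "3,103,4.5", "3,101,4.0"]
item_counts = run_mapreduce(prefs, count_prefs_map, sum_reduce)
print(item_counts)
```

Anything that fits this shape (one map, one reduce, keys and values small enough to ship to a worker) is the kind of job that could plausibly run under a restricted API; jobs that lean on combiners or custom input handling would need rework.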

I managed to reconstruct parts of the recommender pipeline on this framework
without too much modification. It is possible to 'port' some parts of Mahout
to this framework, if not all. MapReduce fans will probably enjoy taking a
look at what they can get away with in a browser sandbox.

From a conversation with their founder I know they'd really like feedback
and testers. Here's their pitch and plea for beta users in their own words.
(I have no affiliation with or interest in the company.)

*" is a Washington DC-based startup making Big Data
accessible to everyone. Our software service enables users to quickly and
easily build a mapreduce cluster from the spare CPU-cycles of available
computers without installing or configuring any software. To add a node to
your MapFreeduce cluster and increase its power, you simply click on a link
from any idle computer. You can scale your cluster to thousands of nodes to
perform computation- and data-intensive tasks such as web indexing, data
mining, business analytics, data warehousing, machine learning, financial
analysis, scientific simulation, and bioinformatics research. MapFreeduce
allows you to focus on crunching your data without having to worry about
either the cost and complexity of setting up a traditional hardware cluster
or the perpetual fees charged per hour and per node by common cloud

We are looking for individuals that would be interested in joining our free,
private beta test and/or providing feedback to our service."*
