mahout-user mailing list archives

From Reinis Vicups <>
Subject Mahout 1.0: is DRM too file-bound?
Date Thu, 09 Oct 2014 19:56:11 GMT

I am currently looking into the new (DRM) Mahout framework.

I find myself wondering why, on the one hand, a lot of thought, effort
and design complexity is being invested in abstracting engines, contexts
and algebraic operations,

but on the other hand, even the abstract interfaces are defined in such
a way that everything has to be read from or written to files (on HDFS).

I am considering implementing reading from and writing to a NoSQL
database. Initially I assumed it would be enough to implement my own
ReaderWriter, but I am now realizing that I will have to re-implement,
or hack around by deriving my own versions of, large(?) portions of the
framework, including my own variants of CheckpointedDrm,
DistributedEngine and whatnot.
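
To make the idea concrete, the kind of storage-agnostic seam this would
need can be sketched roughly as below. This is only an illustration in
plain Java: RowStore, InMemoryRowStore and RowScaler are hypothetical
names, none of them exist in Mahout, and the in-memory map merely stands
in for a NoSQL table. Nothing here reflects how ReaderWriter,
CheckpointedDrm or DistributedEngine are actually defined.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these types are part of Mahout.

/** Storage-agnostic source/sink for row-keyed matrix data. */
interface RowStore {
    /** Reads all rows stored under the given location (path, table name, ...). */
    Map<Integer, double[]> readRows(String location);

    /** Writes all rows under the given location, replacing earlier content. */
    void writeRows(String location, Map<Integer, double[]> rows);
}

/** Stand-in for a NoSQL-backed store; keeps "tables" in memory. */
final class InMemoryRowStore implements RowStore {
    private final Map<String, Map<Integer, double[]>> tables = new HashMap<>();

    @Override
    public Map<Integer, double[]> readRows(String location) {
        Map<Integer, double[]> rows = tables.get(location);
        if (rows == null) {
            throw new IllegalArgumentException("no such table: " + location);
        }
        return deepCopy(rows);
    }

    @Override
    public void writeRows(String location, Map<Integer, double[]> rows) {
        tables.put(location, deepCopy(rows));
    }

    // Copies keys and row arrays so callers cannot mutate stored data.
    private static Map<Integer, double[]> deepCopy(Map<Integer, double[]> rows) {
        Map<Integer, double[]> copy = new HashMap<>();
        for (Map.Entry<Integer, double[]> e : rows.entrySet()) {
            copy.put(e.getKey(), e.getValue().clone());
        }
        return copy;
    }
}

/** "Algorithm" code that depends only on RowStore, not on files. */
final class RowScaler {
    static void scale(RowStore store, String in, String out, double factor) {
        Map<Integer, double[]> rows = store.readRows(in);
        for (double[] row : rows.values()) {
            for (int i = 0; i < row.length; i++) {
                row[i] *= factor;
            }
        }
        store.writeRows(out, rows);
    }
}
```

With a seam like this, a file-backed store and a database-backed store
would be interchangeable from the algorithm's point of view; whether the
real framework can be cut along such a line is exactly the open question.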

Is this because abstracting away the storage type would introduce even
more complexity, or because there are aspects of the design that
absolutely require reading from and writing to (seq)files only?

kind regards
