hadoop-common-issues mailing list archives

From "Gregory Farnum (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-6253) Add a Ceph FileSystem interface.
Date Fri, 11 Sep 2009 20:24:57 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-6253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Gregory Farnum updated HADOOP-6253:

    Status: Patch Available  (was: Open)

I've attached a patch which includes the CephFileSystem and IOStream classes, as well as package
documentation. To actually use it you're going to need an installation of Ceph (ceph.newdream.net).
I have *not* included any unit tests, as the code depends on the libhadoopceph shared library
and without a Ceph install it seems sort of pointless -- about all I can see to do is make
sure that calling the methods throws an IOException for being uninitialized. Still, most of
the other filesystems came up with something, so if you have any suggestions for useful test
cases let me know and I can add them. :)
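
As a rough illustration of the "throws an IOException for being uninitialized" check
mentioned above, here is a minimal sketch in plain Java. It does not use the real
CephFileSystem or libhadoopceph: StubFileSystem, its methods, and the throwsIOException
helper are hypothetical stand-ins, assumed only to show the shape such a test could take.

```java
import java.io.IOException;

public class UninitializedCheckSketch {

    /** Hypothetical stand-in for a filesystem client that must be initialized before use. */
    static class StubFileSystem {
        private boolean initialized = false;

        void initialize() {
            initialized = true;
        }

        // Every operation goes through this guard first.
        private void checkInitialized() throws IOException {
            if (!initialized) {
                throw new IOException("filesystem not initialized");
            }
        }

        boolean exists(String path) throws IOException {
            checkInitialized();
            return "/".equals(path); // the stub only knows about the root
        }

        boolean mkdirs(String path) throws IOException {
            checkInitialized();
            return true; // the stub pretends every mkdirs succeeds
        }
    }

    /** A filesystem call that may throw IOException. */
    interface FsCall {
        void run() throws IOException;
    }

    /** Returns true if the call throws IOException, false if it completes normally. */
    static boolean throwsIOException(FsCall call) {
        try {
            call.run();
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        StubFileSystem fs = new StubFileSystem();

        // Before initialize(), every method is expected to fail.
        boolean failsBefore = throwsIOException(() -> fs.exists("/"))
                && throwsIOException(() -> fs.mkdirs("/tmp"));

        fs.initialize();

        // After initialize(), the same calls should complete normally.
        boolean failsAfter = throwsIOException(() -> fs.exists("/"))
                || throwsIOException(() -> fs.mkdirs("/tmp"));

        System.out.println("fails before initialize: " + failsBefore);
        System.out.println("fails after initialize: " + failsAfter);
    }
}
```

A test against the actual classes would presumably replace the stub with the patched
filesystem class and use the project's unit test framework; that wiring is not shown here.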

In very basic testing (~900MB and ~6GB worth of data), this patch with the current Ceph code
is roughly equivalent in speed to HDFS when running a MapReduce job from the hadoop-examples
jar from 0.20, using the default values for both systems; Ceph tends to be slightly faster in
a put and slightly slower in the MapReduce run (~3:35 versus ~3:20 on the 6GB test case).
However, Ceph, while still highly experimental and in development, is a full filesystem with
a Linux kernel client and a full userspace client; it also distinguishes itself from HDFS by
having no single point of failure -- it uses a Paxos-based monitor cluster for managing state
and multiple metadata servers instead of the single HDFS namenode (though of course you can
also run the entire system on one machine).

> Add a Ceph FileSystem interface.
> --------------------------------
>                 Key: HADOOP-6253
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6253
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs
>            Reporter: Gregory Farnum
>            Priority: Minor
> The experimental distributed filesystem Ceph does not have a single point of failure,
> and might be of use to some Hadoop users.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
