hadoop-mapreduce-dev mailing list archives

From Arun Murthy <...@hortonworks.com>
Subject Re: MapReduce and MPI
Date Wed, 21 Dec 2011 07:08:10 GMT
Sounds great! Thanks for the update Ralph!

Sent from my iPhone

On Dec 20, 2011, at 10:22 PM, Ralph Castain <rhc@open-mpi.org> wrote:

> Just a quick update on this notion. Several of us in the OMPI community got together
> and successfully integrated Java bindings into the OMPI code base, and we have enough support
> that we can probably get this approved within that organization. I've written a wrapper compiler
> and added support within mpirun to make it relatively easy to use, so what remains is documentation
> (hope to have an initial cut at that done on Wed) and extending coverage to all MPI functions
> (we have send/recv and a number of other basic things done, but still need collectives and
> MPI-2 dynamics). The latter will be a work-in-progress (there are a LOT of MPI functions),
> with the more common functions covered over the next few weeks.
> We also need test codes, of course, and could use help with generating those plus actual
> testing. Volunteers are welcome. There are several Fortran and C test suites out there that
> are rather extensive - having some subset of those in Java would be a major step forward.
> I can point you to the branch where this work is being done (it is public, with controlled
> write privileges) and provide example tests on request.
> As for the 3.0 standard, that is indeed out-of-reach. The MPI Forum requires 9 months
> lead time for approval of any new proposal, and the 3.0 approval meeting is in Jan. However,
> this is a continuous process with revisions being released on a quarterly basis. So it isn't
> a "hit the date or die" issue - it is strictly a question of persevering long enough to gain
> acceptance, and the pace of the process will largely be driven by the level of user interest.
> Ralph
> On Dec 1, 2011, at 3:15 PM, Ralph Castain wrote:
>> On Dec 1, 2011, at 2:47 PM, <Milind.Bhandarkar@emc.com> wrote:
>>> Ralph,
>>> At the MPI Forum meeting at SC11, Jeff mentioned that C++ bindings are
>>> going to be dropped from the standard,
>> Yes - reason being mostly that (a) very few applications use them, and (b) they have
>> proven to be more trouble than they are worth. We are constantly finding bugs due to conflicts
>> between MPI specifications and C++ compilers, and (quite frankly) the lack of experienced
>> C++ programmers in the MPI developer community is a serious problem. So keeping those bindings
>> alive is difficult.
>>> and that no other language bindings
>>> were proposed. Do you think there is enough time for Java bindings to make
>>> it into the 3.0 standard?
>> I don't know about the 3.0 standard - could happen, if I can do it fast enough and
>> the Forum accepts it for that release, or may have to follow in 3.1. The obstacle we have
>> to overcome re the Forum is that Java got a bad name in the early years of the binding attempts
>> due to performance issues and lack of attention to details. The performance "problem" largely
>> stemmed from the issue of binding processes to at least NUMA regions - the C implementations
>> were far faster - and the poor performance of Java in general during that time. The latter
>> has largely been resolved over the years, and the former is solvable with some work.
>> The detail issue reflected the problem of trying to create a single, non-sectarian
>> set of Java bindings that fit all MPI implementations. This meant that you could really only
>> cover 90% of MPI functionality - beyond that, you have to integrate tightly to the implementation.
>> The academics who did the original work didn't want to do so, and thus left functions out,
>> resulting in the MPI community "looking down" on the result.
>> All put together, the MPI community wound up not thinking much of the Java world.
>> As I said, things have changed, and I believe a high-quality implementation of Java bindings
>> can gain acceptance. Once we have it for one MPI, we can (due to the OMPI license) offer it
>> up to the other implementations with a fair degree of confidence they will adopt it.
>> As for the MPI Forum, what we need is a "champion" to propose adoption of the bindings
>> once implemented. If people want them (i.e., the user community is larger than C++, which
>> has a total of 3 identified applications), we can show the implementation is of quality, and
>> we have developers willing to support it, then we can get them adopted.
>> I've scoped the job and it looks doable with reasonable effort. One other person
>> on the list (Deepak Sharma) has offered to help, and Jeff has offered to provide advice as
>> he wrote the original OMPI bindings. Getting it thru the OMPI devel approval represents a
>> miniature MPI Forum process, but I think we can do it given Jeff's and my roles there.
>> HTH
>> Ralph
>>> - Milind
>>> On 12/1/11 3:31 AM, "Ralph Castain" <rhc@open-mpi.org> wrote:
>>>> Hi folks
>>>> I'm a lead developer on the Open MPI project, and recently joined the
>>>> Hadoop community to help with Hamster. A couple of people have asked me
>>>> about using MPI more generally inside MapReduce, and it does indeed seem
>>>> a good candidate to use that method of communication.
>>>> It seems to me, though, that a pre-requisite for moving that direction is
>>>> a good set of Java MPI bindings. I've checked with the MPI community, and
>>>> while there have been a couple of attempts at this, nothing really
>>>> "adopted" by the community has been done.
>>>> I have some spare time while I search for a new job, so I thought I might
>>>> tackle this over the next few weeks. However, I thought I would assess
>>>> the actual level of interest before investing the time - it isn't a
>>>> trivial thing to do, but doable if there is interest.
>>>> Thanks
>>>> Ralph
