hadoop-common-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-10388) Pure native hadoop client
Date Mon, 10 Mar 2014 18:27:50 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-10388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13926006#comment-13926006 ]

Colin Patrick McCabe commented on HADOOP-10388:

bq. Binglin said: I mention c++11 mostly for the new std libraries (thread, lock/condition,
random, unique_ptr/shared_ptr, regex), so we can avoid writing a lot of common utility code.
It's fine if we use boost instead and provide typedefs so that either c++11 or boost can be an option:
old compilers can use boost, and new compilers can avoid the boost dependency.

{{shared_ptr}} is in tr1.  Every compiler in use today should have it.  Random number generation
is pretty straightforward with {{rand_r}}, which is hardly a reason to pull in dependencies.  For
the rest of the stuff, we should just have thin wrappers around the POSIX or Windows functions, I think.
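
As a rough illustration of what "thin wrappers" could look like on the POSIX side, here is a minimal sketch in C++98. All class names ({{Mutex}}, {{ScopedLock}}, {{Random}}) are hypothetical and invented for this example; they do not come from any Hadoop code. A Windows build would need an equivalent built on the Windows primitives, which is not shown.

```cpp
#include <pthread.h>
#include <stdlib.h>

// Hypothetical wrapper names; an illustration only, not Hadoop code.
// Thin wrapper around a POSIX mutex. Error codes are ignored for brevity.
class Mutex {
public:
    Mutex() { pthread_mutex_init(&mu_, NULL); }
    ~Mutex() { pthread_mutex_destroy(&mu_); }
    void lock() { pthread_mutex_lock(&mu_); }
    void unlock() { pthread_mutex_unlock(&mu_); }
private:
    Mutex(const Mutex&);             // non-copyable, C++98 style
    Mutex& operator=(const Mutex&);
    pthread_mutex_t mu_;
};

// RAII guard: locks in the constructor, unlocks in the destructor.
class ScopedLock {
public:
    explicit ScopedLock(Mutex& m) : m_(m) { m_.lock(); }
    ~ScopedLock() { m_.unlock(); }
private:
    ScopedLock(const ScopedLock&);
    ScopedLock& operator=(const ScopedLock&);
    Mutex& m_;
};

// Random numbers via rand_r. The state lives in the object, so the
// library never touches the process-wide rand()/srand() seed.
class Random {
public:
    explicit Random(unsigned int seed) : state_(seed) {}
    // Returns a value in [0, bound). bound must be > 0.
    // Modulo bias is ignored in this sketch.
    int next(int bound) { return rand_r(&state_) % bound; }
private:
    unsigned int state_;
};
```

Keeping the random state per object rather than global fits the concern about running as a library inside someone else's process.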

I don't think we should depend on boost at any point, since it introduces too many compatibility
issues.  Boost simply doesn't maintain good compatibility across versions.  And then there are
issues like: what happens if the code using your library is also linking against a different
version of boost?  It just doesn't work very well.

It's important to remember that we're writing a library here that clients will use, not a
stand-alone application.  That means we need to be careful not to assume too much about the
context we're running in.  Ideally, we'd have only the dependencies that we really need, and
we'd provide the ability to shut down the library or run multiple instances of it from different
threads of the client application.
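
One way to get there is to keep all state in an explicit handle instead of in globals. The sketch below is only an assumption about how such an interface might look; {{ClientContext}} and its methods are hypothetical names, not an existing or proposed Hadoop API, and the URIs in the usage note are made up.

```cpp
#include <string>

// Hypothetical API sketch; not taken from any Hadoop code.
// All state lives in the instance, so a client application can create
// several independent contexts and shut each one down on its own.
class ClientContext {
public:
    explicit ClientContext(const std::string& nameNodeUri)
        : uri_(nameNodeUri), open_(true) {}
    ~ClientContext() { shutdown(); }

    // Idempotent: calling it more than once is harmless.
    // A real implementation would release connections and threads here.
    void shutdown() { open_ = false; }

    bool isOpen() const { return open_; }
    const std::string& uri() const { return uri_; }
private:
    ClientContext(const ClientContext&);             // non-copyable
    ClientContext& operator=(const ClientContext&);
    std::string uri_;
    bool open_;
};
```

For example, an application could hold {{ClientContext a("hdfs://nn1:8020")}} and {{ClientContext b("hdfs://nn2:8020")}} at the same time, and calling {{a.shutdown()}} would leave {{b}} usable.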

bq. Steve said: I'd like it to build on OS/X so that mac builds catch regressions, even if
it isn't for production.

Yeah, it would be nice to have a cross-platform client.  I don't have easy access to MacOS
(it's proprietary and I don't run it, although some of my co-workers do), but I do like to
compile things on FreeBSD to see how things go.  We should keep portability in mind.

bq. Although I haven't tried other test frameworks, I would recommend gtest: it is small and
convenient (just a .cc file that can be embedded into the test program). If we are using the google c++ coding
standard and protobuf, using another google framework seems natural.

Yeah, gtest would be a nice test framework for this.

> Pure native hadoop client
> -------------------------
>                 Key: HADOOP-10388
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10388
>             Project: Hadoop Common
>          Issue Type: New Feature
>            Reporter: Binglin Chang
> A pure native hadoop client has the following use cases/advantages:
> 1.  writing Yarn applications using c++
> 2.  direct access to HDFS, without extra proxy overhead, compared to the web/nfs interfaces.
> 3.  wrapping the native library to support more languages, e.g. python
> 4.  lightweight, small footprint compared to several hundred MB of JDK and hadoop library with various dependencies.
