hadoop-common-issues mailing list archives

From "Binglin Chang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-10388) Pure native hadoop client
Date Thu, 27 Mar 2014 09:12:18 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-10388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13949062#comment-13949062
] 

Binglin Chang commented on HADOOP-10388:
----------------------------------------

Thanks for posting this, Colin; I'm looking into the code right now. [~wenwu] and I both got branch
committer invitations today. He is interested in providing more tests for the feature.
About the code and created sub-jiras, here are some initial questions:
# What will the project structure look like? A separate top-level hadoop-native-client-project?
Or separate code files in the existing common/hdfs/yarn dirs?
# Why the names libhdfs-core.so and libyarn-core.so? It's a client library; it doesn't sound
like core.
# I'm surprised the code has turned to pure C. Because of this, it seems we are introducing unusual
libraries and tools (protobuf-c, whose last release was in 2011, and the tool shorten). As for the test
library, is the C++ library gtest not going to be used either? In short, what libraries are planned to be
used?
# I'd like the library to be lightweight. Some people just want a header file and a statically linked
library (a few MB in size) to be able to read from and write to HDFS, so heavy features such as an XML
library (config file parsing), URI parsing (cross-FileSystem symlinks), and a thread pool had better be
optional, not required.



> Pure native hadoop client
> -------------------------
>
>                 Key: HADOOP-10388
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10388
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: HADOOP-10388
>            Reporter: Binglin Chang
>            Assignee: Colin Patrick McCabe
>
> A pure native hadoop client has the following use cases/advantages:
> 1.  writing Yarn applications using C++
> 2.  direct access to HDFS, without extra proxy overhead, compared to the web/nfs interfaces.
> 3.  wrapping the native library to support more languages, e.g. python
> 4.  lightweight, small footprint compared to several hundred MB of JDK and hadoop libraries with various dependencies.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
