hadoop-yarn-dev mailing list archives

From "Clay B." <...@clayb.net>
Subject Re: [VOTE] Merging branch HDFS-7240 to trunk
Date Thu, 01 Mar 2018 22:45:08 GMT
Oops, retrying now that I am subscribed to more than just yarn-dev.


On Wed, 28 Feb 2018, Clay B. wrote:

> +1 (non-binding)
> I have walked through the code and find it very compelling as a 
> user; I really look forward to seeing the Ozone code mature and 
> HDFS features maturing along with it. The points which excite me 
> as an eight-year HDFS user are:
> * Excitement for making the datanode a storage technology 
> container - this
>  patch clearly brings fresh thought to HDFS keeping it from 
> growing stale
> * Ability to build upon a shared storage infrastructure for 
> diverse
>  loads: I do not want to have "stranded" storage capacity or 
> have to
>  manage competing storage systems on the same disks (and 
> further I want
>  the metrics datanodes can provide me today, so I do not have 
> to
>  instrument two systems or evolve their instrumentation 
> separately).
> * Looking forward to supporting object-sized files!
> * Moves HDFS in the right direction to test out new block 
> management
>  techniques for scaling HDFS. I am really excited to see the 
> raft
>  integration; I hope it opens a new era in Hadoop matching 
> modern systems
>  design with new consistency and replication options in our 
> ever
>  distributed ecosystem.
> -Clay
> On Mon, 26 Feb 2018, Jitendra Pandey wrote:
>>    Dear folks,
>>           We would like to start a vote to merge HDFS-7240 
>> branch into trunk. The context can be reviewed in the 
>> DISCUSSION thread, and in the jiras (See references below).
>>    HDFS-7240 introduces Hadoop Distributed Storage Layer 
>> (HDSL), which is a distributed, replicated block layer.
>>    The old HDFS namespace and NN can be connected to this new 
>> block layer as we have described in HDFS-10419.
>>    We also introduce a key-value namespace called Ozone built 
>> on HDSL.
>>    The code is in a separate module and is turned off by 
>> default. In a secure setup, HDSL and Ozone daemons cannot be 
>> started.
>>    The detailed documentation is available at
>>             https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+Distributed+Storage+Layer+and+Applications
>>    I will start with my vote.
>>            +1 (binding)
>>    Discussion Thread:
>>              https://s.apache.org/7240-merge
>>              https://s.apache.org/4sfU
>>    Jiras:
>>               https://issues.apache.org/jira/browse/HDFS-7240
>>               https://issues.apache.org/jira/browse/HDFS-10419
>>               https://issues.apache.org/jira/browse/HDFS-13074
>>               https://issues.apache.org/jira/browse/HDFS-13180
>>    Thanks
>>    jitendra
>>            On 2/13/18, 6:28 PM, "sanjay Radia" 
>> <sanjayosrc@gmail.com> wrote:
>>                Sorry, the formatting got messed up by my email 
>> client. Here it is again.
>>                Dear Hadoop Community Members,
>>                   We had multiple community discussions, a few 
>> meetings in smaller groups and also jira discussions with 
>> respect to this thread. We express our gratitude for 
>> participation and valuable comments.
>>                The key questions raised were the following:
>>                1) How do the new block storage layer and OzoneFS 
>> benefit HDFS? We were also asked to chalk out a roadmap towards 
>> the goal of a scalable namenode working with the new storage 
>> layer.
>>                2) We were asked to provide a security design.
>>                3) There were questions around stability, given 
>> that Ozone brings in a large body of code.
>>                4) Why can't they be separate projects forever, 
>> or be merged in when production ready?
>>                We have responded to all the above questions 
>> with detailed explanations and answers on the jira as well as 
>> in the discussions. We believe this should sufficiently 
>> address the community's concerns.
>>                Please see the summary below:
>>                1) The new code base benefits HDFS scaling and 
>> a roadmap has been provided.
>>                Summary:
>>                  - The new block storage layer addresses the 
>> scalability of the block layer. We have shown how the existing NN 
>> can be connected to the new block layer, and the benefits of 
>> doing so. We have shown 2 milestones; the 1st milestone is much 
>> simpler than the 2nd while giving almost the same scaling benefits. 
>> Originally we had proposed only milestone 2, and the 
>> community felt that removing the FSN/BM lock was a fair 
>> amount of work and that a simpler solution would be useful.
>>                  - We provide a new K-V namespace called Ozone 
>> FS with FileSystem/FileContext plugins to allow the users to 
>> use the new system. BTW Hive and Spark work very well on 
>> KV-namespaces on the cloud. This will facilitate stabilizing 
>> the new block layer.
>>                  - The new block layer has a new netty based 
>> protocol engine in the Datanode which, when stabilized, can be 
>> used by the old HDFS block layer. See details below on 
>> sharing of code.
>>                2) Stability impact on the existing HDFS code 
>> base and code separation. The new block layer and the OzoneFS 
>> are in modules that are separate from old HDFS code - 
>> currently there are no calls from HDFS into Ozone except for 
>> the DN starting the new block layer module if configured to do 
>> so. It does not add instability (the instability argument has 
>> been raised many times). Over time, as we share code, we will 
>> ensure that the old HDFS continues to remain stable (for 
>> example, we plan to stabilize the new netty based protocol 
>> engine in the new block layer before sharing it with HDFS's 
>> old block layer).
>>                3) In the short term and medium term, the new 
>> system and HDFS will be used side-by-side by users: 
>> side-by-side in the short term for testing, and side-by-side 
>> in the medium term for actual production use until the new 
>> system has feature parity with old HDFS. During this time, 
>> sharing the DN daemon and admin functions between the two 
>> systems is operationally important:
>>                  - Sharing DN daemon to avoid additional 
>> operational daemon lifecycle management
>>                  - Common decommissioning of the daemon and 
>> DN: One place to decommission for a node and its storage.
>>                  - Replacing failed disks and internal 
>> balancing capacity across disks - this needs to be done for 
>> both the current HDFS blocks and the new block-layer blocks.
>>                  - Balancer: we would like to use the same 
>> balancer and provide a common way to balance and common 
>> management of the bandwidth used for balancing.
>>                  - Security configuration setup - reuse the 
>> existing setup for DNs rather than a new one for an 
>> independent cluster.
>>                4) Need to easily share the block layer code 
>> between the two systems when used side-by-side. Areas where 
>> sharing code is desired over time:
>>                  - Sharing the new block layer's new netty based 
>> protocol engine with old HDFS DNs (a long-time sore issue for 
>> the HDFS block layer).
>>                  - Shallow data copy from the old system to the new 
>> system is practical only within the same project and daemon; 
>> otherwise we have to deal with security settings and coordination 
>> across daemons. Shallow copy is useful as customers migrate 
>> from old to new.
>>                  - Shared disk scheduling in the future; in 
>> the short term, a single round robin rather than 
>> independent round robins.
>>                While sharing code across projects is 
>> technically possible (anything is possible in software), it 
>> is significantly harder, typically requiring cleaner public 
>> APIs etc. Sharing within a project through internal APIs is 
>> often simpler (such as the protocol engine that we want to 
>> share).
>>                5) Security design, including a threat model 
>> and the solution, has been posted.
>>                6) Temporary Separation and merge later: 
>> Several of the comments in the jira have argued that we 
>> temporarily separate the two code bases for now and then later 
>> merge them when the new code is stable:
>>                  - If there is agreement to merge later, why 
>> bother separating now - there need to be good reasons 
>> to separate now.  We have addressed the stability and 
>> separation of the new code from existing above.
>>                  - Merging the new code back into HDFS later 
>> will be harder.
>>                    ** The code and goals will diverge further.
>>                    ** We will be taking on extra work to split 
>> and then extra work to merge.
>>                    ** The issues raised today will be raised 
>> all the same then.
>>                ---------------------------------------------------------------------
>>                To unsubscribe, e-mail: 
>> hdfs-dev-unsubscribe@hadoop.apache.org
>>                For additional commands, e-mail: 
>> hdfs-dev-help@hadoop.apache.org
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: yarn-dev-unsubscribe@hadoop.apache.org
>> For additional commands, e-mail: 
>> yarn-dev-help@hadoop.apache.org
