hadoop-mapreduce-dev mailing list archives

From Steve Loughran <ste...@cloudera.com.INVALID>
Subject Re: [VOTE] Release Apache Hadoop 3.3.1 RC3
Date Thu, 03 Jun 2021 19:51:03 GMT
Extend for a bit from the last RC, as it takes time to qualify.

I'm busy testing, doing

- packaging &c
- CLI working with abfs and s3, both fs and cloudstore library calls
- building downstream projects (so validating maven artifacts). cloudstore
and spark there
- building downstream of downstream projects, i.e. my spark cloud
IO/committer test module. Moving to spark 3 cost me the afternoon, not
through any incompatible changes there but because the upgraded scalatest
"moved" their foundational FunSuite class to a different package and name.
Not happy with Team Scalatest there.
- reviewing the docs in the -aws and -azure modules to see they link
together OK.

So far so good.

One trouble spot, which is no reason to hold up the release, is that
the table in the directory_markers markdown file doesn't render correctly.
Created https://issues.apache.org/jira/browse/HADOOP-17746.

This is *not a blocker*

I can prepare a fix and we can get it in, so that if any other changes come
in, the page will look OK.

On Thu, 3 Jun 2021 at 17:30, Wei-Chiu Chuang <weichiu@apache.org> wrote:

> Hello,
> do we want to extend the release vote? I understand a big release like this
> takes time to validate.
> I am aware a number of people are testing it: Attila tested Ozone on Hadoop
> 3.3.1 RC3, Stack is testing HBase, Chao tested Spark.
> I also learned that anecdotally Spark on S3 on Hadoop 3.3 is faster by 20%
> over Hadoop 3.2 library.

ooh. That'll be from Mukund's listing improvements translating into query
planning speedups.


If someone benchmarking this stuff were to enable directory marker retention
(fs.s3a.directory.marker.retention=keep), I'd be interested to know how much
speedup that delivers on versioned and unversioned buckets.

Unversioned: reduces the risk of IO throttling on writes
Versioned: the same, and it should also stop subsequent LIST operations from
being slowed down by all the tombstones
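
For anyone setting up that comparison, here is a minimal sketch, not a tested benchmark harness. It assumes the job is launched through spark-submit, where Spark copies any spark.hadoop.* property into the Hadoop configuration. The helper names marker_conf_args and speedup_percent are made up for illustration, and "speedup" is taken to mean baseline time over candidate time, minus one.

```python
from typing import List

# Values accepted by fs.s3a.directory.marker.retention in Hadoop 3.3.1.
VALID_POLICIES = ("delete", "keep", "authoritative")


def marker_conf_args(policy: str) -> List[str]:
    """Return spark-submit arguments that select an S3A marker policy.

    Hypothetical helper: relies on Spark passing spark.hadoop.* options
    through to the Hadoop configuration used by the S3A connector.
    """
    if policy not in VALID_POLICIES:
        raise ValueError(f"unknown marker retention policy: {policy!r}")
    return [
        "--conf",
        f"spark.hadoop.fs.s3a.directory.marker.retention={policy}",
    ]


def speedup_percent(baseline_seconds: float, candidate_seconds: float) -> float:
    """Percentage speedup of the candidate run over the baseline run.

    Assumed definition: (baseline / candidate - 1) * 100, so a candidate
    that takes 100s against a 120s baseline is 20% faster.
    """
    if baseline_seconds <= 0 or candidate_seconds <= 0:
        raise ValueError("timings must be positive")
    return (baseline_seconds / candidate_seconds - 1.0) * 100.0


if __name__ == "__main__":
    # Run the same query once per policy, then compare wall-clock times.
    print(" ".join(marker_conf_args("delete")))
    print(" ".join(marker_conf_args("keep")))
```

Running the same workload once with "delete" and once with "keep", on both a versioned and an unversioned bucket, gives the four timings needed for the comparison. Note that "keep" should only be used where every client of the bucket is marker-aware.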

> Looks like we may need some more time to test. How about extending it by a
> week?

That would be good. This week included some holidays for people in the
US/UK which is why I'm a bit behind on my testing.

