spark-dev mailing list archives

From Jakub Wozniak <>
Subject Re: [VOTE] Release Apache Spark 2.4.1 (RC2)
Date Thu, 07 Mar 2019 14:57:00 GMT

I have a question regarding the 2.4.1 release.

It looks like Spark 2.4 (and the 2.4.1 RC) is not fully compatible with HBase 2.x+ when running on YARN.
The problem is in the class
that expects a specific version of the TokenUtil class from HBase, whose API changed between HBase
1.x and 2.x.
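To illustrate the incompatibility: the two HBase lines expose obtainToken with different parameter types, so code compiled against one signature fails to link against the other, and one workaround is to resolve the overload reflectively at runtime. Below is a minimal, self-contained sketch of that idea; the OldStyleTokenUtil/NewStyleTokenUtil classes and their String/Integer parameters are hypothetical stand-ins, not the real HBase types, and this is not necessarily how Spark master fixed it.

```java
import java.lang.reflect.Method;

// Stand-ins for the two incompatible TokenUtil shapes. The real HBase classes
// take HBase-specific types; String and Integer are used here only so the
// sketch is self-contained and runnable.
class OldStyleTokenUtil {            // models the HBase 1.x API surface
    public static String obtainToken(String configuration) { return "token-via-conf"; }
}
class NewStyleTokenUtil {            // models the HBase 2.x API surface
    public static String obtainToken(Integer connection) { return "token-via-conn"; }
}

public class ReflectiveTokenFetch {
    // Pick whichever obtainToken overload accepts the given argument at
    // runtime, instead of linking against one fixed signature at compile time.
    static String fetch(Class<?> tokenUtil, Object arg) {
        try {
            for (Method m : tokenUtil.getMethods()) {
                if (m.getName().equals("obtainToken")
                        && m.getParameterCount() == 1
                        && m.getParameterTypes()[0].isInstance(arg)) {
                    return (String) m.invoke(null, arg);
                }
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        throw new IllegalArgumentException("no matching obtainToken overload");
    }

    public static void main(String[] args) {
        System.out.println(fetch(OldStyleTokenUtil.class, "hbase-site.xml")); // token-via-conf
        System.out.println(fetch(NewStyleTokenUtil.class, 42));               // token-via-conn
    }
}
```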
On top of that, the HadoopDelegationTokenManager does not use the ServiceLoader mechanism, so I cannot
attach my own provider (the providers are hardcoded).
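For what it's worth, a ServiceLoader-based scheme would let third parties register providers without patching Spark, roughly like the sketch below. The DelegationTokenProvider interface here is a hypothetical stand-in for illustration, not Spark's actual trait.

```java
import java.util.ServiceLoader;

// Hypothetical stand-in for a pluggable token-provider contract;
// it does not model the API of Spark's real provider trait.
interface DelegationTokenProvider {
    String serviceName();
}

public class ProviderLoaderDemo {
    // Count the implementations discovered on the classpath. ServiceLoader
    // looks for files named META-INF/services/DelegationTokenProvider that
    // list implementation class names, one per line.
    static int countProviders() {
        int count = 0;
        for (DelegationTokenProvider p : ServiceLoader.load(DelegationTokenProvider.class)) {
            System.out.println("loaded provider: " + p.serviceName());
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // With no registration file on the classpath, nothing is discovered;
        // dropping a jar with such a file onto the classpath is all it takes
        // to add a provider, which is the point of the mechanism.
        System.out.println("providers found: " + countProviders());
    }
}
```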

It seems that both problems are resolved on the Spark master branch.

Is there any reason not to include this fix in the 2.4.1 release?
If so, when do you plan to release it (the fix for HBase)?

Or maybe there is something I’ve overlooked; please correct me if I’m wrong.

Best regards,

On 7 Mar 2019, at 03:04, Saisai Shao <<>> wrote:

Do we have any other blocker/critical issues for Spark 2.4.1, or are we waiting for something to be fixed? I
roughly searched JIRA, and it seems there are no blocker/critical issues marked for 2.4.1.


shane knapp <<>> wrote on Thu, Mar 7, 2019:
i'll be popping in to the sig-big-data meeting on the 20th to talk about stuff like this.

On Wed, Mar 6, 2019 at 12:40 PM Stavros Kontopoulos <<>> wrote:
Yes, it's a tough decision, and as we discussed today (
"Kubernetes' support window is 9 months, Spark's is two years". So we may end up with old client
versions on branches still supported, like 2.4.x, in the future.
That gives us no choice but to upgrade if we want to be on the safe side. We have tested
3.0.0 with 1.11 internally and it works, but I don't know what it means to run with old client versions.

On Wed, Mar 6, 2019 at 7:54 PM Sean Owen <<>> wrote:
If the old client is basically unusable with the versions of K8S
people mostly use now, and the new client still works with older
versions, I could see including this in 2.4.1.

Looking at
it seems like the 4.1.1 client is needed for 1.10 and above; however,
it no longer supports 1.7 and below.
We have 3.0.x, and versions through 4.0.x of the client support the
same K8S versions, so there is no real middle ground here.

1.7.0 came out June 2017, it seems. 1.10 was March 2018. Minor release
branches are maintained for 9 months per

Spark 2.4.0 came out in Nov 2018. I suppose we could say it should have
used the newer client from the start, as at that point (?) 1.7 and
earlier were already at least 7 months past EOL.
If we update the client in 2.4.1, versions of K8S as recently
'supported' as a year ago won't work anymore. I'm guessing there are
still 1.7 users out there? That wasn't that long ago but if the
project and users generally move fast, maybe not.

Normally I'd say, that's what the next minor release of Spark is for;
update if you want later infra. But there is no Spark 2.5.
I presume downstream distros could modify the dependency easily (?) if
needed and maybe already do. It wouldn't necessarily help end users.

Does the 3.0.x client not work at all with 1.10+, or is it just unsupported?
If it 'basically works but with no guarantees', I'd favor not updating. If
it doesn't work at all, hm, that's tough. I think I'd favor updating
the client, but it's a tough call both ways.

On Wed, Mar 6, 2019 at 11:14 AM Stavros Kontopoulos wrote:
> Yes, Shane Knapp has done the work for that already, and the tests pass; I am working
> on a PR now. I could submit it for the 2.4 branch.
> I understand that this is a major dependency update, but the problem I see is that the
> client version is so old that I don't think it makes
> much sense for current users who are on k8s 1.10, 1.11, etc.
> (3.0.0 does not even exist in there).
> I don't know what it means to use that old version with current k8s clusters in terms
> of bugs etc.

Shane Knapp
UC Berkeley EECS Research / RISELab Staff Technical Lead<>
