Hey Hyukjin,

Apologies for sending this to you twice. :o)

On Tue, Aug 6, 2019 at 9:55 AM Hyukjin Kwon <email@example.com> wrote:

Myrle,
> We need to balance two sets of risks here. But in the case of access to our software artifacts, the risk is very small, and already has *multiple* mitigating factors, from the fact that all changes are tracked to an individual, to the fact that there are notifications sent when changes are made, (and I'm going to stop listing the benefits of a modern source control system here, because I know you are aware of them), on through the fact that you have automated tests, and continuing through the fact that there is a release process during which artifacts get checked again.
> If someone makes a commit who you are not expecting to make a commit, or in an area you weren't expecting changes in, you'll notice that, right?
> What you're talking about here is your security model for your source repository. But restricting access isn't really the right security model for an open source project.

I don't quite get the argument about commit bit. I _strongly_ disagree about "the risk is very small,".
Not all of committers track all the changes. There are so many changes in the upstream and it's already overhead to check all.
Do you know how many bugs Spark faces due to such lack of reviews that entirely blocks the release sometimes, and how much it takes time to fix up such commits?
We need expertise and familiarity to Spark.

Let's unroll that a bit. Say that you invite a non-coding contributor to be a committer. To make an inappropriate commit two things would have to happen: this person would have to decide to make the commit, and this person would have to set up access to the git repository, either by enabling gitbox integration, or by accessing the Apache git repository directly. Before you invite them you make an estimation of the probability that they would do the first: that is, decide to make an inappropriate commit. You decide that that is fairly unlikely. But for a non-coding contributor the chance of them actually going through the mechanics of making a commit is even lower. I think we can safely assume that the chance of this being done by someone who you've determined is committed to the community and knows their limits is simply 0.00%.

That leaves the question of what the chance is that this person will leak their credentials to a malicious third party intent on introducing bugs into Spark code. Do you believe there are such malicious third parties? How many attacks have there been on Spark committer credentials? I believe the likelihood of this happening is 0.00% (but I am willing to be swayed by evidence otherwise -- though if it's out there, it should probably be discussed on the private@ list. :o)

But let's say I'm wrong about both of those probabilities. Let's say the combined probability of one of those two things happening is actually 0.01%. This is where the advantages of modern source control and tests come in. Even if there's only a 50% chance that watching commits will catch the error, and only a further 50% chance that tests will catch the error, and only a further 50% chance that the error will be caught in release testing, those chances multiply out to 0.00125%.

Based on those guesstimates the risk is somewhere between 0.00% and 0.00125%. The risk is very small.
You take bigger risks every day in order to move your project forward.

It virtually means we will add some more overhead to audit each commit, even for committers'. Why should we bother add such overhead to harm the project?
To me, this is the most important fact. I don't think we should just count the number of positive and negative ones.

Based on this argumentation you will never invite any committers or even merge any pull requests.

But you do invite committers and you do merge pull requests because it's good for your project. Because the risk of doing nothing is greater.

For other reasons, we can just add or discuss about the "this kind of in-between status Apache-wide", which is a bigger scope than here. You can ask it to ASF and discuss further.

I can say with considerable confidence: There will be no "in-between" status Apache-wide. But if you disagree, and want to start a discussion to suggest that, firstname.lastname@example.org is a good place to go with it.

Best Regards,
Myrle