sqoop-dev mailing list archives

From jar...@apache.org
Subject Re: Sqoop CI changes
Date Fri, 14 Apr 2017 15:32:11 GMT
My apologies for jumping on this email thread a bit late. As I wrote the original pre-commit
hook for Sqoop 2, I wanted to provide a bit of context. Well, I have to say that I did not
really write the hook myself, since I’ve “borrowed” a lot of code from other Apache
projects using the hook, but that is a different story :)

The first version of the hook ran just the unit tests and nothing more. It did, however, have
an immediate impact. At that time I was the primary reviewer for most stuff and it unblocked
all contributors, as they got almost immediate feedback. It was far from perfect, but it was
much better than nothing. It took a few months of iterating on the hook to really bring it to where
we wanted it to be.

Hence if someone has time to look into enabling the same for Sqoop 1, I would highly encourage
that :) I’m even happy to help explain how the hook works in Sqoop 2. Introducing it
will be an iterative process anyway, so the third party tests can be added whenever the infrastructure
is ready for them.

Jarcec

> On Apr 7, 2017, at 12:57 PM, Attila Szabó <maugli@apache.org> wrote:
> 
> Hey Anna,
> 
> Thanks for your quick reply!
> 
> I'm not sure I follow you on the Kudu team topic, but if that means
> either that we'd like to include some of their working solutions in our CI
> system, or that we'd like to do a working Kudu integration, for which we
> need better CI, I would say +1 for both of them, and would be more than
> happy to help your efforts on that front. :-)
> 
> Briefly, my original idea for the 3rd party automation would be:
> - As a step 0, fire up docker containers with the related DB dependencies
> before the whole cycle or before a well-organized group of tests (e.g. one
> group for MySQL, one for Oracle, etc.)
> - After the tests have executed, terminate the containers.
> 
> Other solutions would also be possible, like:
> - Somehow getting access and resources from the Apache community to have
> dedicated/global CI DB servers for Sqoop
> - Ask any of the Hadoop vendors or the Apache sponsors to provide us with
> pre-installed server infrastructure
> - Acquire server instances and deploy them with Terraform+Ansible, etc.
> 
> The problem with this "second approach" is that it would need external
> resources involved and we would need someone who would sponsor our HW
> resources. On the other hand the benefits would be that on real bare metal
> resources we would be able to perform meaningful performance tests in the
> future as well.
> 
> The original plan:
> I have no concerns, but I would still like to highlight that, IMHO, only the
> pre-commit hook is urgent, and the rest of your energy would be
> better focused on finishing the new build system, or rather on solving the
> 3rd party resource problem we've discussed above.
> 
> And as always: if you would need my help in any of your efforts please do
> not hesitate to ping me here, or on the related JIRA task.
> 
> My 2cents,
> Attila
> 
> On Fri, Apr 7, 2017 at 8:34 PM, Anna Szonyi <szonyi@cloudera.com> wrote:
> 
>> Hey Attila,
>> 
>> Thanks for your input! I agree that adding the 3rd party tests to the CI
>> would be really beneficial. As there are some resourcing problems that need
>> to be solved for that to happen (I started a discussion with the Kudu team,
>> however if you have any specific help to add there, I would be very happy
>> to accept any assistance with that) - I'll start a separate thread where we
>> can discuss this issue with the community, see if anyone else has any
>> inputs on it :).
>> 
>> However if no one has any objections about the original proposal, I will
>> follow through with that in the meantime.
>> 
>> Thanks and Regards,
>> Anna
>> 
>> 
>> On Fri, Apr 7, 2017 at 1:04 PM, Attila Szabó <maugli@apache.org> wrote:
>> 
>>> Hello everyone,
>>> 
>>> First of all I'd like to thank Anna for being willing to invest some effort
>>> in making our CI system better, and sorry for my delayed answer (however I
>>> do still hope it helps regardless of the stage of the ongoing efforts).
>>> 
>>> I'd like to share the following thoughts here:
>>> - Although it would make sense to eliminate the four (right now totally
>>> identical) CI cycles and keep only one, apart from some "static noise"
>>> they don't cause any serious issues for our current commit flow.
>>> - Creating a pre-commit hook sounds like a great idea; I would encourage
>>> the community to move down that path.
>>> - However, in my humble opinion the biggest problem with the
>>> current CI is that it doesn't execute the so-called "3rd party tests"
>>> (which are generally our DB integration test layer), and thus it provides
>>> only a limited safety net for us (and we've seen quite a few regressions
>>> on this front in the past year). Although we do have a solution for
>>> running those tests manually from the command line, it's quite difficult to
>>> set up and run them from a single desktop, which causes serious
>>> difficulties in validating a changeset before commit.
>>> - Another issue we have on this front is our build
>>> system/scripts/mechanism, which again can slow down the dev+commit flow.
>>> Although we've started efforts on this front, the final solution has not
>>> been fully delivered yet.
>>> 
>>> As a conclusion of the elements above:
>>> Anna! Would you mind focusing first on the build scripts and the "3rd
>>> party" CI automation instead of eliminating the obsolete stuff? IMHO that
>>> would be a much better use of your efforts and would have a much
>>> bigger impact for the community.
>>> 
>>> With my kindest regards,
>>> Attila
>>> 
>>> On Mar 28, 2017 2:44 PM, "Erzsebet Szilagyi" <liz.szilagyi@cloudera.com>
>>> wrote:
>>> 
>>>> Great ideas!
>>>> 
>>>> I agree with Bogi and Szabolcs on the redundant test jobs.
>>>> 
>>>> Would this pre-commit hook launch the same process as the current
>>>> post-commit hook, or would this do something different?
>>>> I think in the first case we could rework the post-commit check into the
>>>> pre-commit hook; in the latter I'm curious about what exactly this check
>>>> would add.
>>>> In general I support the idea: we have seen a number of problems that could
>>>> have been avoided, so this should be a very useful change!
>>>> 
>>>> Thank you,
>>>> Liz
>>>> 
>>>> On Fri, Mar 24, 2017 at 9:39 AM, Szabolcs Vasas <vasas@cloudera.com>
>>>> wrote:
>>>> 
>>>>> Hi Anna,
>>>>> 
>>>>> Removing the redundant test execution jobs sounds great, I think you can
>>>>> go ahead with that.
>>>>> 
>>>>> Regarding the pre-commit hook: what would be the purpose of it exactly?
>>>>> Would it execute the unit tests before the patch is committed?
>>>>> 
>>>>> Regards,
>>>>> Szabolcs
>>>>> 
>>>>> On Thu, Mar 23, 2017 at 4:03 PM, Anna Szonyi <szonyi@cloudera.com> wrote:
>>>>> 
>>>>>> Hi All,
>>>>>> 
>>>>>> I would like to make the following changes to the Sqoop CI system:
>>>>>> Disable the SCM polling for the Sqoop-hadoop23, Sqoop-hadoop20 and
>>>>>> Sqoop-hadoop100 jobs (and later delete the jobs themselves),
>>>>>> as the current trunk version of Sqoop no longer contains these profiles,
>>>>>> so these runs are redundant.
>>>>>> 
>>>>>> I would also like to propose the creation of a pre-commit hook for Sqoop
>>>>>> (like the existing one for Sqoop2).
>>>>>> 
>>>>>> Please let me know if you have any objections.
>>>>>> 
>>>>>> Thanks,
>>>>>> Anna
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> Szabolcs Vasas
>>>>> Software Engineer
>>>>> <http://www.cloudera.com>
>>>>> 
>>>> 
>>> 
>> 

