airflow-dev mailing list archives

From Kamil Breguła <kamil.breg...@polidea.com>
Subject Re: [DISCUSS] Moving stuff from CWiki to Github ?
Date Tue, 04 Aug 2020 05:36:17 GMT
I think that each of these tools has its own purpose and strengths.
- Github Issues allow us to gather detailed information on many issues and
discuss specific topics. This seems rather obvious to everyone.
- Github Labels allow us to group cards based on topic, category or other
criteria, e.g. operators or security, which lets us find a ticket without
looking at the title. The labels don't have any end date. This too is
commonly used by us.
- Github Projects allows us to plan our work, but I think it works best
when it does not hold all the tasks, only those that are planned to be
implemented in a short period of time, i.e. a sprint - one or two weeks,
or one month. There shouldn't be a lot of cards in there because it's easy
to get lost. At the end of each time period, all cards should be in the
done section.
- Github Milestone allows us to group all tasks into one large bag. Here we
can have up to 200 cards and can describe even the smallest tickets. It is
also important that Milestone has an end, and it should be closed one day.
In practice, this can be replaced by Github labels.
- Github Wiki: This is an alternative to cwiki and may contain anything
that does not fit the other tools. I think we should limit the number of
documents, because I suspect not many people will look here: we are
code-oriented, so it's best to keep everything close to the code.

There is still the subject of various documents that we use in the project.
I can distinguish the following types:

a) Meeting notes: This document can be prepared before the meeting to
present the agenda or briefly introduce the topic that will be discussed.
Internally in Polidea, we call this document "Pre-demo docs".
I think it's good when it has 3 sections:
- What changed? It's good for each person to describe what they managed to
do between the previous meeting and the current one.
- Open questions/Discussion topics
- What will happen in the future? Action points/Plan
I like it when these documents are updated by one person - the meeting
organizer. If you are not interested in the topic of the meeting on a given
day, you can skip it and just read the note.

b) Status Worksheet: It contains information about the criteria that must
be met in order to achieve a given goal. It's based on design docs and
contains the high-level status of the entire project. There is only one
such sheet for an entire project, from its beginning to its end, and it is
good to keep it updated at regular meetings or when a person has completed
a task. Everyone can contribute to this document.
Example:
https://github.com/apache/airflow/issues/10085

c) Design docs/AIP: These documents describe detailed information about a
project. This document describes all design decisions and allows new people
to learn about the topic easily. This document is valuable for a long time
and it is worth keeping an archive of these documents, but most of the
information in this document should also be found in the end-user
documentation. In some cases, this information may be contained in an
isolated "Internals" section. The author is responsible for maintaining
this document. In some cases, a document may have multiple authors, but I
think it's worth asking the original author for permission.
Example:
https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals

d) Other information documents: We should limit their number. When it is
possible, archive them or move them to other places, e.g. website or project
documentation.

I propose to use the following set of tools:
- Design docs/AIP: If needed. For the Airflow 2.0 release, I don't think we
need it. They should be on the wiki. We discuss these documents according
to Apache rules on the mailing list.
- Status worksheet/Meta-issue: I think it is worth starting to use it so
that people who do not follow the activity of the entire project can find
out about its status. In this document, we can also briefly outline the
rules of contributions. Some people want to work in public - in the
community. Others prefer to work on private branches and show only the
final effect. These two models of contribution are fine as long as we don't
have a misunderstanding of expectations. I think that such a ticket should
be created after accepting a given AIP. It is a good idea to make such a
sheet as close as possible to the tasks and include references to them. So
the natural choice is the Github Issue. Such a ticket was required before
in the AIP template, but there was no recommendation as to its content.
- Meeting notes: If any SIG groups need them, they should use them. We can
recommend that they are prepared because it eliminates the barriers to
joining such meetings and, if meetings are regular, they will force people
to be prepared for meetings. However, it all depends on the specifics of
each project. Personally, I don't like face-to-face meetings and prefer
other ways of working together. I would not like the meetings to be
compulsory because I am afraid that this could create a very large barrier
to entry. I think all topics can also be discussed asynchronously.

- Github Milestone: If any SIG groups need it, they should use it. For
Airflow as a whole, it's worth keeping track of releases this way.
- Github Projects: If any SIG groups need it, they should use it. For
Airflow as a whole, we don't need it.
- Github Issue for issues and meta-issues/status worksheets.
- Github Wiki/Wiki/Separate repo for all other docs, e.g. AIPs, meeting
notes.

I hope we are on the same page now and I am very curious about your
opinions.



On Mon, Aug 3, 2020 at 7:28 PM Jarek Potiuk <Jarek.Potiuk@polidea.com>
wrote:

> That's an interesting idea - but I think there is one problem with that -
> it will get outdated much more easily. This is the main reason why docs
> should be as close to code as possible - when you change something in the
> code, like a refactor etc., it's much easier to also modify the
> documentation - and the contributing doc has a lot of references to some
> parts of the code that change quite a bit. The only easy way to change the
> wiki is via the github interface or API, so I'd prefer to keep it in the
> code.
>
> BTW. I do not have a strong opinion on whether we should move to Github
> Wiki but I propose - let's clean up the CWIKI first and see if it's worth
> moving it.
>
>
> J.
>
> On Mon, Aug 3, 2020 at 5:08 PM Tomasz Urbaszek <turbaszek@apache.org>
> wrote:
>
> > I'm ok with the proposed approach.
> >
> > Btw. if we want to use Github wiki then maybe we can move all the
> > contributing guides etc. there? I personally found it easier to navigate
> > between small pages than long documents with no easy access to the
> > navigation menu.
> >
> > T.
> >
> >
> > On Mon, Aug 3, 2020 at 4:58 PM Jarek Potiuk <Jarek.Potiuk@polidea.com>
> > wrote:
> > >
> > > Tomek, I would love to leave aside ADR snapshots for now - we can have a
> > > separate discussion later about how to do it.
> > >
> > > I think we need to clean up the mess we have now in CWIKI. I would love to
> > > focus only on this for now.
> > >
> > > I think we all agree we should only keep planning/overview of the effort.
> > > However, currently, it's NOT the case for Airflow 2.0 and Cwiki (we have
> > > way too much overlapping information):
> > >
> > > 1) https://cwiki.apache.org/confluence/display/AIRFLOW/Roadmap - general
> > > outdated roadmap
> > > 2) https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+2.0 -
> > > High-level features (pretty much the same as in
> > > https://github.com/apache/airflow/issues/10085)
> > > 3) https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+2.0+-+Planning -
> > > this is I think a bit different - recently updated by Kaxil and it is
> > > mostly about the general approach we should take (and overlap with the
> > > rest is minimal).
> > >
> > > My proposal:
> > >
> > > 3) should stay and we should archive/deprecate 1) and 2).
> > >
> > > I also have the feeling that we must archive all the stuff that is
> > > outdated because it is really confusing now which information is outdated
> > > and which is not.
> > >
> > > My proposal:
> > >
> > > * AIPs - we keep them for now as they are (and make it part of the later
> > > ADR discussion).
> > > * Airflow Links - archive & deprecate
> > > * Airflow Release Planning - we leave it for now and update it as part of
> > > 2.0 discussion
> > > * Building Docs - archive & deprecate
> > > * Releasing Airflow - I can move it to "dev" together with planned backport
> > > release doc updates
> > > * Announcements - leave it for now (maybe forever)
> > > * API conventions - archive & deprecate
> > > * Committers/Committer's Guide -> archive & deprecate (review if some
> > > information can be moved to CONTRIBUTING.rst)
> > > * Common Pitfalls -> move it to docs
> > > * Community Guidelines, contributor's Guide -> archive & deprecate (review
> > > if some information can be moved to CONTRIBUTING.rst)
> > > * First time contributor's workshop -> move it as blog to
> > > airflow.apache.org
> > > * File lists -> move it to Airflow repo as resources.
> > > * Meeting notes -> archive & deprecate.
> > > * Meetups -> archive & deprecate.
> > > * Product requirements, Roadmap Airflow 2.0 -> archive & deprecate
> > > * Roles -> archive & deprecate (some stuff moved to CONTRIBUTING.rst)
> > > * Scheduler Basics -> move it to docs
> > > * Season of Docs 2019 -> archive & deprecate
> > >
> > > Does it sound reasonable? Does anyone think some other things should stay?
> > > I am happy to do it if no-one objects.
> > >
> > > J.
> > >
> > >
> > > On Mon, Aug 3, 2020 at 3:42 PM Tomasz Urbaszek <turbaszek@apache.org>
> > wrote:
> > >
> > > > I agree with Ry. Moreover, we should adjust cwiki information after a
> > > > discussion on devlist/github. So I think in the case of AIPs the cwiki
> > > > should work as something like architecture decision records.
> > > > However, I'm afraid that there will be no way to automate or enforce
> > > > this synchronization.
> > > >
> > > > Tomek
> > > >
> > > >
> > > > On Mon, Aug 3, 2020 at 3:32 PM Ry Walker <ry@rywalker.com> wrote:
> > > > >
> > > > > I'd say the cwiki should provide an overview of the effort, as it does
> > > > > now, and that we should keep track of the work in a Github project using
> > > > > github issues. The cwiki should link to that project board as the source
> > > > > of truth for project status. This will help the wiki page to be perceived
> > > > > as up to date as it won't need to be updated with each bit of progress.
> > > > >
> > > > > -Ry
> > > > >
> > > > >
> > > > > On Mon, Aug 3, 2020 at 4:29 AM Jarek Potiuk <Jarek.Potiuk@polidea.com>
> > > > > wrote:
> > > > >
> > > > > > Yeah. After experimenting a bit with it - seems that Wiki in Github is
> > > > > > a bit of an "abandoned" place, and omissions like the lack of
> > > > > > auto-linking of issues and PRs are a big bummer.
> > > > > >
> > > > > > Kamil - would you mind re-creating the issue based on the old issue? I
> > > > > > - unfortunately - added all "Apache Committers" to it so we cannot
> > > > > > re-open it.
> > > > > >
> > > > > > But I have another question here:
> > > > > >
> > > > > > 1) Should we remove
> > > > > > https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+2.0+-+Planning
> > > > > > completely?
> > > > > > 2) More than that - should we archive and move everything else from
> > > > > > CWiki to Github Issues?
> > > > > >
> > > > > > I think it will be very confusing (especially for new contributors) if
> > > > > > we keep some information in CWiki but also start using Github Issues
> > > > > > for a similar purpose. So I would be for archiving all content in the
> > > > > > CWiki and moving it all to Issues.
> > > > > >
> > > > > > I took a look at the kind of documents we have in CWiki and we have a
> > > > > > LOT of information there that is outdated or could live elsewhere.
> > > > > > Here are my proposals:
> > > > > >
> > > > > > * Airflow 2.0 planning - we could completely move it to "Airflow 2.0
> > > > > > Release" issue
> > > > > > * AIPs - we could keep all the completed AIPs there (and keep the
> > > > > > "proposed" ones for the future) but we could move all the "active"
> > > > > > AIPs to Github Issues and add all the new AIPs there.
> > > > > > * Airflow Links - we can abandon it (it's already abandoned in fact -
> > > > > > last update May 2017)
> > > > > > * Airflow Release Planning - we could review it and turn it into a
> > > > > > "meta" issue - it has a lot of information about pre-1.10 releases
> > > > > > which we can remove (and we will have to redefine it after we agree
> > > > > > on the release schedule and versioning for the 2.* series)
> > > > > > * Building Docs - is outdated
> > > > > > * Releasing Airflow - I think we can move it to Airflow's source code
> > > > > > in "dev" folder (like I did for the Backport Packages)
> > > > > > * Announcements -> that one we might do on "airflow.apache.org" site
> > > > > > as a Blog post ?
> > > > > > * API conventions - outdated
> > > > > > * Committers/Committer's Guide -> we could have it in the
> > > > > > "CONTRIBUTING.rst" documentation of Airflow (some of the information
> > > > > > there is not valid anyway and CONTRIBUTING documentation is much more
> > > > > > updated)
> > > > > > * Common Pitfalls -> I think that one belongs to the documentation of
> > > > > > Airflow not to Wiki and we could select/move some still valid
> > > > > > information from there to the documentation
> > > > > > * Community Guidelines, contributor's Guide -> this all in
> > > > > > CONTRIBUTING.rst
> > > > > > * First time contributor's workshop -> this can be moved to
> > > > > > "airflow.apache.org" as a Blog Post.
> > > > > > * File lists -> those files can be all added to the airflow
> > > > > > repository in "resources" folder or smth.
> > > > > > * Meeting notes - Those could be added to relevant issues in GitHub.
> > > > > > We could have "meta" issues for "special interest groups" and add
> > > > > > meeting notes there.
> > > > > > * Meetups -> already part of airflow.apache.org
> > > > > > * Product requirements, Roadmap Airflow 2.0 -> this all could be moved
> > > > > > to "meta" issues
> > > > > > * Roles -> should be added to CONTRIBUTING.rst
> > > > > > * Scheduler Basics -> should be part of Airflow Documentation
> > > > > > * Season of Docs 2019 -> we can archive it.
> > > > > >
> > > > > > We could also use Github Wiki to only have "Index" of all important
> > > > > > issues that are "permanent" - Airflow 2.0 roadmap, Special interest
> > > > > > groups, AIPs,
> > > > > >
> > > > > > Let me know what you think?
> > > > > >
> > > > > > J.
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Sun, Aug 2, 2020 at 12:31 PM Kaxil Naik <kaxilnaik@gmail.com>
> > > > wrote:
> > > > > > >
> > > > > > > I agree with Tomek and feel Github issues ("meta"-issue) is a better
> > > > > > > place than Github Wiki.
> > > > > > >
> > > > > > > On Sun, Aug 2, 2020 at 11:26 AM Tomasz Urbaszek <turbaszek@apache.org>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > I see the advantage of having no comments in wiki but in the longer
> > > > > > > > run, I think this will create confusion. Where should I discuss a
> > > > > > > > particular thing? On devlist? Slack? In issue? How should a new
> > > > > > > > contributor know this?
> > > > > > > >
> > > > > > > > After giving some thought to that I'm leaning towards the meta-issue:
> > > > > > > > - they are clear (no need to go to wiki)
> > > > > > > > - they give the possibility to link other issues/PRs, which show their
> > > > > > > > content on hover
> > > > > > > > - this is a great advantage as we can see how our work is
> > > > > > > > interconnected
> > > > > > > > - having an issue makes it explicit where contributors should leave
> > > > > > > > their comments
> > > > > > > >
> > > > > > > > No matter what we decide, we should strive to limit the places where
> > > > > > > > information is available.
> > > > > > > >
> > > > > > > > Bests,
> > > > > > > > Tomek
> > > > > > > >
> > > > > > > >
> > > > > > > > On Sun, Aug 2, 2020 at 12:00 PM Jarek Potiuk <Jarek.Potiuk@polidea.com>
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > Question. Should we move over Airflow 2.0 Status and other
> > > > > > > > > "permanent" information to Github Wiki? See here for example:
> > > > > > > > > https://github.com/apache/airflow/wiki/Airflow-2.0
> > > > > > > > >
> > > > > > > > > The discussion originated by Kamil creating an issue for Airflow 2.0 -
> > > > > > > > > which was essentially overriding the page we had in
> > > > > > > > > https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+2.0+-+Planning
> > > > > > > > > and adding more "status" information in
> > > > > > > > > https://github.com/apache/airflow/issues/10085. This was more of a
> > > > > > > > > "meta" issue as it has a lot of unrelated issues / projects mentioned
> > > > > > > > > - the only common thing for those was that it was "Airflow 2.0". But
> > > > > > > > > we already have "Milestone 2.0" and CWIKI page.
> > > > > > > > >
> > > > > > > > > My proposal was that since we have 2.0 Milestone already we should use
> > > > > > > > > this one to mark issues for 2.0 and in order to keep
> > > > > > > > > Roadmap/Plans/Status we can use Github's Wiki instead. IMHO it is much
> > > > > > > > > better as it does not allow comments - which is good IMHO. For this
> > > > > > > > > kind of "permanent" pages, comments and discussion should happen for
> > > > > > > > > the individual issues not for the page itself (especially when you do
> > > > > > > > > not have in-line comments).
> > > > > > > > >
> > > > > > > > > And this page should always be "current" - with the old roadmap in
> > > > > > > > > CWIKI and the issue 10085 when you add comments, you quickly lose
> > > > > > > > > track whether the comments are more important than the overview, and
> > > > > > > > > how accurate the "overview" is. When you just edit the wiki - you
> > > > > > > > > always do it deliberately - because you want to update status rather
> > > > > > > > > than make a comment or discuss.
> > > > > > > > >
> > > > > > > > > So I created this as copy of the issue:
> > > > > > > > > https://github.com/apache/airflow/wiki/Airflow-2.0 so that we can
> > > > > > > > > compare it - can you please compare it with
> > > > > > > > > https://github.com/apache/airflow/issues/10085 and voice your opinion
> > > > > > > > > what's better?
> > > > > > > > >
> > > > > > > > > I think it's also a great opportunity to archive a lot of the old and
> > > > > > > > > not up-to-date content from the old Wiki and migrate it to GitHub. We
> > > > > > > > > could move AIPs to Github issues (as needed) - AIPs are fine for
> > > > > > > > > discussion/issues/comments, but when they got implemented we could
> > > > > > > > > move it over to wiki as "Implemented" status for history.
> > > > > > > > >
> > > > > > > > > Let me know what you think.
> > > > > > > > >
> > > > > > > > > BTW. PLEASE do NOT comment on that #10085 issue (it's now locked and
> > > > > > > > > closed). I accidentally (shame on me) notified all Apache Committers.
> > > > > > > > > Happened twice today (also for someone else) so I opened a ticket to
> > > > > > > > > Infra to restrict that (if only possible) because it's all too easy to
> > > > > > > > > notify everyone @Apache. If you comment there 3K+ people get
> > > > > > > > > notified.
> > > > > > > > >
> > > > > > > > > But feel free to upvote the infra ticket:
> > > > > > > > > https://issues.apache.org/jira/browse/INFRA-20623
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > J.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Jarek Potiuk
> > > > > > > > > Polidea | Principal Software Engineer
> > > > > > > > >
> > > > > > > > > M: +48 660 796 129
> > > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > >
> > > > > > Jarek Potiuk
> > > > > > Polidea | Principal Software Engineer
> > > > > >
> > > > > > M: +48 660 796 129
> > > > > >
> > > >
> > >
> > >
> > > --
> > >
> > > Jarek Potiuk
> > > Polidea <https://www.polidea.com/> | Principal Software Engineer
> > >
> > > M: +48 660 796 129 <+48660796129>
> > > [image: Polidea] <https://www.polidea.com/>
> >
>
>
> --
>
> Jarek Potiuk
> Polidea <https://www.polidea.com/> | Principal Software Engineer
>
> M: +48 660 796 129 <+48660796129>
> [image: Polidea] <https://www.polidea.com/>
>
