bloodhound-dev mailing list archives

From Gary <gary.mar...@physics.org>
Subject Re: [Proposal] Core Bloodhound - basic concepts
Date Mon, 12 Mar 2018 12:56:57 GMT
On Mon, 12 Mar 2018, at 2:00 AM, Greg Stein wrote:
> On Sun, Mar 11, 2018 at 4:21 PM, Gary <gary.martin@physics.org> wrote:
> >...
> 
> > How tickets are stored as a whole is also worth tackling, which is what #2
> > is trying to broadly decide. Here I am suggesting that the state of a
> > ticket can be built up from a query that collects all the change events and
> > applies the deltas in order. This could prove to be a slow process so at
> > some point we may want to look at keeping a record of the current state or
> > a checkpoint but, given that ticket views are expected to show the history,
> > these are details that are required anyway. I suggest that we can look at
> > optimising for speed later. Knowing that we can update a ticket from deltas
> > may be useful.
> >
> 
> The Apache Subversion team has a LOT of experience in this kind of storage
> :-)
> 
> When we started, we stored the "latest" in full text, and then had a series
> of "reverse deltas" to go backwards in time. We switched that around, and
> store the original in full text, and then apply "forward" deltas to reach
> "latest".
> 
> We have two mechanisms to speed this up. The first is a sophisticated cache
> system: invariably, "latest" will be cached and fast to retrieve. The more
> important part of our "series of deltas" system, the part that lets us
> rapidly construct any point in history, is "skip deltas". This is analogous
> to the "skip list"[1] concept: we assemble N deltas into a single
> delta, allowing us to skip over many deltas with a single application.
> Generally speaking, if there are M deltas in a file's entire history, then
> we can reassemble any point in history by applying log(M) deltas.
> 
> (note we used the skip delta mechanism for both directions; it is effective
> in both directions)

Well, it is good to confirm that this aspect of the proposal is on relatively solid footing.
Given that retention of ticket history has to be there, I thought it would be better to go
with a solution that avoided modifying existing records. Beyond that, you have translated
what I was saying pretty well! I see the cache of the latest state of a ticket as something
that does not need to be kept constantly up to date, given that we can tell when it is
behind. Anyway, these are implementation details that do not need to be set
in stone yet.
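
A minimal sketch of the idea, assuming each change is stored as an immutable event that carries a per-ticket sequence number and a delta of field values. The names ChangeEvent, Checkpoint, apply_delta and current_state are hypothetical and not part of any existing Bloodhound code. A cached checkpoint is allowed to lag behind: it records the last sequence number it covers, so catching up only replays the newer events.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ChangeEvent:
    # Hypothetical record: one immutable row per change to a ticket.
    seq: int      # position of the event in the ticket's history
    delta: dict   # field name -> new value (None means "field removed")


@dataclass
class Checkpoint:
    # Cached state plus the last event sequence number it includes.
    seq: int = 0
    state: dict = field(default_factory=dict)


def apply_delta(state, delta):
    """Return a new state dict with one delta applied; inputs are not mutated."""
    new_state = dict(state)
    for key, value in delta.items():
        if value is None:
            new_state.pop(key, None)
        else:
            new_state[key] = value
    return new_state


def current_state(events, checkpoint=None):
    """Replay, in order, every event newer than the checkpoint."""
    checkpoint = checkpoint or Checkpoint()
    state, seq = checkpoint.state, checkpoint.seq
    for event in sorted(events, key=lambda e: e.seq):
        if event.seq > seq:
            state = apply_delta(state, event.delta)
            seq = event.seq
    return Checkpoint(seq, state)


events = [
    ChangeEvent(1, {"summary": "Crash on save", "status": "open"}),
    ChangeEvent(2, {"status": "in progress"}),
    ChangeEvent(3, {"owner": "gary"}),
]
stale = current_state(events[:2])        # cache built before event 3 arrived
fresh = current_state(events, stale)     # only event 3 is replayed
```

Since the events themselves are never modified, the checkpoint is purely an optimisation and can be discarded and rebuilt at any time.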

Regardless, I will of course welcome contributions or suggestions from those with experience
of this!
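
For reference, one way to choose skip-delta bases is to delta revision n against the revision obtained by clearing the lowest set bit of n. This is offered as an illustration of the log(M) behaviour Greg describes, under the assumption that revisions are numbered from 0 and that revision 0 is the full text; it is not a claim about exactly how any particular Subversion backend stores its data. The function names are hypothetical.

```python
def skip_delta_base(n):
    """Base revision for revision n: n with its lowest set bit cleared."""
    return n & (n - 1)


def delta_chain(n):
    """Revisions whose deltas are applied, oldest first, to rebuild n from 0."""
    chain = []
    while n > 0:
        chain.append(n)
        n = skip_delta_base(n)
    return chain[::-1]


# Rebuilding revision 54 (binary 110110) needs one delta per set bit.
print(delta_chain(54))
```

With this choice the chain length equals the number of set bits in n, so it never exceeds the bit length of n, which is where the logarithmic bound comes from.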

> > Finally, I am suggesting in #3 that as much as possible we generalise
> > ticket categorisation. Categorisation is a central concept to capture and,
> > in Trac, we inherited categorisations such as statuses (open, in progress,
> > etc), types (bug, enhancement, task, etc), milestones, versions, etc, and
> > these had separate implementations.
> 
> 
> When my team designed the issue tracker for Google Code's project hosting,
> we did the same thing. Most issue trackers are highly-structured with a
> field for this and that. It complicates issue creation, issue management,
> querying, etc. I recall sitting down with 20 fields from a typical Bugzilla
> install, and reducing it to about 8, where we used labels for "everything".
> We did not bother with "issue types".
> 
> A second thing that we did is allow labels such as "milestone-14", and then
> enable our "list display" to be configured for a "milestone" column that
> would extract any labels that started with "milestone-", showing just the
> "14".
> 
> It made for a very robust, easy to understand, and flexible metadata system
> for each issue.
> 
> You can still see some of these design choices a decade later in Monorail,
> a descendant of our tracker, now being used by the Chromium project.

I like this :)

The main objectives around labelling (classification, characterisation, whatever) as I described
them were to have a flexible model where labels can be shared across trackers or be localised
to a specific tracker, and where admins can set up label types that they feel are appropriate.

We could consider peeling this way back to basics so that you just label things similarly
to how Greg describes. This can simplify core logic at the expense of being able to formalise
behavioural differences between kinds of labels (like whether a ticket can support more than
one of the same type of label at a time.)

We could look at allowing these constraints to be added on outside of core functionality or
look at adding them into the core at a later point.

Given lightweight labels, whether a label is shared across trackers or is only available in
a specific tracker may be less important.

Certainly food for thought. I would be up for starting from the simplified model and seeing
how far that gets us in meeting the core ticketing needs.
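
A minimal sketch of the simplified model, assuming labels are plain strings and that a "column" is derived from a prefix such as "milestone-", as in Greg's description. The function names are hypothetical, and the single-value check is shown as an optional add-on rather than core behaviour.

```python
def column_values(labels, prefix):
    """Return the suffix of every label that starts with '<prefix>-'."""
    marker = prefix + "-"
    return [label[len(marker):] for label in labels if label.startswith(marker)]


def violates_single_value(labels, prefix):
    """Optional constraint: at most one label of a given kind per ticket."""
    return len(column_values(labels, prefix)) > 1


labels = ["milestone-14", "priority-high", "ui"]
print(column_values(labels, "milestone"))   # the "milestone" column shows 14
```

Keeping the constraint in a separate function mirrors the option of layering such rules outside the core: the core only stores and matches strings.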

> 
> >...
> 
> Cheers,
> -g
> 
> [1] https://en.wikipedia.org/wiki/Skip_list

Thanks for the input!

Cheers,
    Gary
