metron-dev mailing list archives

From Michael Miklavcic <michael.miklav...@gmail.com>
Subject Re: [DISCUSS] Field conversions
Date Wed, 06 Jun 2018 19:51:33 GMT
Yeah Otto, pre-0.5.0 (0.4.2) would be ES 2.3 if users were not using
master. ES upgrade is a big piece of this Apache release. Last release was
0.4.2 on Fri Dec 22 2017.

I'm +1 on the idea of an example referencing the ES docs and keeping this
as simple as possible.

* https://archive.apache.org/dist/metron/

On Tue, Jun 5, 2018 at 11:06 AM, Otto Fowler <ottobackwards@gmail.com>
wrote:

> Aren’t people who are on an old version of ES everyone pre 0.5.0?  Like all
> the metron users?
>
>
> On June 5, 2018 at 12:31:30, Simon Elliston Ball (
> simon@simonellistonball.com) wrote:
>
> Yes, anything using elastic would need the field names changed. That said,
> people who are on such an old version (eol) will need to bite the bullet
> on ES compatibility at some point.
>
> Simon
>
> > On 5 Jun 2018, at 17:17, Otto Fowler <ottobackwards@gmail.com> wrote:
> >
> > Are there consequences with Kibana as well? queries, visualizations,
> > templates they may have?
> >
> >
> > On June 5, 2018 at 12:03:44, Nick Allen (nick@nickallen.org) wrote:
> >
> > I just don't know if telling users to do a bulk upgrade of their indices is
> > a sufficient upgrade path. I would expect some to have downstream processes
> > dependent on those field names, which would also need to be updated.
> >
> > Although, we could tell users to do any field name conversions that they
> > depend on using parser transformations, rather than the
> > `FieldNameConverter` abstractions. I *think* that would be a valid upgrade
> > path where we could just revert #1022.
> >
> >> On Tue, Jun 5, 2018 at 10:34 AM, Nick Allen <nick@nickallen.org> wrote:
> >>
> >> I am in favor of removing the `FieldNameConverter` abstraction as an end
> >> state. Although, I don't agree with Simon that we could have just done
> >> that directly without providing a backwards compatible solution as was
> >> done in #1022. There are too many touch points that rely on that
> >> conversion and users who expect fields to land in their indices named a
> >> certain way (no matter what version of ES they are running). If I am wrong
> >> and there is a better approach that works, then we should just revert
> >> #1022.
> >>
> >> On Tue, Jun 5, 2018 at 9:37 AM, Simon Elliston Ball <
> >> simon@simonellistonball.com> wrote:
> >>
> >>> I would definitely agree that the transformation should be removed. We
> >>> have now, however, added a complex generic solution in the backend, which
> >>> is going to be a no-op for most people. This was done, I believe, for the
> >>> sake of backward compatibility. I would argue, however, that there is no
> >>> need to support ES 2.3, and therefore no need to support de-dotting
> >>> transformations. This does seem somewhat over-engineered to me, though it
> >>> does save people re-indexing on upgrades. I suspect in reality that this
> >>> is a rare edge case, and that we would do far better to settle on one
> >>> solution (the dotted version, not the colons, to my mind).
> >>>
> >>> Simon
> >>>
> >>>> On 5 June 2018 at 06:29, Ryan Merriman <merrimanr@gmail.com> wrote:
> >>>>
> >>>> I agree completely. I will leave this thread open for a day or two to
> >>>> give others a chance to weigh in. If no one opposes, I will create Jiras
> >>>> for removing field transformations and transforming existing data.
> >>>>
> >>>> On Tue, Jun 5, 2018 at 8:21 AM, Casey Stella <cestella@gmail.com> wrote:
> >>>>
> >>>>> Well, on write it is a transformation, on read it's a translation. This
> >>>>> is to say that you're providing a mapping on read to translate field
> >>>>> names given the index you're using. The other approach that I was
> >>>>> considering last night is a field transformation REST call which
> >>>>> translates field names that the UI could call. So, the UI would pass
> >>>>> 'source.type' to the field translation service and in Solr it'd return
> >>>>> source.type and in ES it'd return source:type. Underneath the hood the
> >>>>> service would use the same transformation as the writer uses. That's
> >>>>> another way to skin this cat.
> >>>>>
> >>>>> Ultimately, I think we should just ditch this field transformation
> >>>>> business, as Laurens said, as long as we have a utility to transform
> >>>>> existing data.
> >>>>>
> >>>>> On Tue, Jun 5, 2018 at 8:54 AM Ryan Merriman <merrimanr@gmail.com> wrote:
> >>>>>
> >>>>>> Having 2 different patterns for configuring field name transformations
> >>>>>> on read vs write is confusing to me. I agree with both of you that
> >>>>>> normalizing on '.' and not having to do the translation at all would be
> >>>>>> ideal. Like you both suggested, we would need some utility or script to
> >>>>>> convert preexisting data to match this format. There could also be some
> >>>>>> adjustments a user would need to make in the UI but I feel like we
> >>>>>> could document around that. Are there any objections to doing it this
> >>>>>> way?
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Mon, Jun 4, 2018 at 4:30 PM, Laurens Vets <laurens@daemon.be> wrote:
> >>>>>>
> >>>>>>> ES 2.x support officially ended 4 months ago
> >>>>>>> (https://www.elastic.co/support/eol), so why still support ':' at all?
> >>>>>>> :)
> >>>>>>> Additionally, 2.x isn't even supported at all on the last 2 Ubuntu LTS
> >>>>>>> releases (16.04 & 18.04).
> >>>>>>>
> >>>>>>> Therefore, move everything to use '.' and provide a conversion/upgrade
> >>>>>>> script to change ':' to '.'?
> >>>>>>>
> >>>>>>>
> >>>>>>>> On 2018-06-04 13:55, Ryan Merriman wrote:
> >>>>>>>>
> >>>>>>>> We've been dealing with a recurring challenge in Metron. It is common
> >>>>>>>> for various fields to contain '.' characters for the purpose of making
> >>>>>>>> them more readable, namespacing, etc. At one point we only supported
> >>>>>>>> Elasticsearch 2.3 which did not allow dots and forced us to use ':'
> >>>>>>>> instead. This limitation does not exist in later versions of
> >>>>>>>> Elasticsearch or Solr.
> >>>>>>>>
> >>>>>>>> Now we're in a situation where we need to allow a user to use either
> >>>>>>>> one because they may still be using ES 2.3 or have data with ':'
> >>>>>>>> characters in field names. We've attempted to make this configurable in
> >>>>>>>> a couple different PRs:
> >>>>>>>>
> >>>>>>>> https://github.com/apache/metron/pull/1022
> >>>>>>>> https://github.com/apache/metron/pull/1010
> >>>>>>>> https://github.com/apache/metron/pull/1038
> >>>>>>>>
> >>>>>>>> The approaches taken in these are not consistent and fall short in
> >>>>>>>> different ways. The first (METRON-1569 Allow user to change field name
> >>>>>>>> conversion when indexing) only applies to indexing and not querying.
> >>>>>>>> The others only apply to a single field which does not scale well. Now
> >>>>>>>> we have an issue with another field in
> >>>>>>>> https://issues.apache.org/jira/browse/METRON-1600. Rather than
> >>>>>>>> continuing with a patchwork of different fixes I want to attempt to
> >>>>>>>> design a system-wide solution.
> >>>>>>>>
> >>>>>>>> My first thought is to expand
> >>>>>>>> https://github.com/apache/metron/pull/1022 to apply globally. However
> >>>>>>>> this is not trivial and would require significant changes. It would
> >>>>>>>> also make https://github.com/apache/metron/pull/1010 obsolete and we
> >>>>>>>> might end up having to revert all of it.
> >>>>>>>>
> >>>>>>>> Does anyone have any ideas or opinions? I am still researching
> >>>>>>>> solutions but would love some guidance from the community.
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> --
> >>> simon elliston ball
> >>> @sireb
> >>>
> >>
> >>
>
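
For readers following the thread, the two ideas discussed above (a one-time
conversion of existing ':' field names to '.', and a read-side translation
that maps a dotted name to whatever the backing index uses) might be sketched
roughly as follows. This is only an illustration in Python, not Metron code:
the function names and the backend labels "es2" and "solr" are hypothetical,
and it is not the `FieldNameConverter` API from #1022.

```python
from typing import Any, Dict


def to_dotted(name: str) -> str:
    """Rewrite a colon-separated field name ('source:type') to dotted form."""
    return name.replace(":", ".")


def to_colons(name: str) -> str:
    """Rewrite a dotted field name ('source.type') to the ES 2.x colon form."""
    return name.replace(".", ":")


def translate_field(name: str, backend: str) -> str:
    """Hypothetical read-side translation of a dotted field name.

    Assumption: only the backend labelled "es2" needs colons; every other
    backend is assumed to accept the dotted name unchanged.
    """
    if backend == "es2":
        return to_colons(name)
    return name


def convert_document(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a flat document with ':' in field names replaced by '.'.

    Raises ValueError if two fields would collapse into the same name,
    so that no data is silently overwritten during an upgrade.
    """
    converted: Dict[str, Any] = {}
    for key, value in doc.items():
        new_key = to_dotted(key)
        if new_key in converted:
            raise ValueError(f"field name collision on {new_key!r}")
        converted[new_key] = value
    return converted
```

With these definitions, `translate_field("source.type", "es2")` gives
`source:type`, while `translate_field("source.type", "solr")` leaves the name
as `source.type`, matching the behaviour Casey describes. A real upgrade
utility would also have to reindex the stored data and update templates and
any Kibana objects, which this sketch does not attempt.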
