metron-dev mailing list archives

From Nick Allen <n...@nickallen.org>
Subject Re: [DISCUSS] Field conversions
Date Tue, 05 Jun 2018 14:34:00 GMT
I am in favor of removing the `FieldNameConverter` abstraction as an end
state.  However, I don't agree with Simon that we could have done that
directly, without the backwards-compatible solution provided in #1022.
Too many touch points rely on that conversion, and users expect fields to
land in their indices named a certain way (no matter what version of ES
they are running).  If I am wrong and there is a better approach that
works, then we should just revert #1022.
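
As a rough illustration of the two pieces discussed in this thread (the
per-index field name translation Casey describes below, and a utility to
convert existing data), here is a minimal Python sketch.  Every name in it
is hypothetical and not part of Metron.  It assumes flat documents and
that ':' in a field name was only ever a stand-in for '.'.

```python
# Hypothetical sketch only: these functions are not Metron APIs.

def translate_field_name(name: str, backend: str) -> str:
    """Translate a canonical dotted field name for the given backend.

    Mirrors the example in the thread: 'source.type' stays as-is for
    Solr and becomes 'source:type' for a colon-delimited ES index.
    The backend labels "solr" and "elasticsearch" are assumptions.
    """
    if backend == "elasticsearch":
        return name.replace(".", ":")
    return name


def convert_to_dotted(document: dict) -> dict:
    """Return a copy of a flat document with ':' in keys replaced by '.'.

    Raises ValueError if two keys would collapse into the same name,
    so a migration never silently overwrites a field.
    """
    converted = {}
    for key, value in document.items():
        new_key = key.replace(":", ".")
        if new_key in converted:
            raise ValueError(f"field name collision on {new_key!r}")
        converted[new_key] = value
    return converted
```

If the translation is dropped entirely, only something like
`convert_to_dotted` would be needed, applied once to existing data during
an upgrade.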

On Tue, Jun 5, 2018 at 9:37 AM, Simon Elliston Ball <simon@simonellistonball.com> wrote:

> I would definitely agree that the transformation should be removed. We have
> now however added a complex generic solution in the backend, which is going
> to be noop for most people. This was done I believe for the sake of
> backward compatibility. I would argue however, that there is no need to
> support ES 2.3, and therefore no need to support de-dotting
> transformations. This does seem somewhat over-engineered to me, though it
> does save people re-indexing on upgrades. I suspect in reality that this is
> a rare edge case, and that we would do far better to settle on one solution
> (the dotted version, not the colons, to my mind)
>
> Simon
>
> On 5 June 2018 at 06:29, Ryan Merriman <merrimanr@gmail.com> wrote:
>
> > I agree completely.  I will leave this thread open for a day or two to
> > give others a chance to weigh in.  If no one opposes, I will create Jiras
> > for removing field transformations and transforming existing data.
> >
> > On Tue, Jun 5, 2018 at 8:21 AM, Casey Stella <cestella@gmail.com> wrote:
> >
> > > Well, on write it is a transformation, on read it's a translation.
> > > This is to say that you're providing a mapping on read to translate
> > > field names given the index you're using.  The other approach that I
> > > was considering last night is a field transformation REST call which
> > > translates field names that the UI could call.  So, the UI would pass
> > > 'source.type' to the field translation service and in Solr it'd return
> > > source.type and in ES it'd return source:type.  Underneath the hood the
> > > service would use the same transformation as the writer uses.  That's
> > > another way to skin this cat.
> > >
> > > Ultimately, I think we should just ditch this field transformation
> > > business, as Laurens said, as long as we have a utility to transform
> > > existing data.
> > >
> > > On Tue, Jun 5, 2018 at 8:54 AM Ryan Merriman <merrimanr@gmail.com> wrote:
> > >
> > > > Having 2 different patterns for configuring field name transformations
> > > > on read vs write is confusing to me.  I agree with both of you that
> > > > normalizing on '.' and not having to do the translation at all would
> > > > be ideal.  Like you both suggested, we would need some utility or
> > > > script to convert preexisting data to match this format.  There could
> > > > also be some adjustments a user would need to make in the UI but I
> > > > feel like we could document around that.  Are there any objections to
> > > > doing it this way?
> > > >
> > > >
> > > >
> > > > > On Mon, Jun 4, 2018 at 4:30 PM, Laurens Vets <laurens@daemon.be> wrote:
> > > >
> > > > > > ES 2.x support officially ended 4 months ago
> > > > > > (https://www.elastic.co/support/eol), so why still support ':' at
> > > > > > all? :)
> > > > > > Additionally, 2.x isn't even supported at all on the last 2 Ubuntu
> > > > > > LTS releases (16.04 & 18.04).
> > > > > >
> > > > > > Therefore, move everything to use '.' and provide a
> > > > > > conversion/upgrade script to change ':' to '.'?
> > > > >
> > > > >
> > > > > On 2018-06-04 13:55, Ryan Merriman wrote:
> > > > >
> > > > > >> We've been dealing with a reoccurring challenge in Metron.  It is
> > > > > >> common for various fields to contain '.' characters for the purpose
> > > > > >> of making them more readable, namespacing, etc.  At one point we
> > > > > >> only supported Elasticsearch 2.3 which did not allow dots and forced
> > > > > >> us to use ':' instead.  This limitation does not exist in later
> > > > > >> versions of Elasticsearch or Solr.
> > > > >>
> > > > > >> Now we're in a situation where we need to allow a user to use either
> > > > > >> one because they may still be using ES 2.3 or have data with ':'
> > > > > >> characters in field names.  We've attempted to make this
> > > > > >> configurable in a couple different PRs:
> > > > >>
> > > > >> https://github.com/apache/metron/pull/1022
> > > > >> https://github.com/apache/metron/pull/1010
> > > > >> https://github.com/apache/metron/pull/1038
> > > > >>
> > > > > >> The approaches taken in these are not consistent and fall short in
> > > > > >> different ways.  The first (METRON-1569 Allow user to change field
> > > > > >> name conversion when indexing) only applies to indexing and not
> > > > > >> querying.  The others only apply to a single field which does not
> > > > > >> scale well.  Now we have an issue with another field in
> > > > > >> https://issues.apache.org/jira/browse/METRON-1600.  Rather than
> > > > > >> continuing with a patchwork of different fixes I want to attempt to
> > > > > >> design a system-wide solution.
> > > > >>
> > > > > >> My first thought is to expand
> > > > > >> https://github.com/apache/metron/pull/1022 to apply globally.
> > > > > >> However this is not trivial and would require significant changes.
> > > > > >> It would also make https://github.com/apache/metron/pull/1010
> > > > > >> obsolete and we might end up having to revert all of it.
> > > > >>
> > > > > >> Does anyone have any ideas or opinions?  I am still researching
> > > > > >> solutions but would love some guidance from the community.
> > > > >>
> > > > >
> > > >
> > >
> >
>
>
>
> --
> --
> simon elliston ball
> @sireb
>
