metron-dev mailing list archives

From Simon Elliston Ball <si...@simonellistonball.com>
Subject Re: [DISCUSS] Field conversions
Date Tue, 05 Jun 2018 16:31:23 GMT
Yes, anything using Elastic would need the field names changed. That said, people who are on
such an old (EOL) version will need to bite the bullet on ES compatibility at some point.

Simon 

> On 5 Jun 2018, at 17:17, Otto Fowler <ottobackwards@gmail.com> wrote:
> 
> Are there consequences for Kibana as well? Queries, visualizations,
> templates they may have?
> 
> 
> On June 5, 2018 at 12:03:44, Nick Allen (nick@nickallen.org) wrote:
> 
> I just don't know if telling users to do a bulk upgrade of their indices is
> a sufficient upgrade path. I would expect some to have
> downstream processes dependent on those field names, which would also need
> to be updated.
> 
> Although, we could tell users to do any field name conversions that they
> depend on using parser transformations rather than the
> `FieldNameConverter` abstractions. I *think* that would be a valid upgrade
> path where we could just revert #1022.
> 
>> On Tue, Jun 5, 2018 at 10:34 AM, Nick Allen <nick@nickallen.org> wrote:
>> 
>> I am in favor of removing the `FieldNameConverter` abstraction as an end
>> state. Although, I don't agree with Simon that we could have just done
>> that directly without providing a backwards compatible solution as was done
>> in #1022. There are too many touch points that rely on that conversion and
>> users who expect fields to land in their indices named a certain way (no
>> matter what version of ES they are running). If I am wrong and there is a
>> better approach that works, then we should just revert #1022.
>> 
>> On Tue, Jun 5, 2018 at 9:37 AM, Simon Elliston Ball <simon@simonellistonball.com> wrote:
>> 
>>> I would definitely agree that the transformation should be removed. We have
>>> now however added a complex generic solution in the backend, which is going
>>> to be noop for most people. This was done I believe for the sake of
>>> backward compatibility. I would argue however, that there is no need to
>>> support ES 2.3, and therefore no need to support de-dotting
>>> transformations. This does seem somewhat over-engineered to me, though it
>>> does save people re-indexing on upgrades. I suspect in reality that this is
>>> a rare edge case, and that we would do far better to settle on one solution
>>> (the dotted version, not the colons, to my mind)
>>> 
>>> Simon
>>> 
>>>> On 5 June 2018 at 06:29, Ryan Merriman <merrimanr@gmail.com> wrote:
>>>> 
>>>> I agree completely. I will leave this thread open for a day or two to give
>>>> others a chance to weigh in. If no one opposes, I will create Jiras for
>>>> removing field transformations and transforming existing data.
>>>> 
>>>> On Tue, Jun 5, 2018 at 8:21 AM, Casey Stella <cestella@gmail.com> wrote:
>>>> 
>>>>> Well, on write it is a transformation, on read it's a translation. This is
>>>>> to say that you're providing a mapping on read to translate field names
>>>>> given the index you're using. The other approach that I was considering
>>>>> last night is a field transformation REST call which translates field names
>>>>> that the UI could call. So, the UI would pass 'source.type' to the field
>>>>> translation service and in Solr it'd return source.type and in ES it'd
>>>>> return source:type. Underneath the hood the service would use the same
>>>>> transformation as the writer uses. That's another way to skin this cat.
>>>>> 
>>>>> Ultimately, I think we should just ditch this field transformation
>>>>> business, as Laurens said, as long as we have a utility to transform
>>>>> existing data.
>>>>> 
>>>>> On Tue, Jun 5, 2018 at 8:54 AM Ryan Merriman <merrimanr@gmail.com> wrote:
>>>>> 
>>>>>> Having 2 different patterns for configuring field name transformations on
>>>>>> read vs write is confusing to me. I agree with both of you that
>>>>>> normalizing on '.' and not having to do the translation at all would be
>>>>>> ideal. Like you both suggested, we would need some utility or script to
>>>>>> convert preexisting data to match this format. There could also be some
>>>>>> adjustments a user would need to make in the UI but I feel like we could
>>>>>> document around that. Are there any objections to doing it this way?
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Mon, Jun 4, 2018 at 4:30 PM, Laurens Vets <laurens@daemon.be> wrote:
>>>>>> 
>>>>>>> ES 2.x support officially ended 4 months ago (
>>>>>>> https://www.elastic.co/support/eol), so why still support ':' at all? :)
>>>>>>> Additionally, 2.x isn't even supported at all on the last 2 Ubuntu LTS
>>>>>>> releases (16.04 & 18.04).
>>>>>>> 
>>>>>>> Therefore, move everything to use '.' and provide a conversion/upgrade
>>>>>>> script to change ':' to '.'?
>>>>>>> 
>>>>>>> 
>>>>>>>> On 2018-06-04 13:55, Ryan Merriman wrote:
>>>>>>>> 
>>>>>>>> We've been dealing with a reoccurring challenge in Metron. It is common
>>>>>>>> for various fields to contain '.' characters for the purpose of making them
>>>>>>>> more readable, namespacing, etc. At one point we only supported
>>>>>>>> Elasticsearch 2.3 which did not allow dots and forced us to use ':'
>>>>>>>> instead. This limitation does not exist in later versions of Elasticsearch
>>>>>>>> or Solr.
>>>>>>>> 
>>>>>>>> Now we're in a situation where we need to allow a user to use either one
>>>>>>>> because they may still be using ES 2.3 or have data with ':' characters in
>>>>>>>> field names. We've attempted to make this configurable in a couple
>>>>>>>> different PRs:
>>>>>>>> 
>>>>>>>> https://github.com/apache/metron/pull/1022
>>>>>>>> https://github.com/apache/metron/pull/1010
>>>>>>>> https://github.com/apache/metron/pull/1038
>>>>>>>> 
>>>>>>>> The approaches taken in these are not consistent and fall short in
>>>>>>>> different ways. The first (METRON-1569 Allow user to change field name
>>>>>>>> conversion when indexing) only applies to indexing and not querying. The
>>>>>>>> others only apply to a single field which does not scale well. Now we have
>>>>>>>> an issue with another field in
>>>>>>>> https://issues.apache.org/jira/browse/METRON-1600. Rather than continuing
>>>>>>>> with a patchwork of different fixes I want to attempt to design a
>>>>>>>> system-wide solution.
>>>>>>>> 
>>>>>>>> My first thought is to expand https://github.com/apache/metron/pull/1022 to
>>>>>>>> apply globally. However this is not trivial and would require significant
>>>>>>>> changes. It would also make https://github.com/apache/metron/pull/1010
>>>>>>>> obsolete and we might end up having to revert all of it.
>>>>>>>> 
>>>>>>>> Does anyone have any ideas or opinions? I am still researching solutions
>>>>>>>> but would love some guidance from the community.
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>>> 
>>> 
>>> --
>>> --
>>> simon elliston ball
>>> @sireb
>>> 
>> 
>> 
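
For illustration only, here is a minimal sketch of the two ideas discussed in the thread: a one-off utility that rewrites ':' in existing field names to '.', and a read-side translation in the spirit of Casey's example ('source.type' stays dotted for Solr and becomes 'source:type' for an ES index that still uses colons). The function names and the backend labels "solr" and "es" are hypothetical; none of this is taken from Metron or from the PRs linked above, and it only handles top-level field names in an in-memory document.

```python
from typing import Any, Dict


def colons_to_dots(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of doc with ':' in top-level field names replaced by '.'.

    Hypothetical helper: raises if two fields would end up with the same name.
    """
    converted: Dict[str, Any] = {}
    for name, value in doc.items():
        new_name = name.replace(":", ".")
        if new_name in converted:
            raise ValueError(f"field name collision after conversion: {new_name!r}")
        converted[new_name] = value
    return converted


def translate_field(name: str, backend: str) -> str:
    """Translate a dotted field name into the form a given backend expects.

    Assumption: "es" stands for an index that still stores colon-separated
    names, and "solr" stores dotted names unchanged.
    """
    if backend == "es":
        return name.replace(".", ":")
    if backend == "solr":
        return name
    raise ValueError(f"unknown backend: {backend!r}")


if __name__ == "__main__":
    old_doc = {"source:type": "bro", "ip_src_addr": "10.0.0.1"}
    print(colons_to_dots(old_doc))
    print(translate_field("source.type", "es"))
    print(translate_field("source.type", "solr"))
```

If everything is normalized on '.', only the first function would be needed, and only once per existing index; the second exists solely to show what the read-side translation would have to do if both conventions stayed supported.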
