metron-dev mailing list archives

From Matt Foley <mfo...@hortonworks.com>
Subject Re: [DISCUSS] Upgrading Elasticsearch from 2.x to 5.x
Date Thu, 12 Oct 2017 16:50:26 GMT
Github also allows file attachments, via comments on PRs.  Not necessarily intuitive for a
“Discussion”.  There would be more sensible places to put files for discussion in github,
such as “wiki” or “issues”, but those aren’t enabled on apache projects in github.

On 10/11/17, 10:39 PM, "Michael Miklavcic" <michael.miklavcic@gmail.com> wrote:

    We've generally preferred communication workflows via Github and the
    mailing list rather than Jira for most things on this project, but you're
    right that we could probably leverage it for sharing attachments to the dev
    list.
    
    On Wed, Oct 11, 2017 at 9:54 PM, Matt Foley <mfoley@hortonworks.com> wrote:
    
    > You can avoid the permission issues by attaching it to an Apache jira.
    >
    > On 10/11/17, 6:10 PM, "James Sirota" <jsirota@apache.org> wrote:
    >
    >     I can't see it.  You probably want to link to a google drive
    >
    >     11.10.2017, 18:01, "Michael Miklavcic" <michael.miklavcic@gmail.com>:
    >     > I attached a PDF - shows up on my end. Is that not coming through?
    >     >
    >     > On Wed, Oct 11, 2017 at 6:42 PM, Otto Fowler <ottobackwards@gmail.com> wrote:
    >     >
    >     >>  I think there is a missing attachment?
    >     >>
    >     >>  On October 11, 2017 at 20:22:33, Michael Miklavcic (michael.miklavcic@gmail.com) wrote:
    >     >>
    >     >>  For community reference, here is a class diagram that depicts our current
    >     >>  Metron 0.4.1 dependencies, for both prod and test code, against the old ES
    >     >>  client APIs along with an "after" diagram showing the world with the new
    >     >>  client. Feedback welcome.
    >     >>
    >     >>  On Fri, Oct 6, 2017 at 8:13 AM, Casey Stella <cestella@gmail.com> wrote:
    >     >>
    >     >>>  Yeah, I agree with what Michael "fine whine" Miklavcic said; I'm in favor
    >     >>>  of the high level client.
    >     >>>
    >     >>>  On Thu, Oct 5, 2017 at 3:35 PM, Michael Miklavcic <michael.miklavcic@gmail.com> wrote:
    >     >>>
    >     >>>  > Justin, thanks for the feedback! I'm inclined to agree with you about using
    >     >>>  > the high level client. It's a bummer that we still need to do jar shading,
    >     >>>  > but I think that's a reasonable short term sacrifice considering the other
    >     >>>  > benefits. And they're angling towards slowly removing the ES core dep over
    >     >>>  > time anyhow so, like myself, this will get better with age.
    >     >>>  >
    >     >>>  > On Thu, Oct 5, 2017 at 12:40 PM, Justin Leet <justinjleet@gmail.com> wrote:
    >     >>>  >
    >     >>>  > > Do we intend on (or have interest in) supporting ES across major versions
    >     >>>  > > for a given version of Metron? I'm not convinced it's worth the work of
    >     >>>  > > using the low level client.
    >     >>>  > >
    >     >>>  > > This really only seems useful for ES clusters that are being used outside
    >     >>>  > > Metron and need to be on a different ES major version. Is that a use case
    >     >>>  > > we want/need to support? I'm willing to bet it's significantly more work
    >     >>>  > > and means we're modifying queries and even templates/mappings based on what
    >     >>>  > > ES version we're interacting with (e.g. meta alerts in 5.x can exploit a
    >     >>>  > > query param to not screw around with the mapping, but that param doesn't
    >     >>>  > > exist in 2.x). At that point, we're either back to writing for ES 2.x or
    >     >>>  > > writing for every version of ES.
    >     >>>  > >
    >     >>>  > > Unless that's something we have a demand for (or someone else persuades me
    >     >>>  > > otherwise), I'm in favor of using the high level client. It seems like
    >     >>>  > > it'd be easier to migrate to also, given the similarities API-wise to the
    >     >>>  > > current client we're using.
    >     >>>  > >
    >     >>>  > > On Thu, Oct 5, 2017 at 1:52 PM, Michael Miklavcic <michael.miklavcic@gmail.com> wrote:
    >     >>>  > >
    >     >>>  > > > I think it might help the discussion to share my impressions of looking
    >     >>>  > > > over the new API recommendations from ES. I've summarized some info
    >     >>>  > > > provided by ES back in December 2016 regarding the reasons for switching to
    >     >>>  > > > a new client model. [1]
    >     >>>  > > >
    >     >>>  > > > *Summary points:*
    >     >>>  > > >
    >     >>>  > > > Pre-5.x had Java API - binary exchange format used for node-to-node
    >     >>>  > > > communications.
    >     >>>  > > > In 5.x a low level REST API was added. Now there's also a high level REST
    >     >>>  > > > client that handles request marshalling and response un-marshalling.
    >     >>>  > > >
    >     >>>  > > > *Benefits of existing Java API*
    >     >>>  > > >
    >     >>>  > > > 1. Theoretically faster - binary format, no JSON parsing
    >     >>>  > > > 2. Hardened, used for internal ES node to node communications
    >     >>>  > > >
    >     >>>  > > > *Cons of Java API*
    >     >>>  > > >
    >     >>>  > > > 1. Benchmarks show it's not really that much faster.
    >     >>>  > > > 2. Backwards compatibility - Java API changes often.
    >     >>>  > > > 3. Upgrades more challenging - need to refactor client code for new and
    >     >>>  > > >    deprecated features.
    >     >>>  > > > 4. Minor releases may contain breaking changes in the Java API
    >     >>>  > > > 5. Client and server *should* be on same JVM version (not as important
    >     >>>  > > >    post 2.x, but still potentially necessary bc of serialization w/binary
    >     >>>  > > >    format)
    >     >>>  > > > 6. Requires dependency on the entire elasticsearch server in order to
    >     >>>  > > >    use the client. We end up shading jars.
    >     >>>  > > >
    >     >>>  > > > *Benefits of new REST API*
    >     >>>  > > >
    >     >>>  > > > 1. Upgrades
    >     >>>  > > >    1. Breaking changes only made in major releases - "We are very
    >     >>>  > > >       careful with backwards compatibility on the REST layer where
    >     >>>  > > >       breaking changes are made only in major releases."
    >     >>>  > > >    2. "The REST interface is much more stable and can be upgraded out of
    >     >>>  > > >       step with the Elasticsearch cluster."
    >     >>>  > > > 2. REST client and server can be on different JVMs
    >     >>>  > > > 3. Dependencies for the low level client are very slim. No need for
    >     >>>  > > >    shading.
    >     >>>  > > > 4. The RestHighLevelClient supports the same request and response
    >     >>>  > > >    objects as the TransportClient
    >     >>>  > > > 5. Can be secured via HTTPS
    >     >>>  > > >
    >     >>>  > > > There are some additional benefits to the new API, however they depend on
    >     >>>  > > > whether we choose to go with the high or low level client. More comments
    >     >>>  > > > below.
    >     >>>  > > >
    >     >>>  > > > *Cons of new API*
    >     >>>  > > >
    >     >>>  > > > 1. Dependencies - The high level client still requires the full ES
    >     >>>  > > >    dependency, though this will slim down in future releases.
    >     >>>  > > >
    >     >>>  > > > *Other comments specific to Metron*
    >     >>>  > > >
    >     >>>  > > > There's a question of whether we should use the low or high level REST
    >     >>>  > > > client. The main differences between the two are how they handle lib
    >     >>>  > > > dependencies and marshaling/unmarshaling. The low level client cleans up
    >     >>>  > > > the dependencies dramatically, whereas the high level client still requires
    >     >>>  > > > you to depend on elasticsearch core. On the other hand, the low level
    >     >>>  > > > client does no work to handle marshaling/unmarshaling the
    >     >>>  > > > requests/responses from the HTTP calls while the high level client handles
    >     >>>  > > > this for you and exposes api-specific methods. The high level client
    >     >>>  > > > accepts the same request arguments as the TransportClient and returns the
    >     >>>  > > > same response objects. One more thing to note is that the low level client
    >     >>>  > > > claims to be compatible with all versions of ES whereas the high level
    >     >>>  > > > client appears to be only major version compatible.
    >     >>>  > > >
    >     >>>  > > > "The 5.6 client can communicate with any 5.6.x Elasticsearch node. Previous
    >     >>>  > > > 5.x minor versions like 5.5.x, 5.4.x etc. are not (fully) supported." [2]
    >     >>>  > > >
    >     >>>  > > > Just as an example, here's a simple comparison of an index request in the
    >     >>>  > > > low and high level APIs.
    >     >>>  > > >
    >     >>>  > > > *Low Level*
    >     >>>  > > >
    >     >>>  > > > Map<String, String> params = Collections.emptyMap();
    >     >>>  > > > String jsonString = "{" +
    >     >>>  > > >         "\"user\":\"kimchy\"," +
    >     >>>  > > >         "\"postDate\":\"2013-01-30\"," +
    >     >>>  > > >         "\"message\":\"trying out Elasticsearch\"" +
    >     >>>  > > >         "}";
    >     >>>  > > > HttpEntity entity = new NStringEntity(jsonString, ContentType.APPLICATION_JSON);
    >     >>>  > > > Response response = restClient.performRequest("PUT", "/posts/doc/1", params, entity);
    >     >>>  > > >
    >     >>>  > > > *High Level*
    >     >>>  > > >
    >     >>>  > > > IndexRequest indexRequest = new IndexRequest("posts", "doc", "1")
    >     >>>  > > >         .source("user", "kimchy",
    >     >>>  > > >                 "postDate", new Date(),
    >     >>>  > > >                 "message", "trying out Elasticsearch");
    >     >>>  > > >
    >     >>>  > > > *Note*: there are a few ways to do this with the high level API, but this
    >     >>>  > > > was the most concise for me to offer a comparison of benefits over the low
    >     >>>  > > > level API.
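
To make the marshalling point concrete, here is a minimal sketch, not taken from the Elastic docs or from Metron, of the kind of code we would own if we built request bodies by hand for the low level client. It uses only the JDK. JsonBodySketch, toJson, escape and exampleBody are hypothetical names, and the sketch assumes flat documents whose values are strings, numbers, booleans or null. Nested objects, arrays and dates would need more work or a real JSON library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper; not part of Metron or of any Elasticsearch client. */
final class JsonBodySketch {

    private JsonBodySketch() {}

    /**
     * Serializes a flat map to a JSON object string.
     * Assumption: values are String, Number, Boolean or null, and numbers are finite.
     */
    static String toJson(Map<String, ?> fields) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, ?> e : fields.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            first = false;
            sb.append('"').append(escape(e.getKey())).append("\":");
            Object v = e.getValue();
            if (v == null) {
                sb.append("null");
            } else if (v instanceof Number || v instanceof Boolean) {
                sb.append(v);
            } else {
                sb.append('"').append(escape(v.toString())).append('"');
            }
        }
        return sb.append('}').toString();
    }

    /** Escapes quotes, backslashes and control characters inside a JSON string. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the four-digit hex form.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    /** Builds the same document body as the hand-concatenated low level example. */
    static String exampleBody() {
        Map<String, Object> doc = new LinkedHashMap<>(); // keeps insertion order
        doc.put("user", "kimchy");
        doc.put("postDate", "2013-01-30");
        doc.put("message", "trying out Elasticsearch");
        return toJson(doc);
    }
}
```

JsonBodySketch.exampleBody() produces the same string as jsonString in the low level example, so it could be wrapped in an NStringEntity in the same way. The high level client makes this whole class unnecessary, which is the trade-off being weighed in this thread.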
    >     >>>  > > >
    >     >>>  > > > *Thoughts/Recommendations*: I do think we should migrate to the new API. I
    >     >>>  > > > think the question is which of the new APIs we should use. The high level
    >     >>>  > > > client seems to shield us from having to deal with constructing special
    >     >>>  > > > JSON handling code, whereas the low level client handles all versions of
    >     >>>  > > > ES. I don't have a good feel (yet) for just how much work it would require
    >     >>>  > > > to use the low level API, or how difficult it would be to add new request
    >     >>>  > > > features in the future. Actually, we could probably leverage existing code
    >     >>>  > > > we have for dealing with JSON maps, so this might be really easy. Someone
    >     >>>  > > > with more experience in Metron's ES client use might have a better idea of
    >     >>>  > > > the pros and cons to this. The high level client appears to handle all
    >     >>>  > > > JSON manipulation for us, but we lose the benefit of a simpler dependency
    >     >>>  > > > tree and support for all versions of ES. My only concern with "supports
    >     >>>  > > > all versions" is that I have to imagine there are specific calls that we'd
    >     >>>  > > > have to be careful of when constructing the JSON requests, so it's unclear
    >     >>>  > > > to me if this is better or worse in the end.
    >     >>>  > > >
    >     >>>  > > > Best,
    >     >>>  > > > Mike
    >     >>>  > > >
    >     >>>  > > >
    >     >>>  > > > 1. https://www.elastic.co/blog/state-of-the-official-elasticsearch-java-clients
    >     >>>  > > > 2. https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-high-compatibility.html
    >     >>>  > > >
    >     >>>  > > >
    >     >>>  > > >
    >     >>>  > > >
    >     >>>  > > > On Wed, Sep 27, 2017 at 8:03 PM, Michael Miklavcic <michael.miklavcic@gmail.com> wrote:
    >     >>>  > > >
    >     >>>  > > > > I am working on upgrading Elasticsearch and Kibana. There are quite a few
    >     >>>  > > > > changes involved with this fix. I believe I'm mostly finished with the
    >     >>>  > > > > Ambari mpack side of things, however we currently only support one version
    >     >>>  > > > > with no backwards compatibility. What are the community's thoughts on
    >     >>>  > > > > this?
    >     >>>  > > > >
    >     >>>  > > > > Here is some work contributed to the community that I'm referencing while
    >     >>>  > > > > working on this upgrade - https://github.com/apache/metron/pull/619/files
    >     >>>  > > > >
    >     >>>  > > > > Best,
    >     >>>  > > > > Michael Miklavcic
    >     >>>  > > > >
    >     >>>  > > >
    >     >>>  > >
    >     >>>  >
    >
    >     -------------------
    >     Thank you,
    >
    >     James Sirota
    >     PMC- Apache Metron
    >     jsirota AT apache DOT org
    >
    >
    >
    >
    
