metron-dev mailing list archives

From James Sirota <jsir...@apache.org>
Subject Re: [DISCUSS] Upgrading Elasticsearch from 2.x to 5.x
Date Thu, 12 Oct 2017 01:10:19 GMT
I can't see it. You probably want to link to a Google Drive file instead.

11.10.2017, 18:01, "Michael Miklavcic" <michael.miklavcic@gmail.com>:
> I attached a PDF - shows up on my end. Is that not coming through?
>
> On Wed, Oct 11, 2017 at 6:42 PM, Otto Fowler <ottobackwards@gmail.com>
> wrote:
>
>>  I think there is a missing attachment?
>>
>>  On October 11, 2017 at 20:22:33, Michael Miklavcic (
>>  michael.miklavcic@gmail.com) wrote:
>>
>>  For community reference, here is a class diagram that depicts our current
>>  Metron 0.4.1 dependencies, for both prod and test code, against the old ES
>>  client APIs along with an "after" diagram showing the world with the new
>>  client. Feedback welcome.
>>
>>  On Fri, Oct 6, 2017 at 8:13 AM, Casey Stella <cestella@gmail.com> wrote:
>>
>>>  Yeah, I agree with what Michael "fine whine" Miklavcic said; I'm in favor
>>>  of the high level client.
>>>
>>>  On Thu, Oct 5, 2017 at 3:35 PM, Michael Miklavcic <
>>>  michael.miklavcic@gmail.com> wrote:
>>>
>>>  > Justin, thanks for the feedback! I'm inclined to agree with you about
>>>  > using the high level client. It's a bummer that we still need to do jar
>>>  > shading, but I think that's a reasonable short term sacrifice considering
>>>  > the other benefits. And they're angling towards slowly removing the ES
>>>  > core dep over time anyhow so, like myself, this will get better with age.
>>>  >
>>>  > On Thu, Oct 5, 2017 at 12:40 PM, Justin Leet <justinjleet@gmail.com>
>>>  > wrote:
>>>  >
>>>  > > Do we intend on (or have interest in) supporting ES across major
>>>  > > versions for a given version of Metron? I'm not convinced it's worth
>>>  > > the work of using the low level client.
>>>  > >
>>>  > > This really only seems useful for ES clusters that are being used
>>>  > > outside Metron and need to be on a different ES major version. Is that
>>>  > > a use case we want/need to support? I'm willing to bet it's
>>>  > > significantly more work and means we're modifying queries and even
>>>  > > templates/mappings based on what ES version we're interacting with
>>>  > > (e.g. meta alerts in 5.x can exploit a query param to not screw around
>>>  > > with the mapping, but that param doesn't exist in 2.x). At that point,
>>>  > > we're either back to writing for ES 2.x or writing for every version
>>>  > > of ES.
>>>  > >
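The per-version branching described above could look roughly like the sketch below. Everything in it is hypothetical: the class name, the method, and the `example_5x_only_param` placeholder are made up for illustration and do not come from Metron or Elasticsearch. It only shows why talking to several ES major versions tends to mean one code path per version.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: none of these names come from Metron or Elasticsearch.
final class VersionedQueryOptions {

    // Builds a map of query options, adding an option only when the
    // target cluster's major version is assumed to understand it.
    static Map<String, String> forMajorVersion(int esMajorVersion) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("size", "10");
        if (esMajorVersion >= 5) {
            // Placeholder for an option that exists in 5.x but not in 2.x.
            options.put("example_5x_only_param", "true");
        }
        return options;
    }

    public static void main(String[] args) {
        System.out.println("2.x options: " + forMajorVersion(2));
        System.out.println("5.x options: " + forMajorVersion(5));
    }
}
```

With only one supported major version, the `if` disappears and the caller never needs to know which version it is talking to.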
>>>  > > Unless that's something we have a demand for (or someone else
>>>  > > persuades me otherwise), I'm in favor of using the high level client.
>>>  > > It seems like it'd be easier to migrate to also, given the similarities
>>>  > > API-wise to the current client we're using.
>>>  > >
>>>  > > On Thu, Oct 5, 2017 at 1:52 PM, Michael Miklavcic <
>>>  > > michael.miklavcic@gmail.com> wrote:
>>>  > >
>>>  > > > I think it might help the discussion to share my impressions of
>>>  > > > looking over the new API recommendations from ES. I've summarized
>>>  > > > some info provided by ES back in December 2016 regarding the reasons
>>>  > > > for switching to a new client model. [1]
>>>  > > >
>>>  > > > *Summary points:*
>>>  > > >
>>>  > > > Pre-5.x there was only the Java API, which uses the binary exchange
>>>  > > > format used for node-to-node communications.
>>>  > > > In 5.x a low level REST client was added. Now there's also a high
>>>  > > > level REST client that handles request marshalling and response
>>>  > > > un-marshalling.
>>>  > > >
>>>  > > > *Benefits of existing Java API*
>>>  > > >
>>>  > > > 1. Theoretically faster - binary format, no JSON parsing
>>>  > > > 2. Hardened, used for internal ES node to node communications
>>>  > > >
>>>  > > > *Cons of Java API*
>>>  > > >
>>>  > > > 1. Benchmarks show it's not really that much faster.
>>>  > > > 2. Backwards compatibility - Java API changes often.
>>>  > > > 3. Upgrades more challenging - need to refactor client code for new
>>>  > > >    and deprecated features.
>>>  > > > 4. Minor releases may contain breaking changes in the Java API.
>>>  > > > 5. Client and server *should* be on same JVM version (not as
>>>  > > >    important post 2.x, but still potentially necessary bc of
>>>  > > >    serialization w/binary format).
>>>  > > > 6. Requires dependency on the entire elasticsearch server in order
>>>  > > >    to use the client. We end up shading jars.
>>>  > > >
>>>  > > > *Benefits of new REST API*
>>>  > > >
>>>  > > > 1. Upgrades
>>>  > > >    1. Breaking changes only made in major releases - "We are very
>>>  > > >       careful with backwards compatibility on the REST layer where
>>>  > > >       breaking changes are made only in major releases."
>>>  > > >    2. "The REST interface is much more stable and can be upgraded
>>>  > > >       out of step with the Elasticsearch cluster."
>>>  > > > 2. REST client and server can be on different JVMs
>>>  > > > 3. Dependencies for the low level client are very slim. No need
>>>  > > >    for shading.
>>>  > > > 4. The RestHighLevelClient supports the same request and response
>>>  > > >    objects as the TransportClient
>>>  > > > 5. Can be secured via HTTPS
>>>  > > >
>>>  > > > There are some additional benefits to the new API, however they
>>>  > > > depend on whether we choose to go with the high or low level client.
>>>  > > > More comments below.
>>>  > > >
>>>  > > > *Cons of new API*
>>>  > > >
>>>  > > > 1. Dependencies - The high level client still requires the full ES
>>>  > > >    dependency, though this will slim down in future releases.
>>>  > > >
>>>  > > > *Other comments specific to Metron*
>>>  > > >
>>>  > > > There's a question of whether we should use the low or high level
>>>  > > > REST client. The main differences between the two are how they
>>>  > > > handle lib dependencies and marshaling/unmarshaling. The low level
>>>  > > > client cleans up the dependencies dramatically, whereas the high
>>>  > > > level client still requires you to depend on elasticsearch core. On
>>>  > > > the other hand, the low level client does no work to handle
>>>  > > > marshaling/unmarshaling the requests/responses from the HTTP calls
>>>  > > > while the high level client handles this for you and exposes
>>>  > > > api-specific methods. The high level client accepts the same request
>>>  > > > arguments as the TransportClient and returns the same response
>>>  > > > objects. One more thing to note is that the low level client claims
>>>  > > > to be compatible with all versions of ES whereas the high level
>>>  > > > client appears to be only major version compatible.
>>>  > > >
>>>  > > > "The 5.6 client can communicate with any 5.6.x Elasticsearch node.
>>>  > > > Previous 5.x minor versions like 5.5.x, 5.4.x etc. are not (fully)
>>>  > > > supported." [2]
>>>  > > >
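Read literally, the quoted compatibility note amounts to a major.minor match between client and node. A minimal sketch of that check follows; the class and method names are hypothetical, this is not an Elasticsearch API, and other client versions may follow different rules.

```java
// Hypothetical helper, not part of the Elasticsearch client.
final class HighLevelClientCompat {

    // Returns true when client and node share the same major.minor version,
    // which is one literal reading of the quoted 5.6 compatibility note.
    static boolean sameMajorMinor(String clientVersion, String nodeVersion) {
        String[] c = clientVersion.split("\\.");
        String[] n = nodeVersion.split("\\.");
        if (c.length < 2 || n.length < 2) {
            throw new IllegalArgumentException("expected at least major.minor");
        }
        return c[0].equals(n[0]) && c[1].equals(n[1]);
    }

    public static void main(String[] args) {
        System.out.println(sameMajorMinor("5.6.0", "5.6.3"));
        System.out.println(sameMajorMinor("5.6.0", "5.5.2"));
    }
}
```

Under this reading, a 5.6 client against a 5.5.x node fails the check even though both are on the same major version.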
>>>  > > > Just as an example, here's a simple comparison of an index request
>>>  > > > in the low and high level APIs.
>>>  > > >
>>>  > > > *Low Level*
>>>  > > >
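>>>  > > > // Imports needed by this snippet (ES 5.x low level REST client).
>>>  > > > import java.util.Collections;
>>>  > > > import java.util.Map;
>>>  > > > import org.apache.http.HttpEntity;
>>>  > > > import org.apache.http.entity.ContentType;
>>>  > > > import org.apache.http.nio.entity.NStringEntity;
>>>  > > > import org.elasticsearch.client.Response;
>>>  > > >
>>>  > > > // restClient is an already-built org.elasticsearch.client.RestClient.
>>>  > > > Map<String, String> params = Collections.emptyMap();
>>>  > > > // The JSON body has to be assembled by hand.
>>>  > > > String jsonString = "{" +
>>>  > > >     "\"user\":\"kimchy\"," +
>>>  > > >     "\"postDate\":\"2013-01-30\"," +
>>>  > > >     "\"message\":\"trying out Elasticsearch\"" +
>>>  > > >     "}";
>>>  > > > HttpEntity entity = new NStringEntity(jsonString,
>>>  > > >     ContentType.APPLICATION_JSON);
>>>  > > > Response response = restClient.performRequest("PUT", "/posts/doc/1",
>>>  > > >     params, entity);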
>>>  > > >
>>>  > > > *High Level*
>>>  > > >
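>>>  > > > import java.util.Date;
>>>  > > > import org.elasticsearch.action.index.IndexRequest;
>>>  > > >
>>>  > > > // Builds the request only; the client marshals the fields to JSON.
>>>  > > > IndexRequest indexRequest = new IndexRequest("posts", "doc", "1")
>>>  > > >     .source("user", "kimchy",
>>>  > > >         "postDate", new Date(),
>>>  > > >         "message", "trying out Elasticsearch");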
>>>  > > >
>>>  > > > *Note*: there are a few ways to do this with the high level API, but
>>>  > > > this was the most concise for me to offer a comparison of benefits
>>>  > > > over the low level API.
>>>  > > >
>>>  > > > *Thoughts/Recommendations*: I do think we should migrate to the new
>>>  > > > API. I think the question is which of the new APIs we should use.
>>>  > > > The high level client seems to shield us from having to deal with
>>>  > > > constructing special JSON handling code, whereas the low level
>>>  > > > client handles all versions of ES. I don't have a good feel (yet)
>>>  > > > for just how much work it would require to use the low level API, or
>>>  > > > how difficult it would be to add new request features in the future.
>>>  > > > Actually, we could probably leverage existing code we have for
>>>  > > > dealing with JSON maps, so this might be really easy. Someone with
>>>  > > > more experience in Metron's ES client use might have a better idea
>>>  > > > of the pros and cons to this. The high level client appears to
>>>  > > > handle all JSON manipulation for us, but we lose the benefit of a
>>>  > > > simpler dependency tree and support for all versions of ES. My only
>>>  > > > concern with "supports all versions" is that I have to imagine there
>>>  > > > are specific calls that we'd have to be careful of when constructing
>>>  > > > the JSON requests, so it's unclear to me if this is better or worse
>>>  > > > in the end.
>>>  > > >
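To make the "JSON maps" idea concrete, here is a minimal sketch of building a low level client request body from a map instead of concatenating strings. The `FlatJsonBody` class is a hypothetical stand-in, not Metron code, and it only handles flat maps of strings, numbers, booleans and nulls; real code would presumably use an existing JSON library.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical stand-in for a JSON utility; not Metron code.
final class FlatJsonBody {

    // Serializes a flat map to a JSON object. Handles only strings,
    // numbers, booleans and null; no nested maps or arrays.
    static String toJson(Map<String, ?> fields) {
        StringJoiner json = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, ?> e : fields.entrySet()) {
            json.add(quote(e.getKey()) + ":" + render(e.getValue()));
        }
        return json.toString();
    }

    private static String render(Object value) {
        if (value == null || value instanceof Number || value instanceof Boolean) {
            return String.valueOf(value);
        }
        return quote(value.toString());
    }

    // Escapes backslashes and double quotes only; control characters
    // are not handled in this sketch.
    private static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("user", "kimchy");
        doc.put("postDate", "2013-01-30");
        doc.put("message", "trying out Elasticsearch");
        System.out.println(toJson(doc));
    }
}
```

The resulting string could be wrapped in an HTTP entity and passed to the low level client, which is the marshalling step the high level client would otherwise do for us.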
>>>  > > > Best,
>>>  > > > Mike
>>>  > > >
>>>  > > >
>>>  > > > 1. https://www.elastic.co/blog/state-of-the-official-elasticsearch-java-clients
>>>  > > > 2. https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-high-compatibility.html
>>>  > > >
>>>  > > >
>>>  > > >
>>>  > > >
>>>  > > > On Wed, Sep 27, 2017 at 8:03 PM, Michael Miklavcic <
>>>  > > > michael.miklavcic@gmail.com> wrote:
>>>  > > >
>>>  > > > > I am working on upgrading Elasticsearch and Kibana. There are
>>>  > > > > quite a few changes involved with this fix. I believe I'm mostly
>>>  > > > > finished with the Ambari mpack side of things, however we currently
>>>  > > > > only support one version with no backwards compatibility. What are
>>>  > > > > the community's thoughts on this?
>>>  > > > >
>>>  > > > > Here is some work contributed to the community that I'm referencing
>>>  > > > > while working on this upgrade -
>>>  > > > > https://github.com/apache/metron/pull/619/files
>>>  > > > >
>>>  > > > > Best,
>>>  > > > > Michael Miklavcic
>>>  > > > >
>>>  > > >
>>>  > >
>>>  >

------------------- 
Thank you,

James Sirota
PMC- Apache Metron
jsirota AT apache DOT org
