metron-dev mailing list archives

From Justin Leet <>
Subject Re: [DISCUSS] Upgrading Elasticsearch from 2.x to 5.x
Date Thu, 05 Oct 2017 18:40:25 GMT
Do we intend to support (or have interest in supporting) ES across major
versions for a given version of Metron?  I'm not convinced it's worth the
work of using the low level client.

This really only seems useful for ES clusters that are being used outside
Metron and need to be on a different ES major version. Is that a use case
we want/need to support? I'm willing to bet it's significantly more work
and means we're modifying queries and even templates/mappings based on what
ES version we're interacting with (e.g. meta alerts in 5.x can exploit a
query param to not screw around with the mapping, but that param doesn't
exist in 2.x). At that point, we're either back to writing for ES 2.x or
writing for every version of ES.
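
To make the concern concrete, here is a minimal sketch of the kind of
per-version branching this implies. Everything in it is hypothetical: the
class, the enum, and especially the "placeholder_5x_only_option" key, which
only stands in for the 5.x-only parameter mentioned above and is not a real
ES option name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class VersionedRequestSketch {

    // Hypothetical enum: the ES major versions we would have to tell apart.
    enum EsMajor { V2, V5 }

    // Builds request options, branching on the target ES major version.
    static Map<String, Object> buildQueryOptions(EsMajor version) {
        Map<String, Object> options = new LinkedHashMap<>();
        options.put("size", 10);
        if (version == EsMajor.V5) {
            // Placeholder key, NOT a real ES parameter: it stands in for
            // an option that exists in 5.x but not in 2.x.
            options.put("placeholder_5x_only_option", true);
        }
        // For V2 we would instead need a workaround elsewhere
        // (e.g. in the templates/mappings), which this sketch omits.
        return options;
    }
}
```

Every query or mapping that differs between versions would need a branch
like this one, which is where the extra work comes from.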

Unless that's something we have a demand for (or someone else persuades me
otherwise), I'm in favor of using the high level client.  It also seems
like it'd be easier to migrate to, given its API similarities to the
current client we're using.

On Thu, Oct 5, 2017 at 1:52 PM, Michael Miklavcic <> wrote:

> I think it might help the discussion to share my impressions of looking
> over the new API recommendations from ES. I've summarized some info
> provided by ES back in December 2016 regarding the reasons for switching to
> a new client model. [1]
> *Summary points:*
> Pre-5.x had Java API - binary exchange format used for node-to-node
> communications.
> In 5.x a low level REST API was added. Now there's also a high level REST
> client that handles request marshalling and response un-marshalling.
> *Benefits of existing Java API*
>    1. Theoretically faster - binary format, no JSON parsing
>    2. Hardened, used for internal ES node to node communications
> *Cons of Java API*
>    1. Benchmarks show it's not really that much faster.
>    2. Backwards compatibility - Java API changes often.
>    3. Upgrades more challenging - need to refactor client code for new and
>    deprecated features.
>    4. Minor releases may contain breaking changes in the Java API
>    5. Client and server *should* be on same JVM version (not as important
>    post 2.x, but still potentially necessary bc of serialization w/binary
>    format)
>    6. Requires dependency on the entire elasticsearch server in order to
>    use the client. We end up shading jars.
> *Benefits of new REST API*
>    1. Upgrades
>       1. Breaking changes only made in major releases - "We are very
>       careful with backwards compatibility on the REST layer where breaking
>       changes are made only in major releases."
>       2. "The REST interface is much more stable and can be upgraded out of
>       step with the Elasticsearch cluster."
>    2. REST client and server can be on different JVMs
>    3. Dependencies for the low level client are very slim. No need for
>    shading.
>    4. The RestHighLevelClient supports the same request and response
>    objects as the TransportClient
>    5. Can be secured via HTTPS
> There are some additional benefits to the new API, however they depend on
> whether we choose to go with the high or low level client. More comments
> below.
> *Cons of new API*
>    1. Dependencies - The high level client still requires the full ES
>    dependency, though this will slim down in future releases.
> *Other comments specific to Metron*
> There's a question of whether we should use the low or high level REST
> client. The main differences between the two are how they handle lib
> dependencies and marshaling/unmarshaling. The low level client cleans up
> the dependencies dramatically, whereas the high level client still requires
> you to depend on elasticsearch core. On the other hand, the low level
> client does no work to handle marshaling/unmarshaling the
> requests/responses from the HTTP calls while the high level client handles
> this for you and exposes api-specific methods. The high level client
> accepts the same request arguments as the TransportClient and returns the
> same response objects. One more thing to note is that the low level client
> claims to be compatible with all versions of ES whereas the high level
> client appears to be only major version compatible.
> "The 5.6 client can communicate with any 5.6.x Elasticsearch node. Previous
> 5.x minor versions like 5.5.x, 5.4.x etc. are not (fully) supported." [2]
> Just as an example, here's a simple comparison of an index request in the
> low and high level APIs.
> *Low Level*
> Map<String, String> params = Collections.emptyMap();
> String jsonString = "{" +
>             "\"user\":\"kimchy\"," +
>             "\"postDate\":\"2013-01-30\"," +
>             "\"message\":\"trying out Elasticsearch\"" +
>         "}";
> HttpEntity entity = new NStringEntity(jsonString,
> ContentType.APPLICATION_JSON);
> Response response = restClient.performRequest("PUT", "/posts/doc/1",
> params, entity);
> *High Level*
> IndexRequest indexRequest = new IndexRequest("posts", "doc", "1")
>         .source("user", "kimchy",
>                      "postDate", new Date(),
>                      "message", "trying out Elasticsearch");
> *Note*: there are a few ways to do this with the high level API, but this
> was the most concise for me to offer a comparison of benefits over the low
> level API.
> *Thoughts/Recommendations*: I do think we should migrate to the new API. I
> think the question is which of the new APIs we should use. The high level
> client seems to shield us from having to deal with constructing special
> JSON handling code, whereas the low level client handles all versions of
> ES. I don't have a good feel (yet) for just how much work it would require
> to use the low level API, or how difficult it would be to add new request
> features in the future. Actually, we could probably leverage existing code
> we have for dealing with JSON maps, so this might be really easy. Someone
> with more experience in Metron's ES client use might have a better idea of
> the pros and cons to this. The high level client appears to handle
> all JSON manipulation for us, but we lose the benefit of a
> simpler dependency tree and support for all versions of ES. My only concern
> with "supports all versions" is that I have to imagine there are specific
> calls that we'd have to be careful of when constructing the JSON requests,
> so it's unclear to me if this is better or worse in the end.
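
As a rough illustration of the "JSON maps" idea in the paragraph above, the
sketch below turns a flat map into the JSON string a low level request
would carry as its body. It is an assumption-laden sketch, not Metron code:
the class and method names are hypothetical, it handles only flat maps of
strings, numbers, booleans and nulls, and a real implementation would more
likely reuse an existing JSON library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class JsonBodySketch {

    // Escapes the characters that must be escaped inside a JSON string.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.toString();
    }

    // Hypothetical helper: serializes a flat map to a JSON object string.
    // Nested maps and lists are out of scope for this sketch.
    static String toJson(Map<String, ?> doc) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, ?> e : doc.entrySet()) {
            if (!first) sb.append(',');
            first = false;
            sb.append('"').append(escape(e.getKey())).append("\":");
            Object v = e.getValue();
            if (v == null || v instanceof Number || v instanceof Boolean) {
                sb.append(v); // emitted unquoted: null, 42, true
            } else {
                sb.append('"').append(escape(v.toString())).append('"');
            }
        }
        return sb.append('}').toString();
    }
}
```

The resulting string could then be wrapped in an entity and passed to the
low level client in place of the hand-concatenated jsonString shown earlier.
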
> Best,
> Mike
> [1] elasticsearch-java-clients
> [2] rest/current/java-rest-high-compatibility.html
> On Wed, Sep 27, 2017 at 8:03 PM, Michael Miklavcic <> wrote:
> > I am working on upgrading Elasticsearch and Kibana. There are quite a few
> > changes involved with this fix. I believe I'm mostly finished with the
> > Ambari mpack side of things, however we currently only support one version
> > with no backwards compatibility. What are the community's thoughts on this?
> >
> > Here is some work contributed to the community that I'm referencing while
> > working on this upgrade - metron/pull/619/files
> >
> > Best,
> > Michael Miklavcic
> >
