logging-log4j-dev mailing list archives

From Robin Coe <rcoe.java...@gmail.com>
Subject Re: Log multiplexing/demultiplexing (was: Fast and Furious)
Date Thu, 03 Mar 2016 01:31:34 GMT
Thanks for taking it to its own thread.  I responded in that thread because I
thought the input handlers could be extended to consume from multiple
sources and re-integrate a log event.  The processor as I've written it
will consume events from a file but wouldn't integrate a single event from
multiple sources.  I also thought the input stream handlers could be
extended to include binary.  I had considered Thrift for that but
didn't see the use case.

You've summed it up nicely.  In the simplest case, it's a wrapper, so the
event timestamp, etc., comes from the processor while the message comes
from the stream.  However, given a UTF-8 serialized event from a compatible
emitter, the log processor will re-materialize the original as much as
possible.  Log4j doesn't have an open API that would allow me to do that, so
the event is still a wrapped message, but the intent is to pull data of
interest from the original and use the MDC as the structure to hold it.
That gives better analysis in Elasticsearch, I've found.
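As a rough sketch of that wrapping shape (the field names and values here
are placeholders, not the real ones):

    import org.apache.logging.log4j.LogManager;
    import org.apache.logging.log4j.Logger;
    import org.apache.logging.log4j.ThreadContext;

    public class StreamWrapper {
        private static final Logger LOG = LogManager.getLogger("stream.ingest");

        // Pull fields of interest out of the raw line and stash them in the
        // MDC so a structured layout (e.g., JSON for Elasticsearch) keeps
        // them as fields rather than flat text.
        void emit(String rawLine) {
            ThreadContext.put("source", "tcp-listener");   // placeholder values
            ThreadContext.put("origLevel", "INFO");
            try {
                LOG.info(rawLine);  // timestamp etc. come from the processor
            } finally {
                ThreadContext.clearMap();
            }
        }
    }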

In doing this work, I have found certain issues with failover, such as that it
won't complete the initialization of a stateful appender if the appender is
offline when Log4j initializes.  I also had to jump through a few hoops to
avoid unnecessary object creation when creating an event.  The API is pretty
closed in places.  But otherwise, it works really well.
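For the object-creation part, one way to sidestep the formatting path is a
pass-through Message built against the public Message interface; a minimal
sketch of the idea, not the exact code:

    import org.apache.logging.log4j.message.Message;

    // Wraps the raw payload without copying or formatting; Log4j can log a
    // Message directly, so no parameter arrays or formatted strings are built.
    final class RawMessage implements Message {
        private final String payload;
        RawMessage(String payload) { this.payload = payload; }
        @Override public String getFormattedMessage() { return payload; }
        @Override public String getFormat()           { return payload; }
        @Override public Object[] getParameters()     { return null; }
        @Override public Throwable getThrowable()     { return null; }
    }

    // usage: LOG.info(new RawMessage(line));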
On Mar 2, 2016 7:25 PM, "Remko Popma" <remko.popma@gmail.com> wrote:

> (Changing the subject since this should be a separate mail thread)
>
> So if I understand correctly, it's like your app has one or more threads
> that read from a stream, wrap that data in a LogEvent, and then use a log4j
> logger to log the event. So log4j acts as a multiplexer (combining multiple
> inputs into a single output). This allows you to use the existing appenders
> and other log4j functionality for persisting and/or distributing the data.
> (And you can demultiplex by having multiple appenders.)
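A minimal sketch of that shape, with the input source made up:

    import java.io.BufferedReader;
    import java.io.IOException;
    import org.apache.logging.log4j.LogManager;
    import org.apache.logging.log4j.Logger;

    // One reader thread per input; all of them funnel into the same logger,
    // so the appenders see a single multiplexed event stream.
    final class InputPump implements Runnable {
        private static final Logger MUX = LogManager.getLogger("mux");
        private final BufferedReader in;
        InputPump(BufferedReader in) { this.in = in; }
        public void run() {
            try {
                String line;
                while ((line = in.readLine()) != null) {
                    MUX.info(line);  // wrap and forward; many inputs, one output
                }
            } catch (IOException ignored) { }
        }
    }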
>
> So essentially this is to solve the problem of log aggregation. This is an
> interesting problem, and currently log4j doesn't have anything in that
> space.
>
> By the way, failover is notoriously hard, and I'm not sure that Log4j's
> current failover mechanism covers all or even most corner cases. It is
> definitely a "best effort, no guarantees" kind of thing.
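For reference, the failover mechanism under discussion is configured along
these lines in a log4j2 config (the appender names are examples):

    <Failover name="Failover" primary="RollingFile" retryIntervalSeconds="60">
      <Failovers>
        <AppenderRef ref="Console"/>
      </Failovers>
    </Failover>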
>
> On Thursday, 3 March 2016, Robin Coe <rcoe.javadev@gmail.com> wrote:
>
>> The scope was to create a bridge for non-Java apps to get access to
>> log4j2's functionality.  I like the configurable appenders, especially the
>> failover.  I liked the plugin API and that other projects were using it to
>> create custom appenders, like Elasticsearch.  I don't like having to go
>> through something like Logstash, which throws away events when its buffer
>> fills up; I prefer to talk to the data sink directly.  I also like that
>> it's simple to multiplex events to multiple endpoints.  My expectation is
>> that most languages have variants of log4*, so they should have appenders
>> that would emit UTF-8 in some form, allowing me to create something like
>> Chainsaw but using streams, not files.
>>
>> There are a lot of streaming processors out there but I couldn't find any
>> that had recoverable failover.  Other solutions have a lot of complexity in
>> their deployments, too, so having a simple runtime that uses existing tech
>> to stream events was what I was looking for.  The stream handlers are also
>> intended to be a public API, so I wanted to provide extensible event
>> handlers that could be used to vet event payloads, for security/data
>> integrity.
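Purely as a sketch of the vetting idea (the handler API described isn't
public, so the names here are hypothetical):

    // Hypothetical extension point: each decoded payload passes through a
    // handler that can sanitize it or reject it outright.
    public interface PayloadHandler {
        /** Returns the (possibly sanitized) payload, or null to drop it. */
        String vet(String rawPayload);
    }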
>> On Mar 2, 2016 5:52 PM, "Remko Popma" <remko.popma@gmail.com> wrote:
>>
>>> Robin, two questions:
>>> First, what is the problem you're trying to solve?
>>> Second, I'm having trouble understanding how your ideas fit in the
>>> log4j design. Do you intend to feed info *into* log4j (something like
>>> custom messages) or process info *coming out* of log4j (like a custom
>>> appender)? Or both? (But if both, why use log4j?)
>>>
>>> On Thursday, 3 March 2016, Robin Coe <rcoe.javadev@gmail.com> wrote:
>>>
>>>> The idea is a lightweight service that starts TCP listeners that consume
>>>> streams and parse them according to a layout, e.g., syslog, pipe, regex,
>>>> etc.  The configuration is via YAML, whereby a listener is coupled to a
>>>> codec.  A codec couples an input stream layout to an output log4j2
>>>> route.
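Illustrative only, since the tool isn't public; these keys are guesses at
the shape described:

    # hypothetical YAML: couple a TCP listener to a codec, and the codec
    # to an input layout plus an output log4j2 route
    listeners:
      - port: 5140
        codec: syslog-in
    codecs:
      syslog-in:
        layout: syslog        # how to parse the input stream
        route: upstream       # which log4j2 route receives the events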
>>>>
>>>> Simplest use case is taking a stream, say stdout (12-factor
>>>> architecture), and coupling that to a log4j2 route.  Other possibilities
>>>> are to consume JSON and create log event data structures from the
>>>> document.  By extension, any UTF-8 stream could be parsed with regex,
>>>> fields of interest extracted and injected into a log event, and passed to
>>>> log4j2 to route.  The log4j2 routes I have set up use a failover strategy,
>>>> whereby upstream sinks that go offline cause log events to be written to a
>>>> local file in JSON format.  On service recovery, I dispatch a worker that
>>>> rematerializes those events and sends them upstream.  Recovery begins with
>>>> a rollover to a file named by convention from the route; the worker parses
>>>> the file and sends the events up.  This gives me best-effort eventual
>>>> consistency of log events in one or more upstream sinks.  Obviously, this
>>>> implies a client agent.
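The recovery pass is essentially "roll the file, then replay it"; a sketch,
with the JSON-to-event step left hypothetical:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import org.apache.logging.log4j.LogManager;
    import org.apache.logging.log4j.Logger;

    // Replays a rolled-over failover file: each line is one JSON-encoded
    // event; decoding it back into a message is a placeholder step here.
    final class RecoveryWorker implements Runnable {
        private static final Logger UPSTREAM = LogManager.getLogger("route.upstream");
        private final Path rolledFile;
        RecoveryWorker(Path rolledFile) { this.rolledFile = rolledFile; }
        public void run() {
            try {
                for (String json : Files.readAllLines(rolledFile)) {
                    UPSTREAM.info(rematerialize(json));  // re-send upstream
                }
            } catch (IOException e) { /* retry later; the file is still on disk */ }
        }
        private String rematerialize(String json) {
            return json;  // placeholder: real code would decode the JSON event
        }
    }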
>>>>
>>>> My original architecture was based on Kafka for
>>>> consistency/durability/performance but I couldn't guarantee delivery of
>>>> events from the emitter to the sink.  When I saw that log4j2 had failover,
>>>> I came up with this solution.  I just had to build the healthcheck service
>>>> and recovery worker.  My plan was to follow the log4j2 plugin architecture
>>>> and use annotations to declare log event handlers, allowing extension of
>>>> the log processor.  That part's not done.
>>>>
>>>>
>>>>
>>>>
>>>
