eagle-dev mailing list archives

From Edward Zhang <yonzhang2...@apache.org>
Subject Architecture improvement discussion
Date Fri, 12 Oct 2018 02:49:45 GMT
Hi Eaglers,

I would like to start a discussion about architecture improvements for
Apache Eagle, based on community experience and feedback. The improvements
are aimed at simplifying the installation and development of Apache Eagle.

Eagle's main responsibility is to report anomalies instantly by applying
policies to streaming data. Eagle consists of two major components: the
Policy Engine and Adaptors. The Policy Engine is a standalone application
that provides a REST API to manage the policy lifecycle for different data
sources and a runtime to evaluate policies on streaming data. Adaptors are
applications that fetch and process logs and metrics from external systems
and send the data to the Policy Engine for alerting.
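
To make the split between the two components concrete, here is a minimal
sketch. It is illustrative only and is not Eagle's actual API: the class
and function names (Policy, PolicyEngine, audit_log_adaptor, run_demo), the
stream name "hdfs_audit", and the "user,cmd,path" line format are all
hypothetical, and an in-process call stands in for the REST API and the
streaming runtime.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Iterator, List

Event = Dict[str, str]


@dataclass
class Policy:
    name: str
    stream: str                         # data source the policy applies to
    predicate: Callable[[Event], bool]  # True when the event is abnormal


class PolicyEngine:
    """Hypothetical engine: manages policies and evaluates them on events."""

    def __init__(self) -> None:
        self._policies: Dict[str, Policy] = {}

    def add_policy(self, policy: Policy) -> None:
        self._policies[policy.name] = policy

    def remove_policy(self, name: str) -> None:
        self._policies.pop(name, None)

    def evaluate(self, stream: str, event: Event) -> List[str]:
        # Return the names of all policies on this stream that the event violates.
        return [p.name for p in self._policies.values()
                if p.stream == stream and p.predicate(event)]


def audit_log_adaptor(lines: Iterable[str]) -> Iterator[Event]:
    """Hypothetical adaptor: parses made-up 'user,cmd,path' lines into events."""
    for line in lines:
        user, cmd, path = line.strip().split(",")
        yield {"user": user, "cmd": cmd, "path": path}


def run_demo() -> List[str]:
    engine = PolicyEngine()
    engine.add_policy(Policy(
        name="delete-under-secure",
        stream="hdfs_audit",
        predicate=lambda e: e["cmd"] == "delete"
        and e["path"].startswith("/secure/"),
    ))
    raw = ["alice,open,/tmp/a", "bob,delete,/secure/b"]
    alerts: List[str] = []
    # The adaptor only fetches and cleans data; alerting stays in the engine.
    for event in audit_log_adaptor(raw):
        for name in engine.evaluate("hdfs_audit", event):
            alerts.append(f"{name}: {event['user']}")
    return alerts


if __name__ == "__main__":
    print(run_demo())
```

The point of the sketch is that the adaptor knows nothing about policies
and the engine knows nothing about log formats, so an adaptor can be added,
removed, or moved to another project without touching the engine.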

But right now the Eagle code base is not clearly focused on these two
components. For example, the current source code includes map/reduce
job/task log retrieval, cleanup, and analysis. This is very useful, but
Eagle probably only needs the data retrieval and cleanup parts, so that
data can be streamed into the policy engine for alerting. The job/task
analysis part can be maintained in another project.

First, let me list the main modules that the Eagle source code consists of.
- eagle core
    - policy engine (coordinator, runtime, and web)
    - monitor application management
    - eagle query framework - for querying time series data from hbase
- eagle adaptors
    - gc log fetch/processing and alerting
    - metric fetch/processing and alerting, including name node, data node,
      hbase etc.
    - jpm: job performance management
        - hadoop yarn queue statistics fetch/processing
        - hadoop mapreduce history job log processing
        - hadoop mapreduce running job processing
        - spark history job log processing
        - spark running job processing
        - jpm web application
        - hadoop job analyzer
    - security monitoring
        - hdfs audit log fetch/processing
        - hdfs auth log fetch/processing
        - hbase audit log fetch/processing
        - hive log fetch/processing
        - maprfs audit log fetch/processing
        - oozie audit log fetch/processing
    - hadoop topology stats fetch/processing
- eagle server

It is clear that it does not scale for the Eagle community to maintain such
a large number of monitoring adaptors, especially when Hadoop/Spark
versions are evolving pretty fast.

My suggestion is that Eagle focus ONLY on the policy engine and a few
important default adaptors, and remove or separate unrelated functionality.
For the policy engine, it would be nice if it could run on popular
streaming engines besides Apache Storm, so that community users can deploy
it easily. For the default adaptors, I suggest Eagle keep ONLY HDFS audit
log, Hadoop running job, Spark running job, HDFS namenode metrics, etc. For
unrelated functionality, we can either remove it from the Eagle code base
or, if the community still really needs it under the Apache Eagle
monitoring umbrella, separate it into standalone executables.

So the proposed Eagle code base would look like this:
- policy engine
    - coordinator
    - runtime
    - web
- adaptors
    - HDFS audit log
    - Hadoop running job
    - Spark running job
    - HDFS namenode metrics
    - Hadoop yarn queue metrics
- extensions (some non-default adaptors contributed by community)
- executables (standalone executables which are legacy)

It would be great if you could share your feedback on this proposal.

(By the way, I have also discussed this topic at length with Hao Chen, an
Eagle PMC member and core developer, drawing on his experience of engaging
with Eagle users.)

Thanks
Edward
