eagle-dev mailing list archives

From "Zhao, Qingwen" <qingwz...@ebay.com.INVALID>
Subject Re: Architecture improvement discussion
Date Mon, 18 Feb 2019 09:48:11 GMT
Got it. I agree with your idea. 
I have used Prometheus for a while in another project, and it is very easy to
use and maintain.
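
(For concreteness: pointing Prometheus at a Hadoop metrics endpoint is just a
small scrape-config entry. This is a minimal sketch; the job name and target
below are hypothetical and assume an exporter serving Prometheus's text
exposition format.)

```yaml
# prometheus.yml -- hypothetical scrape job; the target host/port are assumptions
scrape_configs:
  - job_name: "hadoop-namenode"            # illustrative job name
    scrape_interval: 15s
    static_configs:
      - targets: ["namenode-host:9100"]    # assumed metrics exporter endpoint
```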

Thanks,
Qingwen

On 2018/12/24, 1:54 PM, "Edward Zhang" <yonzhang2012@gmail.com> wrote:

    Qingwen,
    
    There is no new architecture yet, it is just a very initial discussion :-)
    
    HBase is mainly used for job performance monitoring, where mapreduce
    job/task data are stored. As long as Eagle supports customized job
    performance monitoring, the storage has to be there, although our data
    model is pretty agnostic to the storage implementation.
    
    For metrics, Eagle 0.5 actually does not store metrics and only processes
    them in streaming mode. My suggestion is that we use mature tools like
    Prometheus to store and visualize metrics while Eagle focuses on policy
    evaluation.
    
    Thanks
    Edward
    
    On Thu, Oct 25, 2018 at 3:00 AM Zhao Qingwen <qingwen220@gmail.com> wrote:
    
    > Hi Edward,
    >
    > In the new architecture, is the storage (HBase) taken off?
    > How do the adaptors store the data? For example, Hadoop NameNode metrics.
    >
    > Best Regards,
    > Qingwen Zhao | 赵晴雯
    >
    >
    >
    >
    >
    > Edward Zhang <yonzhang2012@apache.org> 于2018年10月12日周五 上午10:49写道:
    >
    > > Hi Eaglers,
    > >
    > > I would like to start some discussion about architecture improvements
    > > for Apache Eagle based on community experience and feedback. The
    > > improvement is aimed at simplifying the installation and development of
    > > Apache Eagle.
    > >
    > > Eagle's main responsibility is to report abnormalities instantly by
    > > applying policies on streaming data. Eagle consists of two major
    > > components, the Policy Engine and Adaptors. The Policy Engine is a
    > > standalone application which provides a REST API to manage the policy
    > > lifecycle for different data sources, and provides a runtime to
    > > evaluate policies on streaming data. Adaptors are the applications
    > > which fetch and process logs/metrics from outside and send data to the
    > > policy engine for alerting purposes.
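
(A concrete way to picture a policy: Eagle's policy evaluation is built on
Siddhi (CEP) queries over the adaptor streams. A minimal sketch follows, with
illustrative stream and field names, not Eagle's exact schema.)

```
from hdfsAuditLogStream[cmd == 'delete' and str:contains(src, '/sensitive/')]
select user, cmd, src, timestamp
insert into hdfsAuditAlertStream;
```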
    > >
    > > But right now the Eagle code base is not clearly focused on these two
    > > components. For example, the current source code includes map/reduce
    > > job/task log retrieval/cleanup/analysis, which is very useful, but
    > > Eagle probably only needs the data retrieval/cleanup part so that data
    > > can be streamed into the policy engine for alerting purposes. The
    > > job/task analysis part can be maintained in another project.
    > >
    > > First, let me list the main modules the Eagle source code consists of.
    > > - eagle core
    > >     - policy engine (coordinator, runtime, and web)
    > >     - monitor application management
    > >     - eagle query framework - for querying time series data from hbase
    > > - eagle adaptors
    > >      - gc log fetch/processing and alerting
    > >      - metric fetch/processing and alerting, including name node, data
    > > node, hbase etc.
    > >      - jpm: job performance management.
    > >             - hadoop yarn queue statistics fetch/processing
    > >             - hadoop mapreduce history job log processing
    > >             - hadoop mapreduce running job processing
    > >             - spark history job log processing
    > >             - spark running job processing
    > >             - jpm web application
    > >             - hadoop job analyzer
    > >      - security monitoring
    > >             - hdfs audit log fetch/processing
    > >             - hdfs auth log fetch/processing
    > >             - hbase audit log fetch/processing
    > >             - hive log fetch/processing
    > >             - maprfs audit log fetch/processing
    > >             - oozie audit log fetch/processing
    > >      - hadoop topology stats fetch/processing
    > > - eagle server
    > >
    > > It is obvious that it does not scale for the Eagle community to
    > > maintain such a large number of monitoring adaptors, especially when
    > > Hadoop/Spark versions are evolving pretty fast.
    > >
    > > My suggestion is that Eagle focus ONLY on the policy engine and some
    > > important default adaptors, and remove or separate some unrelated
    > > functionality. For the policy engine, it would be nice if it could run
    > > on popular streaming engines besides Apache Storm so that it can be
    > > easily deployed by community users. For the default adaptors, I would
    > > suggest Eagle keep ONLY the HDFS audit log, Hadoop running job, Spark
    > > running job, HDFS namenode metrics, etc. Unrelated functionality we can
    > > either remove from the Eagle code base or separate into standalone
    > > executables, if the community still really needs it under the Apache
    > > Eagle monitoring umbrella.
    > >
    > > So the proposed Eagle code base would look like:
    > > - policy engine
    > >      - coordinator
    > >      - runtime
    > >      - web
    > > - adaptors
    > >     - hdfs audit log
    > >     - Hadoop running job
    > >     - Spark running job
    > >     - HDFS namenode metrics
    > >     - Hadoop yarn queue metrics
    > > - extensions (some non default adaptors contributed by community)
    > > - executables (standalone executables which are legacy)
    > >
    > > It would be great if you can provide more feedback on this discussion.
    > >
    > > (By the way, I also had a lot of discussion about this topic with Hao
    > > Chen, Eagle PMC member and core developer, based on his experience of
    > > engaging Eagle users.)
    > >
    > > Thanks
    > > Edward
    > >
    >
    
