metron-dev mailing list archives

From Michael Miklavcic <michael.miklav...@gmail.com>
Subject Re: [DISCUSS] Pcap panel architecture
Date Tue, 08 May 2018 05:36:40 GMT
In what order did you add the hadoop and yarn classpaths? The "shaded" package
stands out to me in the name
"org.apache.hadoop.hbase.*shaded*.org.codehaus.jackson.jaxrs.JacksonJaxbJsonProvider".
Maybe try adding those packages earlier on the classpath.

I think that find command needs a "jar tvf", otherwise you're looking for a
class name in jar file names.
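
A minimal sketch of that kind of check, assuming Python 3 is available on the
node. It lists the entries of every jar under a directory (much like "jar tvf"
does) and reports the jars that contain a given class. The helper names
class_entry and jars_containing are made up for illustration and are not part
of Metron or Hadoop.

```python
import os
import zipfile


def class_entry(class_name):
    """Convert a dotted (or slash-separated) class name to its jar entry path."""
    return class_name.replace(".", "/") + ".class"


def jars_containing(root, class_name):
    """Yield the paths of jars under root whose entry list includes class_name."""
    wanted = class_entry(class_name)
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if not filename.endswith(".jar"):
                continue
            path = os.path.join(dirpath, filename)
            try:
                # A jar is a zip archive, so its entry names can be read directly.
                with zipfile.ZipFile(path) as jar:
                    if wanted in jar.namelist():
                        yield path
            except (zipfile.BadZipFile, OSError):
                # Skip unreadable or corrupt jars instead of aborting the scan.
                continue
```

For example, list(jars_containing("/usr/hdp", "org.apache.hadoop.hbase.shaded.org.codehaus.jackson.jaxrs.JacksonJaxbJsonProvider"))
would return the matching jar paths, or an empty list if no jar under that
directory ships the class. The "/usr/hdp" root is only an assumption about
where the Hadoop jars live.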

Have you tried shading the rest jar?

I'd also look at the classpath you get when running "yarn jar" to start the
existing pcap service, per the instructions in metron-api/README.md.
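
To compare classpaths and see the effect of ordering, something like the
following sketch may help. It assumes Python 3 and takes a classpath string
(for example the output of `yarn classpath`), expands trailing "*" wildcards,
and returns the entries that contain a class in classpath order. With a
standard class loader the first hit is normally the one that gets loaded. The
names expand_classpath and providers are hypothetical helpers, not an existing
API.

```python
import glob
import os
import zipfile


def expand_classpath(classpath):
    """Expand a classpath string into an ordered list of jars and directories."""
    entries = []
    for item in classpath.split(os.pathsep):
        if not item:
            continue
        if item.endswith("*"):
            # "dir/*" means every jar in dir. The JVM does not specify the
            # order of that expansion, so sort only to get a stable report.
            entries.extend(sorted(glob.glob(item + ".jar")))
        else:
            entries.append(item)
    return entries


def providers(classpath, class_name):
    """Return the classpath entries that contain class_name, in classpath order."""
    wanted = class_name.replace(".", "/") + ".class"
    hits = []
    for entry in expand_classpath(classpath):
        if os.path.isdir(entry):
            # Directory entries hold classes as plain files.
            if os.path.isfile(os.path.join(entry, *wanted.split("/"))):
                hits.append(entry)
        elif zipfile.is_zipfile(entry):
            with zipfile.ZipFile(entry) as jar:
                if wanted in jar.namelist():
                    hits.append(entry)
    return hits
```

Running providers() once against the classpath from the REST start script and
once against the one "yarn jar" builds should show whether the same jar
supplies the class in both cases, or whether it is missing from one of them.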


On Mon, May 7, 2018 at 3:28 PM, Ryan Merriman <merrimanr@gmail.com> wrote:

> To explore the idea of merging metron-api into metron-rest and running pcap
> queries inside our REST application, I created a simple test here:
> https://github.com/merrimanr/incubator-metron/tree/pcap-rest-test.  A
> summary of what's included:
>
>    - Added pcap as a dependency in the metron-rest pom.xml
>    - Added a pcap query controller endpoint at
>    http://node1:8082/swagger-ui.html#!/pcap-query-controller/queryUsingGET
>    - Added a pcap query service that runs a simple, hardcoded query
>
> Generate some pcap data using pycapa (
> https://github.com/apache/metron/tree/master/metron-sensors/pycapa) and the
> pcap topology (
> https://github.com/apache/metron/tree/master/metron-platform/metron-pcap-backend#starting-the-topology).
> After this initial setup there should be data in HDFS at
> "/apps/metron/pcap".  I believe this should be enough to exercise the
> issue.  Just hit the endpoint referenced above.  I tested this in an
> already running full dev by building and deploying the metron-rest jar.  I
> did not rebuild full dev with this change but I would still expect it to
> work.  Let me know if it doesn't.
>
> The first error I see when I hit this endpoint is:
>
> java.lang.NoClassDefFoundError:
> org/apache/hadoop/yarn/webapp/YarnJacksonJaxbJsonProvider.
>
> Here are the things I've tried so far:
>
>    - Run the REST application with the YARN jar command since this is how
>    all our other YARN/MR-related applications are started (metron-api, MAAS,
>    pcap query, etc).  I wouldn't expect this to work since we have runtime
>    dependencies on our shaded elasticsearch and parser jars and I'm not aware
>    of a way to add additional jars to the classpath with the YARN jar command
>    (is there a way?).  Either way I get this error:
>
> 18/05/04 19:49:56 WARN reflections.Reflections: could not create Dir using
> jarFile from url file:/usr/hdp/2.6.4.0-91/hadoop/lib/ojdbc6.jar. skipping.
> java.lang.NullPointerException
>
>
>    - I tried adding `yarn classpath` and `hadoop classpath` to the
>    classpath in /usr/metron/0.4.3/bin/metron-rest.sh (REST start script).  I
>    get this error:
>
> java.lang.ClassNotFoundException:
> org.apache.hadoop.hbase.shaded.org.codehaus.jackson.jaxrs.JacksonJaxbJsonProvider
>
>
>    - I searched for the class in the previous attempt but could not find it
>    in full dev:
>
> find / -name "*.jar" 2>/dev/null | xargs grep org/apache/hadoop/hbase/shaded/org/codehaus/jackson/jaxrs/JacksonJaxbJsonProvider 2>/dev/null
>
>
>    - Further up in the stack trace I see the error happens when initializing
>    the org.apache.hadoop.yarn.util.timeline.TimelineUtils class.  I tried
>    setting "yarn.timeline-service.enabled" in Ambari to false and then I get
>    this error:
>
> Unable to parse
> '/hdp/apps/${hdp.version}/mapreduce/mapreduce.tar.gz#mr-framework' as a
> URI, check the setting for mapreduce.application.framework.path
>
>
>    - I've tried adding different hadoop, hbase, yarn and mapreduce Maven
>    dependencies without any success
>       - hadoop-yarn-client
>       - hadoop-yarn-common
>       - hadoop-mapreduce-client-core
>       - hadoop-yarn-server-common
>       - hadoop-yarn-api
>       - hbase-server
>
> I will keep exploring other possible solutions.  Let me know if anyone has
> any ideas.
>
> On Mon, May 7, 2018 at 9:02 AM, Otto Fowler <ottobackwards@gmail.com>
> wrote:
>
> > I can imagine a new generic service(s) capability whose job ( pun intended )
> > is to abstract the submittal, tracking, and storage of results to yarn.
> >
> > It would be extended with storage providers, queue provider, possibly some
> > set of policies or rather strategies.
> >
> > The pcap ‘report’ would be a client to that service, one that specializes
> > the service operation for the way we want pcap to work.
> >
> > We can then re-use the generic service for other long running yarn
> > things…..
> >
> >
> > On May 7, 2018 at 09:56:51, Otto Fowler (ottobackwards@gmail.com) wrote:
> >
> > RE: Tracking v. users
> >
> > The submittal and tracking can associate the submitter with the yarn job
> > and track that,
> > regardless of the yarn credentials.
> >
> > I.e., if all submittals and monitoring are by the same yarn user ( Metron )
> > from a single or
> > co-operative set of services, that service can maintain the mapping.
> >
> >
> >
> > On May 7, 2018 at 09:39:52, Ryan Merriman (merrimanr@gmail.com) wrote:
> >
> > Otto, your use case makes sense to me. We'll have to think about how to
> > manage the user to job relationships. I'm assuming YARN jobs will be
> > submitted as the metron service user so YARN won't keep track of this for
> > us. Is that assumption correct? Do you have any ideas for doing that?
> >
> > Mike, I can start a feature branch and experiment with merging metron-api
> > into metron-rest. That should allow us to collaborate on any issues or
> > challenges. Also, can you expand on your idea to manage external
> > dependencies as a special module? That seems like a very attractive option
> > to me.
> >
> > On Fri, May 4, 2018 at 8:39 AM, Otto Fowler <ottobackwards@gmail.com>
> > wrote:
> >
> > > From my response on the other thread, but applicable to the backend
> > stuff:
> > >
> > > "The PCAP Query seems more like PCAP Report to me. You are generating a
> > > report based on parameters.
> > > That report is something that takes some time and external process to
> > > generate… ie you have to wait for it.
> > >
> > > I can almost imagine a flow where you:
> > >
> > > * Are in the AlertUI
> > > * Ask to generate a PCAP report based on some selected alerts/meta-alert,
> > > possibly picking from one or more report ‘templates’
> > > that have query options etc
> > > * The report request is ‘queued’, that is dispatched to be
> > > executed/generated
> > > * You as a user have a ‘queue’ of your report results, and when the
> > report
> > > is done it is queued there
> > > * We ‘monitor’ the report/queue progress through the yarn rest ( report
> > > info/meta has the yarn details )
> > > * You can select the report from your queue and view it either in a new
> > UI
> > > or custom component
> > > * You can then apply a different ‘view’ to the report or work with the
> > > report data
> > > * You can print / save etc
> > > * You can associate the report with the alerts ( again in the report
> info
> > > ) with…. a ‘case’ or ‘ticket’ or investigation something or other
> > >
> > >
> > > We can introduce extensibility into the report templates, report views (
> > > things that work with the json data of the report )
> > >
> > > Something like that.”
> > >
> > > Maybe we can do :
> > >
> > > template -> query parameters -> script => yarn info
> > > yarn info + query info + alert context + yarn status => report info ->
> > > stored in a user’s ‘report queue’
> > > report persistence added to report info
> > > metron-rest -> api to monitor the queue, read results ( page ), etc etc
> > >
> > >
> > > On May 4, 2018 at 09:23:39, Ryan Merriman (merrimanr@gmail.com) wrote:
> > >
> > > I started a separate thread on Pcap UI considerations and user
> > > requirements
> > > at Otto's request. This should help us keep these two related but
> > separate
> > > discussions focused.
> > >
> > > On Fri, May 4, 2018 at 7:19 AM, Michel Sumbul <michelsumbul@gmail.com>
> > > wrote:
> > >
> > > > Hello,
> > > >
> > > >
> > > >
> > > > (Youhouuu my first reply on this kind of mail chain^^)
> > > >
> > > >
> > > >
> > > > If I may, I would like to share my view on the following 3 points.
> > > >
> > > > - Backend:
> > > >
> > > > The current metron-api is totally separate; it would be logical to me to
> > > > have it in the same place as the other REST APIs. Especially when more
> > > > security is added, the job will not need to be done twice.
> > > > The current implementation sends back a pcap object which still needs to
> > > > be decoded. In OpenSOC, the decoding was done with tshark on the
> > > > frontend. It would be good to have this decoding happen directly on the
> > > > backend so as not to create load on the frontend. An option would be to
> > > > install tshark on the REST server and use it to convert the pcap to XML
> > > > and then to JSON that is sent to the frontend.
> > > >
> > > > I tried to start the map/reduce job that searches over all the pcap
> > > > data directly from the REST server and, as Ryan mentioned, we had
> > > > trouble. I will try to find the error again.
> > > >
> > > > Then in the POC, we tried using the pcap_query script, and this
> > > > works fine. I just modified it so that it sends back the YARN job_id
> > > > directly instead of waiting for the job to finish. This allows the UI
> > > > and the REST server to learn the status of the search by querying the
> > > > YARN REST API, so both can be async without any blocking phase. What do
> > > > you think about that?
> > > >
> > > >
> > > >
> > > > Having the job submitted directly from the code of the REST server would
> > > > be perfect, but it will need a lot of investigation I think (but I'm not
> > > > the expert so I might be completely wrong ^^).
> > > >
> > > > We know that the pcap_query script works fine, so why not call it? Is it
> > > > that bad? (maybe a stupid question, but I really don’t see many
> > > > drawbacks)
> > > >
> > > >
> > > >
> > > > - Front end:
> > > >
> > > > Adding the pcap search to the alert UI is, I think, the easiest way to
> > > > move forward. But indeed, it will then be the “Alert UI and pcapquery”.
> > > > Maybe the name of the UI should just change to something like
> > > > “Monitoring & Investigation UI”?
> > > >
> > > >
> > > >
> > > > Is there any roadmap or plan for the different UIs? I mean, have you
> > > > already discussed how you see the UI evolving with the new features
> > > > that will come in the future?
> > > >
> > > >
> > > >
> > > > - Microservices:
> > > >
> > > >
> > > >
> > > > What do you mean exactly by microservices? Is it to separate all the
> > > > features into different projects? Or something like having the different
> > > > components in containers, as with Kubernetes? (again maybe a stupid
> > > > question, but I don’t clearly understand what you mean :) )
> > > >
> > > >
> > > >
> > > > Michel
> > > >
> > >
> > >
> >
> >
>
