gora-dev mailing list archives

From "Lewis John McGibbney (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (GORA-365) Where does Gora fit in YARN?
Date Tue, 19 Aug 2014 03:06:19 GMT

    [ https://issues.apache.org/jira/browse/GORA-365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101776#comment-14101776 ]

Lewis John McGibbney commented on GORA-365:

Well, what I am thinking here is that, with regard to streaming data, we
can consider the following scenarios.

Data modeling

1. Ad-hoc modeling, where we do not know the structure of the data before
   we receive it. This requires us to learn the structure as best we can.
   This is what I was describing previously.
2. Pre-determined modeling, where we know the structure of the data and
   build a data model before we receive the data stream. This wouldn't
   require the current Gora data definition/modeling workflow to change.
   Everything would stay the same.

Also consider the nature of streaming data.

1. Continuous, where latency between receiving data is so small or
   negligible that we consider the stream continuous, or
2. Frequent but packet-oriented, where we know the frequency is
   inconsistent and that we need to always wait for data but that it is
   not continuously streaming at us. In this case we can populate objects
   with data based on the chunks we receive at each frequency interval.
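
To make the packet-oriented case a bit more concrete, here is a rough
sketch of grouping timestamped values into chunks, one per interval, so
that an object could be populated from each chunk. Everything in it is an
assumption for illustration only: the class name ChunkingSketch, the
fixed-length interval (the real frequency may well be inconsistent), and
the use of plain strings as payload. None of it is Gora or Storm API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper, not part of Gora: groups stream values by interval. */
final class ChunkingSketch {

    /**
     * Assigns each (timestamp, value) pair to the interval it falls in.
     * The returned map is keyed by the start of each interval in
     * milliseconds and is sorted by that start time.
     */
    static Map<Long, List<String>> chunk(long[] timestamps, String[] values,
                                         long intervalMillis) {
        Map<Long, List<String>> chunks = new TreeMap<>();
        for (int i = 0; i < timestamps.length; i++) {
            // floorDiv keeps the bucketing correct for negative timestamps too.
            long start = Math.floorDiv(timestamps[i], intervalMillis) * intervalMillis;
            chunks.computeIfAbsent(start, k -> new ArrayList<>()).add(values[i]);
        }
        return chunks;
    }

    public static void main(String[] args) {
        long[] ts = {0L, 400L, 1000L, 2500L};
        String[] vals = {"a", "b", "c", "d"};
        // Each entry is one chunk from which an object could be populated.
        System.out.println(chunk(ts, vals, 1000L));
    }
}
```

With a one second interval, the four sample values above end up in three
chunks. A real deployment would more likely lean on the windowing that the
streaming framework already provides.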

Open questions here

On the first point, how can we recognize the structure and type of the
data and then build a data model around it?
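
Purely as a starting point for that question, one naive approach is to
scan the first records of the stream and guess a type per field, widening
the guess whenever two records disagree. The sketch below assumes records
arrive as flat string key/value maps; the class name BurnInSchemaSketch
and the four type labels are made up for this example and are not Gora or
Avro types.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of Gora: guesses field types from samples. */
final class BurnInSchemaSketch {

    /** Classifies one raw value as BOOLEAN, LONG, DOUBLE or STRING. */
    static String typeOf(String raw) {
        if (raw.equalsIgnoreCase("true") || raw.equalsIgnoreCase("false")) {
            return "BOOLEAN";
        }
        try { Long.parseLong(raw); return "LONG"; }
        catch (NumberFormatException ignored) { /* not an integer */ }
        try { Double.parseDouble(raw); return "DOUBLE"; }
        catch (NumberFormatException ignored) { /* not a number */ }
        return "STRING";
    }

    /** Widens two observed types to one that can represent both. */
    static String widen(String a, String b) {
        if (a.equals(b)) return a;
        boolean bothNumeric = (a.equals("LONG") || a.equals("DOUBLE"))
                && (b.equals("LONG") || b.equals("DOUBLE"));
        return bothNumeric ? "DOUBLE" : "STRING";
    }

    /** Builds a field name to type map from the sample records. */
    static Map<String, String> infer(List<Map<String, String>> samples) {
        Map<String, String> schema = new LinkedHashMap<>();
        for (Map<String, String> record : samples) {
            for (Map.Entry<String, String> field : record.entrySet()) {
                schema.merge(field.getKey(), typeOf(field.getValue()),
                        BurnInSchemaSketch::widen);
            }
        }
        return schema;
    }

    public static void main(String[] args) {
        System.out.println(infer(List.of(
                Map.of("id", "1", "temp", "20"),
                Map.of("id", "2", "temp", "20.5"))));
    }
}
```

This ignores nested structures, missing values and any notion of
confidence, all of which a real answer to the question would need to deal
with.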

What do we do with the 'burn in' data, i.e. the initial data we use to
learn from, which we may also be required to persist? Say the data is
mission critical or indeed very important. Do we revisit it once we've
determined a suitable data model?

I am thinking that what I am describing here could possibly become a
component/value add for Storm or some other streaming framework. I don't
think that this is part of Gora. Gora would however be deployed (as
indicated in your diagram) as a component part of Storm similar to what
Renato did with Giraph.

These are some of my current thoughts.

On Friday, August 15, 2014, Alfonso Nishikawa (JIRA) <jira@apache.org> wrote:


> Where does Gora fit in YARN?
> ----------------------------
>                 Key: GORA-365
>                 URL: https://issues.apache.org/jira/browse/GORA-365
>             Project: Apache Gora
>          Issue Type: Task
>            Reporter: Alfonso Nishikawa
>            Priority: Trivial
>              Labels: discussion, yarn
>             Fix For: 0.6
>         Attachments: aplicaciones_y_datos.png, gora_example_diagram.png
> Question from Lewis:
> {quote}
> Hi Folks,
> Based on this diagram of YARN overview and where 'everything' plugs
> together.
> !http://tm.durusau.net/wp-content/uploads/2013/06/YARN2.png|width=600!
> I have a question... where does Gora fit in here? I have arguments in my
> head for many different places where Gora fits in. But the purpose of this
> thread is to try and discover from you guys, where you think Gora fits in
> for what you are doing (that is of course if your architecture looks
> anything like the picture I've posted).
> I hope that this thread can be a point of discussion as well as a potential
> opportunity to define a potential roadmap for Gora post 0.5 release (which I
> would like to push very soon).
> Thanks
> Lewis
> {quote}
> You can [read it at mailing list archive|https://mail-archives.apache.org/mod_mbox/gora-dev/201408.mbox/%3CCAGaRif3k9Bkc%2B3QFo6O9Xkr2XN_RHndK-FyfpFmP%3D%2BOCvmA9-A%40mail.gmail.com%3E].
> Issue for discussion ideas.

This message was sent by Atlassian JIRA
